[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463548#comment-13463548
 ] 

Devaraj Das commented on HBASE-6679:


bq. For sure the regions was not doubly-assigned? Split happened of the region 
on one server but on another server, the same region was being compacted? You'd 
need the master logs to figure it a dbl-assign

Unfortunately, I didn't save the master logs when the failure happened.

bq. Can you figure a place where we'd be running compactions on a region 
concurrent w/ our splitting it? Compacting we take out write lock. Doesnt look 
like any locks while SplitTransaction is running (closing parent, it'll need 
write lock... thats after daughters open though).

I can't figure out a place where this could happen in the natural execution of 
the regionserver.

bq. Storefiles are an ImmutableList.

Yes, but that could still be exposed to memory-inconsistency problems when 
multiple threads access the object in unsynchronized/non-volatile ways, no?

bq. @Deva

After a long time, someone addressed me by that name :-)

bq. So before this itself the region got closed. I feel the store file list 
should have been updated by the time. No ?

Can't say for sure, Ram. There is no guarantee unless the accesses (read/write) 
are synchronized or the field is declared volatile.


> RegionServer aborts due to race between compaction and split
> 
>
> Key: HBASE-6679
> URL: https://issues.apache.org/jira/browse/HBASE-6679
> Project: HBase
> Issue Type: Bug
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Fix For: 0.92.3
>
> Attachments: rs-crash-parallel-compact-split.log
>
>
> In our nightlies, we have seen RS aborts due to compaction and split racing. 
> Original parent file gets deleted after the compaction, and hence, the 
> daughters don't find the parent data file. The RS kills itself when this 
> happens. Will attach a snippet of the relevant RS logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463543#comment-13463543
 ] 

stack commented on HBASE-4565:
--

[~svarma] So we should apply the patch to 0.92 and 0.94?  The v3 patch still 
works on windows?  Thanks for checking trunk.

> Maven HBase build broken on cygwin with copynativelib.sh call.
> --
>
> Key: HBASE-4565
> URL: https://issues.apache.org/jira/browse/HBASE-4565
> Project: HBase
> Issue Type: Bug
> Components: build
> Affects Versions: 0.92.0
> Environment: cygwin (on xp and win7)
> Reporter: Suraj Varma
> Assignee: Suraj Varma
> Labels: build, maven
> Fix For: 0.96.0
>
> Attachments: HBASE-4565-0.92.patch, HBASE-4565.patch, 
> HBASE-4565-v2.patch, HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch
>
>
> This is broken in both 0.92 as well as trunk pom.xml
> Here's a sample maven log snippet from trunk (from Mayuresh on user mailing 
> list)
> [INFO] [antrun:run {execution: package}]
> [INFO] Executing tasks
> main:
> [mkdir] Created dir: D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
> [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: No such file or directory
> [exec] tar (child): Cannot connect to D: resolve failed
> [INFO] 
> [ERROR] BUILD ERROR
> [INFO] 
> [INFO] An Ant BuildException has occured: exec returned: 3328
> There are two issues: 
> 1) The ant run task below doesn't resolve the windows file separator returned 
> by the project.build.directory - this causes the above resolve failed.
> 
> 
> if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then
> 2) The tar argument value below also has a similar issue in that the path arg 
> doesn't resolve right.
> 
>  dir="${project.build.directory}/${project.artifactId}-${project.version}">
> 
>  value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
> 
> 
> In both cases, the fix would probably be to use a cross-platform way to 
> handle the directory locations. 
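One way to picture the cross-platform handling suggested above is a small path-normalizing helper. This is only a sketch: the class name `CygwinPaths` and the method `toCygwinPath` are hypothetical and are not part of the HBase build or of any patch attached to this issue, and the `/cygdrive/` prefix is assumed to be the mount prefix in use (it is Cygwin's default and matches the path quoted in the description).

```java
import java.util.Locale;

/** Hypothetical helper, for illustration only; not part of HBase. */
public final class CygwinPaths {
    private CygwinPaths() {}

    /**
     * Converts an absolute Windows path such as D:\workspace\x into the
     * Cygwin form /cygdrive/d/workspace/x. Paths without a drive letter
     * only get their backslashes turned into forward slashes.
     */
    public static String toCygwinPath(String path) {
        // Backslashes are what the shell swallowed in the log above.
        String forward = path.replace('\\', '/');
        boolean hasDrive = forward.length() >= 2
                && Character.isLetter(forward.charAt(0))
                && forward.charAt(1) == ':';
        if (!hasDrive) {
            return forward;
        }
        String rest = forward.substring(2);
        if (!rest.isEmpty() && !rest.startsWith("/")) {
            // Drive-relative path (e.g. "D:foo"): leave it alone rather than guess.
            return forward;
        }
        String drive = forward.substring(0, 1).toLowerCase(Locale.ROOT);
        return "/cygdrive/" + drive + rest;
    }
}
```

With a helper along these lines, the value of `project.build.directory` could be normalized once before it is handed to `ls` or `tar`, instead of being interpolated into the shell command as-is.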



[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463535#comment-13463535
 ] 

ramkrishna.s.vasudevan commented on HBASE-6679:
---

@Deva
I am not able to tell clearly what the problem is.  I too went through those logs 
and found that the region 5689a8785bbc9a8aa8e526cd7ef1542a has completed the 
compaction.

{code}
2012-08-28 06:15:34,107 INFO 
org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed 
compaction: 
regionName=TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.,
 storeName=f1, fileCount=3, fileSize=27.3m, priority=3, time=14360293782301; 
duration=4sec

{code}
and the split of the region started 2 ms later:
{code}
2012-08-28 06:15:34,109 INFO 
org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region 
TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.
{code}
The offlining of the region is done here

{code}
2012-08-28 06:15:34,788 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
Offlined parent region 
TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.
 in META
{code}
So the region had already been closed before this. I feel the store file list 
should have been updated by that time, no?




[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463532#comment-13463532
 ] 

liang xie commented on HBASE-6882:
--

Thanks [~saint@gmail.com] for the guidance! My plan is to resolve some 
outstanding thrift-related issues first. After that I should know more of the 
details, and then maybe I'll have a good feeling for how to fuse thrift and 
thrift2. Don't worry, I'll send a design note before making any big change :)

> Thrift IOError should include exception class
> -
>
> Key: HBASE-6882
> URL: https://issues.apache.org/jira/browse/HBASE-6882
> Project: HBase
> Issue Type: Improvement
> Reporter: Mikhail Bautin
> Assignee: Mikhail Bautin
> Attachments: D5679.1.patch
>
>
> Return exception class as part of IOError thrown from the Thrift proxy or the 
> embedded Thrift server in the regionserver.
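As a rough illustration of the request, the sketch below carries the originating exception's class name next to its message. Everything here is a hypothetical stand-in: `IOErrorSketch` and its nested `IOError` are not the Thrift-generated class, and `wrap` is an invented helper. Only the field name `exceptionClass` is taken from the patch excerpt quoted elsewhere in this thread.

```java
/** Illustration only; the real IOError is generated from Hbase.thrift. */
public final class IOErrorSketch {
    private IOErrorSketch() {}

    /** Hypothetical stand-in for the Thrift-generated IOError exception. */
    public static final class IOError extends Exception {
        /** Fully qualified class name of the server-side exception. */
        public final String exceptionClass;

        public IOError(String message, String exceptionClass) {
            super(message);
            this.exceptionClass = exceptionClass;
        }
    }

    /** Wraps a server-side exception, keeping its message and class name. */
    public static IOError wrap(Throwable cause) {
        return new IOError(cause.getMessage(), cause.getClass().getName());
    }
}
```

A client receiving such an error could then branch on `exceptionClass` rather than parsing the message text.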



[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463528#comment-13463528
 ] 

stack commented on HBASE-6702:
--

+1 on commit after addressing Jesse's comments.  The rest of the conversion work 
would be done in another issue?  Good stuff N.

> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
> Issue Type: Improvement
> Components: test
> Affects Versions: 0.96.0
> Reporter: Jesse Yates
> Assignee: nkeywal
> Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 6702.v1.patch, 6702.v4.patch
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
> issues, but has since not been widely utilized. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that are moved into the 
> hbase-common module because bringing the ResourceChecker up to hbase-common 
> would involve bringing all its dependencies (which are quite far reaching).
> The question then is, what should we do with it? Get rid of it? Refactor and 
> reuse? 



[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463523#comment-13463523
 ] 

stack commented on HBASE-6679:
--

Are we sure the region was not doubly-assigned? Did the split of the region 
happen on one server while the same region was being compacted on another?  
You'd need the master logs to confirm a double assignment.

Storefiles are an ImmutableList.

Can you figure out a place where we'd be running compactions on a region 
concurrently w/ our splitting it?  When compacting, we take out the write lock.  
It doesn't look like any locks are taken while SplitTransaction is running 
(closing the parent will need the write lock, but that's after the daughters open).



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463525#comment-13463525
 ] 

stack commented on HBASE-6882:
--

@Liang ... or just pick up any outstanding thrift issues and take a look at 
resolving them?



[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.

2012-09-25 Thread Suraj Varma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463518#comment-13463518
 ] 

Suraj Varma commented on HBASE-4565:


This is no longer an issue on trunk, it appears. The build script 
modularization changes have completely done away with the copynativelibs.sh 
which caused the original issue. I am able to build from trunk successfully via 
cygwin now.



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463514#comment-13463514
 ] 

stack commented on HBASE-6882:
--

@liang That'd be great. Would suggest first a survey of thrift1 and thrift2.  
Figure out what the differences are.  Do you want to have the two packages 
achieve parity?  Or do you want to add what is in thrift2 to thrift1 and keep up 
thrift1?  The examples package has stuff to exercise the thrift stuff.  A few 
more unit tests would probably not go amiss.  Good on you.



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463511#comment-13463511
 ] 

liang xie commented on HBASE-6882:
--

Got it, [~saint@gmail.com]
I'd like to have a try:)



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463507#comment-13463507
 ] 

stack commented on HBASE-6882:
--

@Liang thrift2 tries to make the thrift apis more aligned w/ current trunk.  
thrift1 has the most usage and hence more trust.  What is lacking is an owner 
for either package.   Without this, folks show up and fix their particular issue 
in whatever package they are using and then move on.  Would be grand if someone 
could drive thrift2 so it had all of thrift1 and was better aligned w/ the 
native apis.



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463504#comment-13463504
 ] 

liang xie commented on HBASE-6882:
--

Hi Mikhail, it seems the attached file is not against the current community 
TRUNK version, since I saw:
{code:title=Hbase.thrift|borderStyle=solid}
 exception IOError {
   1: string message,
-  2: i64 backoffTimeMillis
+  2: i64 backoffTimeMillis,
+  3: string exceptionClass
 }
{code} 

There is no backoffTimeMillis field in IOError in the current trunk code.

Another thing: do we encourage using thrift2 more than thrift right now? 
If so, maybe it would be better to change thrift2's TIOError?



[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split

2012-09-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463494#comment-13463494
 ] 

Devaraj Das commented on HBASE-6679:


Okay, did some digging into the logs (that were attached to the jira earlier) 
and the code. It doesn't seem like a race between compaction and split 
(apologies for the confusion I might have created). The two are sequential (at 
the end of a compaction, a split is requested). But I'll note that the split 
happens in a separate thread.

The problem is that the daughter tries to open a reader to a file that doesn't 
exist. 
{noformat}
java.io.IOException: Failed ip-10-4-197-133.ec2.internal,60020,1346119706203-daughterOpener=4efb1c92918bbf3c54d0ead3345bb735
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:368)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:456)
    at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hbase/data/TestLoadAndVerify_1346120615716/5689a8785bbc9a8aa8e526cd7ef1542a/f1/5a55df83829f401993d95ecf2e539ba1
{noformat}

The method SplitTransaction.createDaughters creates the reference files (via a 
call to the method SplitTransaction.splitStoreFiles) that the daughter then 
tries to open. The list of files to create references to is the set of entries 
in the storeFiles field in Store.java (obtained via the call to 
this.parent.close in createDaughters). The storeFiles is last updated (in the 
thread doing the compaction) in the method Store.completeCompaction.

My suspicion is that the problem is due to the fact that accesses to storeFiles 
are not synchronized, and the field is not volatile either. This would lead to 
inconsistencies between the compaction thread and the split thread, where the 
split thread doesn't see the last updated value of the field.

If the above theory is right (and it is the only theory I have), then the 
solution could be to make the storeFiles field volatile.
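To make the visibility argument concrete, here is a minimal sketch of publishing an immutable list through a volatile field. The class `StoreFilesSketch` and its methods are hypothetical and are not the actual `Store.java` code: file names are plain strings, and a single compaction thread is assumed to be the only writer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustration only; a simplified stand-in for the storeFiles field in Store. */
public class StoreFilesSketch {
    // volatile: a write by the compaction thread happens-before any later
    // read of the field by another thread (e.g. the split thread).
    private volatile List<String> storeFiles = Collections.emptyList();

    /**
     * Called by the (single) compaction thread: drops the compacted inputs,
     * adds the compaction result, and publishes a new immutable list.
     */
    public void completeCompaction(List<String> compacted, String result) {
        List<String> updated = new ArrayList<>(storeFiles);
        updated.removeAll(compacted);
        updated.add(result);
        storeFiles = Collections.unmodifiableList(updated);
    }

    /** Called by the split thread: sees the most recently published list. */
    public List<String> getStoreFiles() {
        return storeFiles;
    }
}
```

Without `volatile` (or synchronization on both the read and the write), the Java memory model gives no guarantee that the split thread observes the list written by the compaction thread, even though each list object is itself immutable.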

Thoughts?



[jira] [Commented] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463471#comment-13463471
 ] 

Hudson commented on HBASE-6025:
---

Integrated in HBase-TRUNK #3379 (See 
[https://builds.apache.org/job/HBase-TRUNK/3379/])
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface; 
REAPPLY (Revision 1390240)
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface; REVERT -- 
OVERCOMMIT (Revision 1390239)
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface (Revision 
1390238)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp

stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* /hbase/trunk/hbase-server/src/main/ruby/hbase/admin.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/hbase.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/commands.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/formatter.rb

stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* /hbase/trunk/hbase-server/src/main/ruby/hbase/admin.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/hbase.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/commands.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/formatter.rb


> Expose Hadoop Dynamic Metrics through JSON Rest interface
> -
>
> Key: HBASE-6025
> URL: https://issues.apache.org/jira/browse/HBASE-6025
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.96.0
> Reporter: Elliott Clark
> Assignee: Elliott Clark
> Fix For: 0.96.0
>
> Attachments: HBASE-6025-0.patch, HBASE-6025-1.patch, 
> HBASE-6025-2.patch, HBASE-6025-3.patch, HBASE-6025-4.patch, hbase-jmx2.patch, 
> hbase-jmx.patch, hbase-jmx.patch
>
>




[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2012-09-25 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463466#comment-13463466
 ] 

chunhui shen commented on HBASE-6870:
-

[~v.himanshu]
These two if statements were not introduced by this patch, so I just kept the previous code.

{code}
public LinkedHashMap getKeysToRegionsInRange(
{code}

Yes, it could be private.

Thanks for the review.

I will rework the patch to address the other comments later.

> HTable#coprocessorExec always scan the whole table 
> ---
>
> Key: HBASE-6870
> URL: https://issues.apache.org/jira/browse/HBASE-6870
> Project: HBase
> Issue Type: Improvement
> Components: Coprocessors
> Affects Versions: 0.94.1
> Reporter: chunhui shen
> Assignee: chunhui shen
> Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
> HBASE-6870v2.patch, HBASE-6870v3.patch
>
>
> In the current logic, HTable#coprocessorExec always scans the whole table. 
> This is inefficient and will affect the RegionServer carrying .META. under 
> a large number of coprocessorExec requests.



[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463433#comment-13463433
 ] 

Jesse Yates commented on HBASE-6055:


I was going through the offline snapshot code 
(https://github.com/jyates/hbase/tree/offline-snapshots) and noticed that 
apparently I wrote the following:
{code}
Path editsdir = 
HLog.getRegionDirRecoveredEditsDir(HRegion.getRegionDir(tdir,regionInfo.getEncodedName()));
WALReferenceTask op = new WALReferenceTask(snapshot, this.monitor, editsdir, 
conf, fs, "disabledTableSnapshot");
{code}

For referencing the current hfiles for a disabled table, this makes no sense. 
However, it got me thinking about dealing with recovered edits for a table. 
Even if a table is disabled, it may have recovered edits that haven't been 
applied to the table (a RS comes up, splits the logs, but then dies again 
before replaying the split log). 

If I'm reading the log-splitting code correctly, I think it archives the 
original HLog after splitting, but not before the edits are applied to the 
region. This would mean we also need to reference the recovered.edits directory 
under each region, if we keep the current implementation...right?

I was thinking that instead we can keep the HLogs around in the .logs 
directory until the recovered.edits files for that log file have been replayed. 
This way we can avoid another task for snapshotting (referencing all the 
recovered edits) and keep everything fairly simple. There would need to 
be some extra work to keep track of the source hlog - either an 'info' file for 
the source hlog that lists the written recovered.edits files or special naming 
of the recovered.edits files that point back to the source file. 

Thoughts?

> Snapshots in HBase 0.96
> ---
>
> Key: HBASE-6055
> URL: https://issues.apache.org/jira/browse/HBASE-6055
> Project: HBase
> Issue Type: New Feature
> Components: Client, master, regionserver, snapshots, Zookeeper
> Reporter: Jesse Yates
> Assignee: Jesse Yates
> Fix For: hbase-6055, 0.96.0
>
> Attachments: Snapshots in HBase.docx
>
>
> Continuation of HBASE-50 for the current trunk. Since the implementation has 
> drastically changed, opening as a new ticket.



[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463409#comment-13463409
 ] 

stack commented on HBASE-5844:
--

Looking at this w/ j-d, now we no longer do nohup so the parent process can 
stick around to watch out for the server crash. This makes it so there are now 
two hbase processes listed per launched daemon. This is kinda ugly.

When we have this bash script watching the running java process we verge into 
the territory normally occupied by babysitters like supervise.   Our parent 
bash script will always be less than a real babysitter -- supervise, god, etc. 
-- so maybe we should just have this kill znode as an optional script w/ 
prescription for how to set it up -- e.g. run znode remover on daemon crash 
before starting new one (if we want supervise to start a new one).

I'm thinking we should back this out since there are open questions still.

> Delete the region servers znode after a regions server crash
> 
>
> Key: HBASE-5844
> URL: https://issues.apache.org/jira/browse/HBASE-5844
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver, scripts
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.96.0
>
> Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
> 5844.v3.patch, 5844.v4.patch
>
>
> Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
> So the recovery process will start only after a timeout, usually 30s.
> By deleting the znode in the start script, we remove this delay and the 
> recovery starts immediately.



[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-09-25 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463401#comment-13463401
 ] 

Jean-Daniel Cryans commented on HBASE-5844:
---

One thing that worries me about this patch is the situation where the pid file 
is gone and someone tries to start the region server. It happened to me a bunch 
of times. I tried it with your patch and since it removes the ephemeral znode it 
_kills_ the region server that's already running and doesn't start a new one 
because the ports are already occupied.
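
One possible guard for this case is sketched below, assuming the start script (or a small helper it calls) can probe the region server port before touching the znode. The class and method names are hypothetical, and the port to probe would have to come from the configuration.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical pre-start check: is something already listening on the server port? */
public class PortGuard {
  /** Returns true if the local port cannot be bound, i.e. it is probably in use. */
  static boolean isPortInUse(int port) {
    try (ServerSocket probe = new ServerSocket(port)) {
      return false; // bind succeeded, so nothing else holds the port
    } catch (IOException e) {
      return true;  // bind failed; treat the port as occupied
    }
  }

  /** Only delete the znode when no live server appears to own the port. */
  static boolean safeToDeleteZnode(int port) {
    return !isPortInUse(port);
  }
}
```

A bind probe is a heuristic: it cannot tell which process holds the port, so it only prevents the "kill the running server" outcome rather than proving the old server is dead.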

I'm not sure if this is related to this patch, but we're now missing info when 
using the scripts. We used to have:

{noformat}
su-jdcryans-2:0.94 jdcryans$ ./bin/start-hbase.sh 
localhost: starting zookeeper, logging to 
/Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-zookeeper-h-25-185.sfo.stumble.net.out
starting master, logging to 
/Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-master-h-25-185.sfo.stumble.net.out
localhost: starting regionserver, logging to 
/Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-regionserver-h-25-185.sfo.stumble.net.out
{noformat}

Now we have:

{noformat}
su-jdcryans-2:trunk-commit jdcryans$ ./bin/start-hbase.sh 

su-jdcryans-2:trunk-commit jdcryans$ 
{noformat}

> Delete the region servers znode after a regions server crash
> 
>
> Key: HBASE-5844
> URL: https://issues.apache.org/jira/browse/HBASE-5844
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver, scripts
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.96.0
>
> Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
> 5844.v3.patch, 5844.v4.patch
>
>
> Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
> So the recovery process will start only after a timeout, usually 30s.
> By deleting the znode in the start script, we remove this delay and the 
> recovery starts immediately.



[jira] [Updated] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface

2012-09-25 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6025:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Applied to trunk.  Thanks for the patch Elliott.

> Expose Hadoop Dynamic Metrics through JSON Rest interface
> -
>
> Key: HBASE-6025
> URL: https://issues.apache.org/jira/browse/HBASE-6025
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.96.0
>
> Attachments: HBASE-6025-0.patch, HBASE-6025-1.patch, 
> HBASE-6025-2.patch, HBASE-6025-3.patch, HBASE-6025-4.patch, hbase-jmx2.patch, 
> hbase-jmx.patch, hbase-jmx.patch
>
>




[jira] [Updated] (HBASE-6353) Snapshots shell

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6353:
---

Issue Type: Sub-task  (was: New Feature)
Parent: HBASE-6055

> Snapshots shell
> ---
>
> Key: HBASE-6353
> URL: https://issues.apache.org/jira/browse/HBASE-6353
> Project: HBase
>  Issue Type: Sub-task
>  Components: shell
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Attachments: HBASE-6353-v0.patch
>
>
> h6. hbase shell with snapshot commands
> * snapshot  
> ** Take a snapshot of the specified table with the specified name 
> * restore_snapshot 
> ** Restore specified snapshot on the original table
> * mount_snapshot   [readonly]
> ** Load the snapshot data as specified table (optional readonly flag)
> * list_snapshots [filter]
> ** Show a list of snapshots
> * delete_snapshot 
> ** Remove a specified snapshot
> h6. Restore Table
> Given a "snapshot name", restore overrides the original table with the 
> snapshot content.
> Before restoring, a new snapshot of the table is taken, just to avoid bad 
> situations.
> (If the table is not disabled we can keep serving reads)
> This allows a full and quick rollback to a previous snapshot.
> h6. Mount Table (Aka Clone Table)
> Given a "snapshot name" a new table is created with the content of the 
> specified snapshot.
> This operation allows:
>  * To have an old version of the table in parallel with the current one.
>  ** Look at snapshot side-by-side with the "current" before making the 
> decision whether to roll back or not
>  * To Restore only "individual items" (only some small range of data was lost 
> from "current")
>  ** MR job that scans the cloned table and updates the data in the original 
> one. (Partial restore of the data)
>  * if the table is not marked as read-only
>  ** To Add/Remove data from this table without affecting the original one or 
> the snapshot.
> h6. Open points
>  * Add snapshot type option on take snapshot command (global, timestamp)?
>  * Keep separate the "restore" from "mount"?



[jira] [Commented] (HBASE-5691) Importtsv stops the webservice from which it is evoked

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463347#comment-13463347
 ] 

Hudson commented on HBASE-5691:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


> Importtsv stops the webservice from which it is evoked
> --
>
> Key: HBASE-5691
> URL: https://issues.apache.org/jira/browse/HBASE-5691
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4
>Reporter: debarshi basak
>Priority: Minor
>
> I was trying to run importtsv from a servlet. Every time the job completed, 
> the Tomcat server was shut down.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463346#comment-13463346
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390013)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile, 
> it doesn't need to read the checksum from the HDFS meta file. But the HLog 
> file of HBase doesn't contain the checksum, so when HBase reads the HLog, it 
> must read the checksum from the HDFS meta file. We could add setSkipChecksum 
> per file to HDFS, or we could write checksums into the WAL if this skip 
> checksum facility is enabled.
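
The second option in the description (writing checksums into the WAL itself) could look roughly like the sketch below. The record layout, payload bytes followed by an 8-byte CRC32 value, and the class name are assumptions made for illustration; this is not the format HBase uses.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical sketch: a WAL entry that carries its own CRC32. */
public class ChecksummedEntry {
  /** Layout (assumed): payload bytes followed by an 8-byte CRC32 value. */
  static byte[] encode(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return ByteBuffer.allocate(payload.length + 8)
        .put(payload)
        .putLong(crc.getValue())
        .array();
  }

  /** Returns the payload if the stored CRC matches, otherwise throws. */
  static byte[] decode(byte[] entry) {
    if (entry.length < 8) {
      throw new IllegalArgumentException("entry too short");
    }
    byte[] payload = new byte[entry.length - 8];
    ByteBuffer buf = ByteBuffer.wrap(entry);
    buf.get(payload);
    long stored = buf.getLong();
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    if (crc.getValue() != stored) {
      throw new IllegalStateException("checksum mismatch");
    }
    return payload;
  }
}
```

With entries verified this way, the reader would no longer depend on the HDFS-level checksum for the log, which is the precondition for skipping it safely.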



[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463344#comment-13463344
 ] 

Hudson commented on HBASE-6637:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-6637 Argghh... Missed deleted files too (Revision 1390040)
HBASE-6637 Missed new files (Revision 1390035)
HBASE-6637 Move DaemonThreadFactory into Threads and Threads to hbase-common 
(Jesse Yates) (Revision 1390034)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java


> Move DaemonThreadFactory into Threads and Threads to hbase-common
> -
>
> Key: HBASE-6637
> URL: https://issues.apache.org/jira/browse/HBASE-6637
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
> hbase-6637-v0.patch, hbase-6637-v2.patch
>
>




[jira] [Commented] (HBASE-3678) Add Eclipse-based Apache Formatter to HBase Wiki

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463345#comment-13463345
 ] 

Hudson commented on HBASE-3678:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


> Add Eclipse-based Apache Formatter to HBase Wiki
> 
>
> Key: HBASE-3678
> URL: https://issues.apache.org/jira/browse/HBASE-3678
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>Priority: Trivial
> Fix For: 0.92.0
>
> Attachments: eclipse_formatter_apache.xml
>
>
> Currently, on http://wiki.apache.org/hadoop/Hbase/HowToContribute , we tell 
> the user to follow Sun's code conventions and then add a couple things.  For 
> lazy people like myself, it would be much easier to just tell us to import an 
> Apache formatter into your Eclipse project and not worry about it.



[jira] [Updated] (HBASE-6572) Tiered HFile storage

2012-09-25 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6572:
--

Description: 
Consider how we might enable tiered HFile storage. If HDFS has the capability, 
we could create certain files on solid state devices where they might be 
frequently accessed, especially for random reads; and others (and by default) 
on spinning media as before. We could support the move of frequently read 
HFiles from spinning media to solid state. We already have CF statistics for 
this, would only need to add requisite admin interface; could even consider an 
autotiering option. 

Dhruba Borthakur did some early work in this area and wrote up his findings: 
http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . It 
is important to note the findings but I suggest most of the recommendations are 
out of scope of this JIRA. This JIRA seeks to find an initial use case that 
produces a reasonable benefit, and serves as a testbed for further 
improvements. If I may paraphrase Dhruba's findings (any misstatements and 
errors are mine): First, the DFSClient code paths introduce significant 
latency, so the HDFS client (and presumably the DataNode, as the next 
bottleneck) will need significant work to knock that down. Need to investigate 
optimized (perhaps read-only) DFS clients, server side read and caching 
strategies. Second, RegionServers are heavily threaded and this imposes a lot 
of monitor contention and context switching cost. Need to investigate reducing 
the number of threads in a RegionServer, nonblocking IO and RPC.

  was:Consider how we might enable tiered HFile storage. If HDFS has the 
capability, we could create certain files on solid state devices where they 
might be frequently accessed, especially for random reads; and others (and by 
default) on spinning media as before. We could support the move of frequently 
read HFiles from spinning media to solid state. We already have CF statistics 
for this, would only need to add requisite admin interface; could even consider 
an autotiering option. 


> Tiered HFile storage
> 
>
> Key: HBASE-6572
> URL: https://issues.apache.org/jira/browse/HBASE-6572
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>
> Consider how we might enable tiered HFile storage. If HDFS has the 
> capability, we could create certain files on solid state devices where they 
> might be frequently accessed, especially for random reads; and others (and by 
> default) on spinning media as before. We could support the move of frequently 
> read HFiles from spinning media to solid state. We already have CF statistics 
> for this, would only need to add requisite admin interface; could even 
> consider an autotiering option. 
> Dhruba Borthakur did some early work in this area and wrote up his findings: 
> http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . 
> It is important to note the findings but I suggest most of the 
> recommendations are out of scope of this JIRA. This JIRA seeks to find an 
> initial use case that produces a reasonable benefit, and serves as a testbed 
> for further improvements. If I may paraphrase Dhruba's findings (any 
> misstatements and errors are mine): First, the DFSClient code paths introduce 
> significant latency, so the HDFS client (and presumably the DataNode, as the 
> next bottleneck) will need significant work to knock that down. Need to 
> investigate optimized (perhaps read-only) DFS clients, server side read and 
> caching strategies. Second, RegionServers are heavily threaded and this 
> imposes a lot of monitor contention and context switching cost. Need to 
> investigate reducing the number of threads in a RegionServer, nonblocking IO 
> and RPC.



[jira] [Commented] (HBASE-6424) TestReplication frequently hangs

2012-09-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463308#comment-13463308
 ] 

Jimmy Xiang commented on HBASE-6424:


May relate to HBASE-6880

> TestReplication frequently hangs
> 
>
> Key: HBASE-6424
> URL: https://issues.apache.org/jira/browse/HBASE-6424
> Project: HBase
>  Issue Type: Bug
>  Components: Replication, test
>Affects Versions: 0.94.0
>Reporter: Andrew Purtell
> Attachments: testReplication.jstack
>
>
> TestReplication frequently hangs. Separated out from HBASE-6406.



[jira] [Commented] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463307#comment-13463307
 ] 

Hadoop QA commented on HBASE-6881:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546581/trunk-6881.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2932//console

This message is automatically generated.

> All regionservers are marked offline even there is still one up
> ---
>
> Key: HBASE-6881
> URL: https://issues.apache.org/jira/browse/HBASE-6881
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: trunk-6881.patch
>
>
> {noformat}
> +RegionPlan newPlan = plan;
> +if (!regionAlreadyInTransitionException) {
> +  // Force a new plan and reassign. Will return null if no servers.
> +  newPlan = getRegionPlan(state, plan.getDestination(), true);
> +}
> +if (newPlan == null) {
>this.timeoutMonitor.setAllRegionServersOffline(true);
>LOG.warn("Unable to find a viable location to assign region " +
>  state.getRegion().getRegionNameAsString());
> {noformat}
> Here, when newPlan is null, plan.getDestination() could be up actually.
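
The point in the description can be illustrated with a small decision function. This is a sketch under assumptions, not the actual AssignmentManager code or the attached patch: the class, the method, and the idea of passing in the set of online servers are all hypothetical.

```java
import java.util.Set;

/** Hypothetical sketch of the decision discussed above. */
public class OfflineCheck {
  /**
   * @param newPlanFound   whether a plan that excludes the old destination was found
   * @param oldDestination server the previous plan pointed at
   * @param onlineServers  servers currently known to be up (assumed input)
   * @return true only if no server at all appears to be available
   */
  static boolean allServersOffline(boolean newPlanFound, String oldDestination,
      Set<String> onlineServers) {
    if (newPlanFound) {
      return false;
    }
    // No alternative server exists, but the excluded destination may still be alive.
    return !onlineServers.contains(oldDestination);
  }
}
```

In a one-server cluster, excluding the destination always yields a null plan, so the extra liveness check is what keeps the single live server from being treated as offline.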



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463294#comment-13463294
 ] 

Phabricator commented on HBASE-6882:


Liyin has accepted the revision "[jira] [HBASE-6882] [89-fb] Thrift IOError 
should include exception class".

  LGTM !

REVISION DETAIL
  https://reviews.facebook.net/D5679

BRANCH
  ioerror_class_name

To: Liyin, Karthik, aaiyer, chip, JIRA, mbautin


> Thrift IOError should include exception class
> -
>
> Key: HBASE-6882
> URL: https://issues.apache.org/jira/browse/HBASE-6882
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: D5679.1.patch
>
>
> Return exception class as part of IOError thrown from the Thrift proxy or the 
> embedded Thrift server in the regionserver.



[jira] [Created] (HBASE-6883) CleanerChore treats .archive as a table and throws TableInfoMissingException

2012-09-25 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6883:
--

 Summary: CleanerChore treats .archive as a table and throws 
TableInfoMissingException
 Key: HBASE-6883
 URL: https://issues.apache.org/jira/browse/HBASE-6883
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang


{noformat}
2012-09-25 14:52:21,902 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: 
Exception during readTableDecriptor. Current table name = .archive
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under 
hdfs://c0322.hal.cloudera.com:56020/hbase/.archive
at 
org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:417)
at 
org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:408)
at 
org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:170)
at 
org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:201)
at 
org.apache.hadoop.hbase.master.HMaster.getTableDescriptors(HMaster.java:2205)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.ProtobufRpcEngine$Server.call(ProtobufRpcEngine.java:357)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1816)
{noformat}
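
One way to avoid the exception above would be to skip known non-table directories before looking for a .tableinfo file. The sketch below assumes a hard-coded exclusion list; the list contents and the class name are hypothetical and would need to match the real layout under the HBase root directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: filter out directories that are not tables. */
public class TableDirFilter {
  // Assumed list of non-table directories; a prefix test on "." would wrongly drop .META.
  private static final Set<String> NON_TABLE_DIRS =
      new HashSet<String>(Arrays.asList(".archive", ".logs", ".oldlogs", ".corrupt", ".tmp"));

  /** Returns the children of the root directory that should be treated as tables. */
  static List<String> tableDirs(List<String> rootChildren) {
    List<String> tables = new ArrayList<String>();
    for (String name : rootChildren) {
      if (!NON_TABLE_DIRS.contains(name)) {
        tables.add(name);
      }
    }
    return tables;
  }
}
```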



[jira] [Updated] (HBASE-5456) Introduce PowerMock into our unit tests to reduce unnecessary method exposure

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5456:
---

Attachment: hbase-5456-v0.patch

Attaching patch to add jmockit and powermock to the test dependencies.

For more discussion and examples of why it's the right way to go, see 
http://search-hadoop.com/m/HbsjjRSKLc2

> Introduce PowerMock into our unit tests to reduce unnecessary method exposure
> -
>
> Key: HBASE-5456
> URL: https://issues.apache.org/jira/browse/HBASE-5456
> Project: HBase
>  Issue Type: Task
>Reporter: Ted Yu
> Attachments: hbase-5456-v0.patch
>
>
> We should introduce PowerMock into our unit tests so that we don't have to 
> expose methods intended to be used by unit tests.
> Here was Benoit's reply to a user of asynchbase about testability:
> OpenTSDB has unit tests that are mocking out HBaseClient just fine
> [1].  You can mock out pretty much anything on the JVM: final,
> private, JDK stuff, etc.  All you need is the right tools.  I've been
> very happy with PowerMock.  It supports Mockito and EasyMock.
> I've never been keen on mutilating public interfaces for the sake of
> testing.  With tools like PowerMock, we can keep the public APIs tidy
> while mocking and overriding anything, even in the most private guts
> of the classes.
>  [1] 
> https://github.com/stumbleupon/opentsdb/blob/master/src/uid/TestUniqueId.java#L66



[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463218#comment-13463218
 ] 

Mikhail Bautin commented on HBASE-6882:
---

Phabricator diff for 0.89-fb: https://reviews.facebook.net/D5679


> Thrift IOError should include exception class
> -
>
> Key: HBASE-6882
> URL: https://issues.apache.org/jira/browse/HBASE-6882
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: D5679.1.patch
>
>
> Return exception class as part of IOError thrown from the Thrift proxy or the 
> embedded Thrift server in the regionserver.



[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6881:
---

Attachment: trunk-6881.patch

> All regionservers are marked offline even there is still one up
> ---
>
> Key: HBASE-6881
> URL: https://issues.apache.org/jira/browse/HBASE-6881
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: trunk-6881.patch
>
>
> {noformat}
> +RegionPlan newPlan = plan;
> +if (!regionAlreadyInTransitionException) {
> +  // Force a new plan and reassign. Will return null if no servers.
> +  newPlan = getRegionPlan(state, plan.getDestination(), true);
> +}
> +if (newPlan == null) {
>this.timeoutMonitor.setAllRegionServersOffline(true);
>LOG.warn("Unable to find a viable location to assign region " +
>  state.getRegion().getRegionNameAsString());
> {noformat}
> Here, when newPlan is null, plan.getDestination() could be up actually.



[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6881:
---

Status: Patch Available  (was: Open)

> All regionservers are marked offline even there is still one up
> ---
>
> Key: HBASE-6881
> URL: https://issues.apache.org/jira/browse/HBASE-6881
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: trunk-6881.patch
>
>
> {noformat}
> +RegionPlan newPlan = plan;
> +if (!regionAlreadyInTransitionException) {
> +  // Force a new plan and reassign. Will return null if no servers.
> +  newPlan = getRegionPlan(state, plan.getDestination(), true);
> +}
> +if (newPlan == null) {
>this.timeoutMonitor.setAllRegionServersOffline(true);
>LOG.warn("Unable to find a viable location to assign region " +
>  state.getRegion().getRegionNameAsString());
> {noformat}
> Here, when newPlan is null, plan.getDestination() could be up actually.



[jira] [Updated] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-6882:
---

Attachment: D5679.1.patch

mbautin requested code review of "[jira] [HBASE-6882] [89-fb] Thrift IOError 
should include exception class".
Reviewers: Liyin, Karthik, aaiyer, chip, JIRA

  Return exception class as part of IOError thrown from the Thrift proxy or the 
embedded Thrift server in the regionserver.

TEST PLAN
  Unit tests
  Test through C++ HBase client

REVISION DETAIL
  https://reviews.facebook.net/D5679

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/RegionException.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/main/java/org/apache/hadoop/hbase/thrift/generated/IOError.java
  src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/13341/

To: Liyin, Karthik, aaiyer, chip, JIRA, mbautin


> Thrift IOError should include exception class
> -
>
> Key: HBASE-6882
> URL: https://issues.apache.org/jira/browse/HBASE-6882
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: D5679.1.patch
>
>
> Return exception class as part of IOError thrown from the Thrift proxy or the 
> embedded Thrift server in the regionserver.



[jira] [Created] (HBASE-6882) Thrift IOError should include exception class

2012-09-25 Thread Mikhail Bautin (JIRA)
Mikhail Bautin created HBASE-6882:
-

 Summary: Thrift IOError should include exception class
 Key: HBASE-6882
 URL: https://issues.apache.org/jira/browse/HBASE-6882
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin


Return exception class as part of IOError thrown from the Thrift proxy or the 
embedded Thrift server in the regionserver.
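
As a rough illustration of the idea, the error returned to the client could carry the fully qualified class name of the underlying exception next to its message. This is only a sketch: IOErrorSketch, its exceptionClass field, and the from() helper are hypothetical stand-ins, not the Thrift-generated IOError or the attached patch.

```java
import java.io.IOException;

/** Hypothetical stand-in for a Thrift-style IOError that also records the cause's class. */
final class IOErrorSketch extends Exception {
  private static final long serialVersionUID = 1L;

  // Assumed extra field: fully qualified class name of the original exception.
  private final String exceptionClass;

  IOErrorSketch(String message, String exceptionClass) {
    super(message);
    this.exceptionClass = exceptionClass;
  }

  String getExceptionClass() {
    return exceptionClass;
  }

  /** Wraps a server-side IOException, keeping both its message and its class name. */
  static IOErrorSketch from(IOException cause) {
    return new IOErrorSketch(cause.getMessage(), cause.getClass().getName());
  }
}
```

With something like this, a non-Java client could branch on the class name string instead of parsing the message text.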



[jira] [Commented] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463162#comment-13463162
 ] 

Jimmy Xiang commented on HBASE-6881:


Actually, this is NOT an issue caused by HBASE-6438; it is an existing issue. 
I fixed the description.

During unit tests, there may be just one region server. This can lead to 
HBASE-6880 and hanging tests.

> All regionservers are marked offline even there is still one up
> ---
>
> Key: HBASE-6881
> URL: https://issues.apache.org/jira/browse/HBASE-6881
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> {noformat}
> +RegionPlan newPlan = plan;
> +if (!regionAlreadyInTransitionException) {
> +  // Force a new plan and reassign. Will return null if no servers.
> +  newPlan = getRegionPlan(state, plan.getDestination(), true);
> +}
> +if (newPlan == null) {
>this.timeoutMonitor.setAllRegionServersOffline(true);
>LOG.warn("Unable to find a viable location to assign region " +
>  state.getRegion().getRegionNameAsString());
> {noformat}
> Here, when newPlan is null, the server returned by plan.getDestination() may actually still be up.



[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6881:
---

Description: 
{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
 state.getRegion().getRegionNameAsString());
{noformat}

Here, when newPlan is null, the server returned by plan.getDestination() may actually still be up.



  was:
This is an issue caused by HBASE-6438:

{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
 state.getRegion().getRegionNameAsString());
{noformat}

Here, when newPlan is null, the server returned by plan.getDestination() may actually still be up.




> All regionservers are marked offline even there is still one up
> ---
>
> Key: HBASE-6881
> URL: https://issues.apache.org/jira/browse/HBASE-6881
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> {noformat}
> +RegionPlan newPlan = plan;
> +if (!regionAlreadyInTransitionException) {
> +  // Force a new plan and reassign. Will return null if no servers.
> +  newPlan = getRegionPlan(state, plan.getDestination(), true);
> +}
> +if (newPlan == null) {
>this.timeoutMonitor.setAllRegionServersOffline(true);
>LOG.warn("Unable to find a viable location to assign region " +
>  state.getRegion().getRegionNameAsString());
> {noformat}
> Here, when newPlan is null, the server returned by plan.getDestination() may actually still be up.



[jira] [Created] (HBASE-6881) All regionservers are marked offline even there is still one up

2012-09-25 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6881:
--

 Summary: All regionservers are marked offline even there is still 
one up
 Key: HBASE-6881
 URL: https://issues.apache.org/jira/browse/HBASE-6881
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang


This is an issue caused by HBASE-6438:

{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
 state.getRegion().getRegionNameAsString());
{noformat}

Here, when newPlan is null, the server returned by plan.getDestination() may actually still be up.
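
One way to picture the missing check is below. It is a minimal sketch, assuming that the second argument to getRegionPlan names a server to exclude, so that a null plan only means "no other server is available". OfflineCheckSketch, its method, and the use of plain strings for server names are hypothetical and are not the code in the patch.

```java
import java.util.Set;

/** Hypothetical stand-in for the decision made when no new region plan is found. */
final class OfflineCheckSketch {
  private OfflineCheckSketch() {}

  /**
   * Returns true only if no new plan was found AND the original destination
   * is not among the servers currently known to be online.
   */
  static boolean shouldMarkAllServersOffline(
      String newPlanDestination, String originalDestination, Set<String> onlineServers) {
    if (newPlanDestination != null) {
      return false; // a viable location exists, nothing to mark offline
    }
    // Assumption: the original destination was excluded when asking for a new
    // plan, so it has to be checked separately before giving up on all servers.
    return !onlineServers.contains(originalDestination);
  }
}
```

In a single-regionserver unit test, the original destination would still be online, so this check would return false rather than flagging every server as offline.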





[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463144#comment-13463144
 ] 

nkeywal commented on HBASE-6401:


HDFS-3701 has just been fixed, so we may have a reasonable hdfs 1.1 version as 
HDFS-3703 made it as well. We need HDFS-3912 to be complete from a failure 
management point of view. Then there is the question of durability...

> HBase may lose edits after a crash if used with HDFS 1.0.3 or older
> ---
>
> Key: HBASE-6401
> URL: https://issues.apache.org/jira/browse/HBASE-6401
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.96.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Attachments: TestReadAppendWithDeadDN.java
>
>
> This comes from a hdfs bug, fixed in some hdfs versions. I haven't found the 
> hdfs jira for this.
> Context: HBase Write Ahead Log features. This is using hdfs append. If the 
> node crashes, the file that was written is read by other processes to replay 
> the action.
> - So we have in hdfs one (dead) process writing with another process reading.
> - But, despite the call to syncFs, we don't always see the data when we have 
> a dead node. It seems to be because the call in DFSClient#updateBlockInfo 
> ignores the ipc errors and sets the length to 0.
> - So we may miss all the writes to the last block if we try to connect to the 
> dead DN.
> hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
> hdfs branch-2 or trunk: we should not have the issue (but not tested)
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
> The attached test will fail ~50% of the time.



[jira] [Created] (HBASE-6880) Failure in assigning root causes system hang

2012-09-25 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6880:
--

 Summary: Failure in assigning root causes system hang
 Key: HBASE-6880
 URL: https://issues.apache.org/jira/browse/HBASE-6880
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang


While looking into a TestReplication failure, I found that assignRoot can 
sometimes fail, for example when the RS is not serving traffic yet.  In this 
case, the master keeps waiting for root to become available, which may never happen.
 
We need to gracefully terminate the master if root is not assigned properly.
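
A bounded wait is one way to avoid the hang. The following is only a sketch under stated assumptions: RootWaitSketch, waitForRoot, and the BooleanSupplier used to probe root availability are hypothetical names, not the master's actual code.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: wait for root with a deadline instead of waiting forever. */
final class RootWaitSketch {
  private RootWaitSketch() {}

  /**
   * Polls rootAvailable until it reports true or timeoutMillis elapses.
   * Returns false on timeout or interruption so the caller can shut down cleanly.
   */
  static boolean waitForRoot(BooleanSupplier rootAvailable, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!rootAvailable.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // give up: root never became available
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

A caller would treat a false return as the signal to abort the master rather than keep blocking.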



[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463115#comment-13463115
 ] 

Lars Hofhansl commented on HBASE-6401:
--

Hadoop-2 has other issues, though (see last few comments on HDFS-744).

> HBase may lose edits after a crash if used with HDFS 1.0.3 or older
> ---
>
> Key: HBASE-6401
> URL: https://issues.apache.org/jira/browse/HBASE-6401
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.96.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Attachments: TestReadAppendWithDeadDN.java
>
>
> This comes from a hdfs bug, fixed in some hdfs versions. I haven't found the 
> hdfs jira for this.
> Context: HBase Write Ahead Log features. This is using hdfs append. If the 
> node crashes, the file that was written is read by other processes to replay 
> the action.
> - So we have in hdfs one (dead) process writing with another process reading.
> - But, despite the call to syncFs, we don't always see the data when we have 
> a dead node. It seems to be because the call in DFSClient#updateBlockInfo 
> ignores the ipc errors and sets the length to 0.
> - So we may miss all the writes to the last block if we try to connect to the 
> dead DN.
> hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
> hdfs branch-2 or trunk: we should not have the issue (but not tested)
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
> The attached test will fail ~50% of the time.



[jira] [Commented] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463109#comment-13463109
 ] 

Jesse Yates commented on HBASE-5961:


hmmm, looks like we might need to add this to the rat excludes file too.

> New standard HBase code formatter
> -
>
> Key: HBASE-5961
> URL: https://issues.apache.org/jira/browse/HBASE-5961
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBase-Formmatter.xml
>
>
> There is currently no good way of passing out the formatter that is currently 
> the 'standard' in HBase. The standard Apache formatter is actually not very close 
> to what we are considering 'good'/'pretty' code. Further, it's not trivial to 
> get a good formatter set up.
> Proposing two things: 
> 1) Adding a formatter to the dev tools and calling out the formatter usage 
> in the docs
> 2) Moving to a 'better' formatter that is not the standard Apache formatter.



[jira] [Commented] (HBASE-5691) Importtsv stops the webservice from which it is evoked

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463106#comment-13463106
 ] 

Hudson commented on HBASE-5691:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


> Importtsv stops the webservice from which it is evoked
> --
>
> Key: HBASE-5691
> URL: https://issues.apache.org/jira/browse/HBASE-5691
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4
>Reporter: debarshi basak
>Priority: Minor
>
> I was trying to run importtsv from a servlet. Every time the job completed, 
> the Tomcat server was shut down.



[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463103#comment-13463103
 ] 

Hudson commented on HBASE-6637:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-6637 Argghh... Missed deleted files too (Revision 1390040)
HBASE-6637 Missed new files (Revision 1390035)
HBASE-6637 Move DaemonThreadFactory into Threads and Threads to hbase-common 
(Jesse Yates) (Revision 1390034)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java

larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java


> Move DaemonThreadFactory into Threads and Threads to hbase-common
> -
>
> Key: HBASE-6637
> URL: https://issues.apache.org/jira/browse/HBASE-6637
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
> hbase-6637-v0.patch, hbase-6637-v2.patch
>
>




[jira] [Commented] (HBASE-3678) Add Eclipse-based Apache Formatter to HBase Wiki

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463104#comment-13463104
 ] 

Hudson commented on HBASE-3678:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028)
HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add 
Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml

stack : 
Files : 
* /hbase/trunk/dev-support/hbase_eclipse_formatter.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


> Add Eclipse-based Apache Formatter to HBase Wiki
> 
>
> Key: HBASE-3678
> URL: https://issues.apache.org/jira/browse/HBASE-3678
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>Priority: Trivial
> Fix For: 0.92.0
>
> Attachments: eclipse_formatter_apache.xml
>
>
> Currently, on http://wiki.apache.org/hadoop/Hbase/HowToContribute , we tell 
> the user to follow Sun's code conventions and then add a couple things.  For 
> lazy people like myself, it would be much easier to just tell us to import an 
> Apache formatter into your Eclipse project and not worry about it.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463105#comment-13463105
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390013)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> HFiles contain their own checksums to reduce IOPS, so when HBase reads an 
> HFile, it doesn't need to read the checksum from the HDFS meta file.  But 
> HBase's HLog files don't contain checksums, so when HBase reads an HLog, it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum 
> per file to HDFS, or we could write checksums into the WAL if this skip 
> checksum facility is enabled.



[jira] [Commented] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463098#comment-13463098
 ] 

Jesse Yates commented on HBASE-6879:


[~saint@gmail.com] here's a stab at a code template to go with the 
formatter from HBASE-5961

> Add HBase Code Template
> ---
>
> Key: HBASE-6879
> URL: https://issues.apache.org/jira/browse/HBASE-6879
> Project: HBase
>  Issue Type: Bug
>  Components: build, documentation
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Attachments: HBase Code Template.xml
>
>
> Add a standard code template to go along with the code formatter for HBase. 
> This helps make sure people have the correct license and general commenting 
> for auto-generated elements.



[jira] [Updated] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6879:
---

Attachment: HBase Code Template.xml

Attaching template to go into hbase/dev-support. Easier to see this way than as 
an actual patch.

> Add HBase Code Template
> ---
>
> Key: HBASE-6879
> URL: https://issues.apache.org/jira/browse/HBASE-6879
> Project: HBase
>  Issue Type: Bug
>  Components: build, documentation
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Attachments: HBase Code Template.xml
>
>
> Add a standard code template to go along with the code formatter for HBase. 
> This helps make sure people have the correct license and general commenting 
> for auto-generated elements.



[jira] [Assigned] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates reassigned HBASE-6879:
--

Assignee: Jesse Yates

> Add HBase Code Template
> ---
>
> Key: HBASE-6879
> URL: https://issues.apache.org/jira/browse/HBASE-6879
> Project: HBase
>  Issue Type: Bug
>  Components: build, documentation
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>
> Add a standard code template to go along with the code formatter for HBase. 
> This helps make sure people have the correct license and general commenting 
> for auto-generated elements.



[jira] [Created] (HBASE-6879) Add HBase Code Template

2012-09-25 Thread Jesse Yates (JIRA)
Jesse Yates created HBASE-6879:
--

 Summary: Add HBase Code Template
 Key: HBASE-6879
 URL: https://issues.apache.org/jira/browse/HBASE-6879
 Project: HBase
  Issue Type: Bug
  Components: build, documentation
Reporter: Jesse Yates


Add a standard code template to go along with the code formatter for HBase. 
This helps make sure people have the correct license and general commenting for 
auto-generated elements.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463090#comment-13463090
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-0.94 #488 (See 
[https://builds.apache.org/job/HBase-0.94/488/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390012)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> HFiles contain their own checksums to reduce IOPS, so when HBase reads an 
> HFile, it doesn't need to read the checksum from the HDFS meta file.  But 
> HBase's HLog files don't contain checksums, so when HBase reads an HLog, it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum 
> per file to HDFS, or we could write checksums into the WAL if this skip 
> checksum facility is enabled.



[jira] [Updated] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6637:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Move DaemonThreadFactory into Threads and Threads to hbase-common
> -
>
> Key: HBASE-6637
> URL: https://issues.apache.org/jira/browse/HBASE-6637
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
> hbase-6637-v0.patch, hbase-6637-v2.patch
>
>




[jira] [Updated] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6637:
-


Committed to 0.96 (forgot the new files at first; added those in a 2nd commit).

> Move DaemonThreadFactory into Threads and Threads to hbase-common
> -
>
> Key: HBASE-6637
> URL: https://issues.apache.org/jira/browse/HBASE-6637
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
> hbase-6637-v0.patch, hbase-6637-v2.patch
>
>




[jira] [Commented] (HBASE-6784) TestCoprocessorScanPolicy is sometimes flaky when run locally

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463081#comment-13463081
 ] 

Hudson commented on HBASE-6784:
---

Integrated in HBase-0.94-security #57 (See 
[https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6784 TestCoprocessorScanPolicy is sometimes flaky when run locally 
(Revision 1389619)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestCoprocessorScanPolicy.java


> TestCoprocessorScanPolicy is sometimes flaky when run locally
> -
>
> Key: HBASE-6784
> URL: https://issues.apache.org/jira/browse/HBASE-6784
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6784.txt
>
>
> The problem is not seen in the Jenkins build.  
> When we run TestCoprocessorScanPolicy.testBaseCases locally or in our 
> internal Jenkins, we tend to get random failures.  The reason is that the 2 
> puts we do here sometimes get the same timestamp.  This leads to improper 
> scan results, as the version check tends to skip one of the rows when it 
> sees that the timestamps are the same. Marking this as minor.  As we are 
> trying to solve testcase-related failures, just raising this in case we 
> need to resolve this also.
> For example, both puts are getting the time
> {code}
> time 1347635287360
> time 1347635287360
> {code}
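
One way a test could rule out the collision shown above is to hand each put an explicit, strictly increasing timestamp. This is a minimal sketch of that idea: MonotonicClockSketch is a hypothetical helper, not part of HBase, and not necessarily what the attached patch does.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper that never hands out the same timestamp twice. */
final class MonotonicClockSketch {
  private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

  /**
   * Returns wallClockMillis if it is newer than the last value handed out,
   * otherwise the last value plus one millisecond.
   */
  long next(long wallClockMillis) {
    return last.updateAndGet(prev -> Math.max(prev + 1, wallClockMillis));
  }
}
```

Passing System.currentTimeMillis() into next() for each put would keep the two versions distinct even when both calls land in the same millisecond.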



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463079#comment-13463079
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-0.94-security #57 (See 
[https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390012)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> HFiles contain their own checksums to reduce IOPS, so when HBase reads an 
> HFile, it doesn't need to read the checksum from the HDFS meta file.  But 
> HBase's HLog files don't contain checksums, so when HBase reads an HLog, it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum 
> per file to HDFS, or we could write checksums into the WAL if this skip 
> checksum facility is enabled.



[jira] [Commented] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463080#comment-13463080
 ] 

Hudson commented on HBASE-6851:
---

Integrated in HBase-0.94-security #57 (See 
[https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6851  Fix race condition in TableAuthManager.updateGlobalCache() 
(Revision 1388898)

 Result = SUCCESS
garyh : 
Files : 
* 
/hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java
* 
/hbase/branches/0.94/security/src/test/java/org/apache/hadoop/hbase/security/access/TestTablePermissions.java


> Race condition in TableAuthManager.updateGlobalCache()
> --
>
> Key: HBASE-6851
> URL: https://issues.apache.org/jira/browse/HBASE-6851
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.94.1, 0.96.0
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
> Fix For: 0.94.2, 0.96.0
>
> Attachments: HBASE-6851_2.patch, HBASE-6851_3.patch, HBASE-6851.patch
>
>
> When new global permissions are assigned, there is a race condition, during 
> which further authorization checks relying on global permissions may fail.
> In TableAuthManager.updateGlobalCache(), we have:
> {code:java}
> USER_CACHE.clear();
> GROUP_CACHE.clear();
> try {
>   initGlobal(conf);
> } catch (IOException e) {
>   // Never happens
>   LOG.error("Error occured while updating the user cache", e);
> }
> for (Map.Entry entry : userPerms.entries()) {
>   if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
>     GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
>         new Permission(entry.getValue().getActions()));
>   } else {
>     USER_CACHE.put(entry.getKey(),
>         new Permission(entry.getValue().getActions()));
>   }
> }
> {code}
> If authorization checks come in following the .clear() but before 
> repopulating, they will fail.
> We should have some synchronization here to serialize multiple updates and 
> use a COW type rebuild and reassign of the new maps.
> This particular issue crept in with the fix in HBASE-6157, so I'm flagging 
> for 0.94 and 0.96.
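
The "COW type rebuild and reassign" suggested above might look roughly like the sketch below. It is only an illustration under stated assumptions: `GlobalPermissionCache`, `update`, and `lookup` are hypothetical names, permissions are reduced to plain strings, and this is not the code from the attached patches.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the global permission cache; not the HBase class. */
class GlobalPermissionCache {
  // volatile: readers see either the old map or the fully built new one,
  // never a cleared or half-populated map.
  private volatile Map<String, String> userCache = Collections.emptyMap();

  /** synchronized serializes writers; the new map is built aside, then published. */
  synchronized void update(Map<String, String> newPerms) {
    Map<String, String> rebuilt = new HashMap<>(newPerms);
    userCache = Collections.unmodifiableMap(rebuilt);
  }

  /** Lock-free read against whichever snapshot is currently published. */
  String lookup(String user) {
    return userCache.get(user);
  }
}
```

Because the published map is never mutated after assignment, an authorization check that arrives during an update simply sees the previous snapshot instead of an empty cache.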



[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463064#comment-13463064
 ] 

Jesse Yates commented on HBASE-6637:


As mentioned, failing tests passed locally...

> Move DaemonThreadFactory into Threads and Threads to hbase-common
> -
>
> Key: HBASE-6637
> URL: https://issues.apache.org/jira/browse/HBASE-6637
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, 
> hbase-6637-v0.patch, hbase-6637-v2.patch
>
>




[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463063#comment-13463063
 ] 

Jesse Yates commented on HBASE-6702:


Good stuff keywal! Just a couple comments:
{code}
+  <artifactId>hbase-common</artifactId>
+  <version>${project.version}</version>
+  <type>test-jar</type>
+  <scope>test</scope>
+
+
{code}

To keep DRY, the above should go into hbase/pom.xml's dependencyManagement 
section, and then the child projects should just use:
{code}
+  <artifactId>hbase-common</artifactId>
+  <type>test-jar</type>
+
+
{code}

Also, any chance for some javadocs on things like:
{code}
+  public ResourceChecker(String tagLine) {
+    this.tagLine = tagLine;
+  }
{code}

Otherwise, this is a really sweet add.


> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 6702.v1.patch, 6702.v4.patch
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
> issues, but has not been widely utilized since. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that are moved into the 
> hbase-common module because bringing the ResourceChecker up to hbase-common 
> would involve bringing all its dependencies (which are quite far reaching).
> The question then is, what should we do with it? Get rid of it? Refactor and 
> reuse? 



[jira] [Resolved] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5961.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

Committed to trunk. Thanks for the patch Jesse.

> New standard HBase code formatter
> -
>
> Key: HBASE-5961
> URL: https://issues.apache.org/jira/browse/HBASE-5961
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBase-Formmatter.xml
>
>
> There is currently no good way of passing out the formatter that is currently the 
> 'standard' in HBase. The standard Apache formatter is actually not very close 
> to what we are considering 'good'/'pretty' code. Further, it's not trivial to 
> get a good formatter set up.
> Proposing two things: 
> 1) Adding a formatter to the dev tools and calling out the formatter usage 
> in the docs
> 2) Moving to a 'better' formatter that is not the standard Apache formatter.



[jira] [Commented] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463047#comment-13463047
 ] 

stack commented on HBASE-5961:
--

I committed this formatter under dev-support and added the how-to-install doc 
from HBASE-3678.

> New standard HBase code formatter
> -
>
> Key: HBASE-5961
> URL: https://issues.apache.org/jira/browse/HBASE-5961
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBase-Formmatter.xml
>
>
> There is currently no good way of passing out the formatter that is currently the 
> 'standard' in HBase. The standard Apache formatter is actually not very close 
> to what we are considering 'good'/'pretty' code. Further, it's not trivial to 
> get a good formatter set up.
> Proposing two things: 
> 1) Adding a formatter to the dev tools and calling out the formatter usage 
> in the docs
> 2) Moving to a 'better' formatter that is not the standard Apache formatter.



[jira] [Commented] (HBASE-6853) IllegalArgument Exception is thrown when an empty region is spliitted.

2012-09-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463039#comment-13463039
 ] 

ramkrishna.s.vasudevan commented on HBASE-6853:
---

@Stack
Can we commit patch 1?

> IllegalArgument Exception is thrown when an empty region is spliitted.
> --
>
> Key: HBASE-6853
> URL: https://issues.apache.org/jira/browse/HBASE-6853
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.1
>Reporter: ramkrishna.s.vasudevan
> Attachments: HBASE-6853_2_splitsuccess.patch, 
> HBASE-6853_splitfailure.patch
>
>
> This is w.r.t. a mail sent to the dev mailing list.
> Empty region splits should be handled gracefully.  Either we should not allow 
> the split to happen if we know that the region is empty, or we should allow 
> the split to happen by setting the number of threads for the thread pool 
> executor to 1.
> {code}
> int nbFiles = hstoreFilesToSplit.size();
> ThreadFactoryBuilder builder = new ThreadFactoryBuilder();
> builder.setNameFormat("StoreFileSplitter-%1$d");
> ThreadFactory factory = builder.build();
> ThreadPoolExecutor threadPool =
>   (ThreadPoolExecutor) Executors.newFixedThreadPool(nbFiles, factory);
> List<Future<Void>> futures = new ArrayList<Future<Void>>(nbFiles);
> {code}
> Here nbFiles needs to be a positive, non-zero value.
>  
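
As an illustration of the second option described above (sizing the pool to at least 1), here is a minimal sketch. `SplitPoolSizing` and its methods are hypothetical names used only for this example; the attached patches may take a different approach. The only library fact relied on is that `Executors.newFixedThreadPool(0)` throws `IllegalArgumentException`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical helper; not part of SplitTransaction. */
class SplitPoolSizing {
  /** newFixedThreadPool rejects a size of 0, so clamp to at least one thread. */
  static int poolSize(int nbFiles) {
    return Math.max(1, nbFiles);
  }

  /** Creates the splitter pool without failing for an empty region. */
  static ExecutorService newSplitterPool(int nbFiles) {
    return Executors.newFixedThreadPool(poolSize(nbFiles));
  }
}
```

The alternative from the description, refusing to split an empty region at all, would avoid creating an idle thread but changes the behaviour visible to the caller.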



[jira] [Commented] (HBASE-6854) Deletion of SPLITTING node on split rollback should clear the region from RIT

2012-09-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463037#comment-13463037
 ] 

ramkrishna.s.vasudevan commented on HBASE-6854:
---

I found that the test case added with this patch sometimes fails.  There seems 
to be an issue in the AM (AssignmentManager) and the way the watcher is set.
I will debug it and then commit the patch, though it is only a test case change.

> Deletion of SPLITTING node on split rollback should clear the region from RIT
> -
>
> Key: HBASE-6854
> URL: https://issues.apache.org/jira/browse/HBASE-6854
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.94.3
>
> Attachments: HBASE-6854.patch
>
>
> If a failure happens in split before OFFLINING_PARENT, we tend to roll back 
> the split, including deleting the znodes created.
> On deletion of the RS_ZK_SPLITTING node we get a callback but do not 
> remove the region from RIT. We need to remove it from RIT; the SSH logic is well 
> guarded anyway in case the delete event comes from an RS-down scenario.



[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2012-09-25 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463031#comment-13463031
 ] 

Himanshu Vashishtha commented on HBASE-6870:


Looked at the patch:

Can you combine these two if statements into one?
{code}
+if (Bytes.compareTo(start, startKeys[i]) >= 0) {
+  if (Bytes.equals(endKeys[i], HConstants.EMPTY_END_ROW)
+  || Bytes.compareTo(start, endKeys[i]) < 0) {
+rangeKeys.add(start);
+  }
{code}
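
For illustration, the two nested conditions could be folded into a single boolean expression along these lines. This is a sketch only: `RangeCheck`, `compare`, and `inRegion` are hypothetical names, `compare` stands in for an unsigned lexicographic comparison such as `Bytes.compareTo`, and an empty end key is assumed to mean "no upper bound" as `HConstants.EMPTY_END_ROW` does.

```java
/** Hypothetical helper showing the merged condition; not the patch code. */
class RangeCheck {
  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  /** True if start lies in [regionStart, regionEnd); an empty regionEnd is unbounded. */
  static boolean inRegion(byte[] start, byte[] regionStart, byte[] regionEnd) {
    return compare(start, regionStart) >= 0
        && (regionEnd.length == 0 || compare(start, regionEnd) < 0);
  }
}
```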

Can it be private?
{code}
+  public LinkedHashMap getKeysToRegionsInRange(
{code}

Re: Andrew's concern regarding cache use: will HBASE-6877 take care of region 
moves too? The cache may become stale for reasons other than splits. Will look 
at HBASE-6877.

> HTable#coprocessorExec always scan the whole table 
> ---
>
> Key: HBASE-6870
> URL: https://issues.apache.org/jira/browse/HBASE-6870
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 0.94.1
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
> HBASE-6870v2.patch, HBASE-6870v3.patch
>
>
> In the current logic, HTable#coprocessorExec always scans the whole table. Its 
> efficiency is low and it will affect the RegionServer carrying .META. under 
> large coprocessorExec requests.



[jira] [Updated] (HBASE-6784) TestCoprocessorScanPolicy is sometimes flaky when run locally

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6784:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

> TestCoprocessorScanPolicy is sometimes flaky when run locally
> -
>
> Key: HBASE-6784
> URL: https://issues.apache.org/jira/browse/HBASE-6784
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6784.txt
>
>
> The problem is not seen in the Jenkins build.  
> When we run TestCoprocessorScanPolicy.testBaseCases locally or in our 
> internal Jenkins we tend to get random failures.  The reason is that the 2 puts 
> that we do here sometimes get the same timestamp.  This leads to 
> improper scan results, as the version check tends to skip one of the rows 
> on seeing that the timestamps are the same. Marking this as minor.  As we are 
> trying to solve test case related failures, just raising this in case we need 
> to resolve this also.
> For example, 
> both the puts are getting the time
> {code}
> time 1347635287360
> time 1347635287360
> {code}
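
One general way for a test to avoid two writes sharing a millisecond timestamp is to hand out strictly increasing timestamps. The sketch below shows that idea only; `MonotonicClock` is a hypothetical helper, and nothing here is taken from the attached 6784.txt, which may solve the problem differently.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical test helper; not part of HBase. */
class MonotonicClock {
  private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

  /**
   * Returns the wall-clock time in milliseconds, bumped forward by 1 ms
   * whenever it would otherwise repeat or go backwards.
   */
  long next() {
    return last.updateAndGet(prev -> Math.max(prev + 1, System.currentTimeMillis()));
  }
}
```

A test could then pass `clock.next()` as the explicit timestamp of each put, so that two puts issued within the same millisecond still end up as distinct versions.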



[jira] [Resolved] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6868.
--

  Resolution: Fixed
Assignee: Lars Hofhansl
Hadoop Flags: Reviewed

Committed to 0.94 and 0.96.

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Updated] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6851:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

> Race condition in TableAuthManager.updateGlobalCache()
> --
>
> Key: HBASE-6851
> URL: https://issues.apache.org/jira/browse/HBASE-6851
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.94.1, 0.96.0
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Critical
> Fix For: 0.94.2, 0.96.0
>
> Attachments: HBASE-6851_2.patch, HBASE-6851_3.patch, HBASE-6851.patch
>
>
> When new global permissions are assigned, there is a race condition, during 
> which further authorization checks relying on global permissions may fail.
> In TableAuthManager.updateGlobalCache(), we have:
> {code:java}
> USER_CACHE.clear();
> GROUP_CACHE.clear();
> try {
>   initGlobal(conf);
> } catch (IOException e) {
>   // Never happens
>   LOG.error("Error occured while updating the user cache", e);
> }
> for (Map.Entry entry : userPerms.entries()) {
>   if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
>     GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
>         new Permission(entry.getValue().getActions()));
>   } else {
>     USER_CACHE.put(entry.getKey(),
>         new Permission(entry.getValue().getActions()));
>   }
> }
> {code}
> If authorization checks come in following the .clear() but before 
> repopulating, they will fail.
> We should have some synchronization here to serialize multiple updates and 
> use a COW type rebuild and reassign of the new maps.
> This particular issue crept in with the fix in HBASE-6157, so I'm flagging 
> for 0.94 and 0.96.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463017#comment-13463017
 ] 

Lars Hofhansl commented on HBASE-6868:
--

I manually did these tests (0.94 patch):
* started HBase with HBase checksums off, inserted some data, flushed, 
compacted, scanned
* restarted HBase with HBase checksums on, inserted some more data, 
flush/compacted, scanned
* restarted HBase again with HBase checksums off, inserted some more data, 
flush/compacted, scanned

Checked the logs for anything weird. Looks good. Going to commit to 0.94 and 
0.96.

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Attachment: 6868-0.94.txt

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Status: Open  (was: Patch Available)

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.1, 0.94.0
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462925#comment-13462925
 ] 

Lars Hofhansl commented on HBASE-6868:
--

I looked through the run, nothing stuck out... All the tests passed.

I'll do some manual testing today and then commit.

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462923#comment-13462923
 ] 

Hadoop QA commented on HBASE-6868:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546437/6868-0.96-v3.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//console

This message is automatically generated.

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Commented] (HBASE-6736) Distributed Split: a split tasks can be mark as DONE but keep unassigned

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462897#comment-13462897
 ] 

nkeywal commented on HBASE-6736:


There are multiple synchronization issues. 

One of them is 
{code}
@Override
protected void chore() {
  // [...]
  for (Map.Entry e : tasks.entrySet()) {
{code}

As we're iterating over a set that can be modified we can have reliability 
issues, cf. javadoc: "If the map is modified while an iteration over the set is 
in progress (except through the iterator's own remove operation, or through the 
setValue operation on a map entry returned by the iterator) the results of the 
iteration are undefined."
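
The javadoc caveat quoted above can be seen in a small, single-threaded sketch. `IterationSafety` and its keys are hypothetical and unrelated to SplitLogManager's actual `tasks` field; the point is only that a plain `HashMap` fails fast when a key is added during iteration, while a `ConcurrentHashMap` iterator is weakly consistent and keeps going.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical demo class; not SplitLogManager code. */
class IterationSafety {
  /** Adds a key mid-iteration; returns true if the loop finished without a CME. */
  static boolean survivesModification(Map<String, Integer> tasks) {
    tasks.put("a", 1);
    tasks.put("b", 2);
    tasks.put("c", 3);
    try {
      for (Map.Entry<String, Integer> e : tasks.entrySet()) {
        // Structural change while the entry-set iterator is still live.
        tasks.put("added-during-iteration", e.getValue());
      }
      return true;
    } catch (ConcurrentModificationException cme) {
      return false;
    }
  }

  static boolean hashMapSurvives() {
    return survivesModification(new HashMap<>());
  }

  static boolean concurrentMapSurvives() {
    return survivesModification(new ConcurrentHashMap<>());
  }
}
```

Note that a weakly consistent iterator only guarantees that iteration does not fail; it may or may not reflect entries added after it was created, so logic that needs a stable view would still have to work on a snapshot or hold a lock.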



> Distributed Split: a split tasks can be mark as DONE but keep unassigned
> 
>
> Key: HBASE-6736
> URL: https://issues.apache.org/jira/browse/HBASE-6736
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.96.0
>Reporter: nkeywal
>
> Real cluster, scenario mentioned on HBASE-5843.
> Got it once out of 5 tests on 0.96
> Didn't get it on 0.94 after 3 tests.
> It seems we have a race condition on split logs: the task was nearly 
> simultaneously marked as done and resubmitted. Then it remained in the 
> unassigned state.
> 2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
> total tasks = 1 unassigned = 0
> 2012-09-04 17:27:06,237 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
> resubmitted 1 out of 1 tasks
> 2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
> task not yet acquired 
> /hbase/splitlog/hdfs%3A%2F%2FBOX1%3A9000%2Fhbase%2F.logs%2FBOX0%2C60020%2C1346772046399-splitting%2FBOX0%252C60020%252C1346772046399.1346772046609
>  ver = 7
> 2012-09-04 17:27:06,314 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
> task /hbase/splitlog/RESCAN02 entered state: DONE 
> BOX1,6,1346771990737
> 2012-09-04 17:27:06,337 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/RESCAN02
> 2012-09-04 17:27:06,337 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
> deleted task without in memory state /hbase/splitlog/RESCAN02
> 2012-09-04 17:27:07,226 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
> total tasks = 1 unassigned = 1



[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6868:
-

Status: Patch Available  (was: Open)

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.1, 0.94.0
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
>
>
> The HFile contains checksums to decrease IOPS, so when HBase reads an HFile 
> it doesn't need to read the checksum from the HDFS meta file.  But the HLog 
> file of HBase doesn't contain checksums, so when HBase reads the HLog it 
> must read the checksum from the HDFS meta file.  We could add setSkipChecksum per 
> file to HDFS, or we could write checksums into the WAL if this skip checksum 
> facility is enabled.



[jira] [Commented] (HBASE-5961) New standard HBase code formatter

2012-09-25 Thread Cody Marcel (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462851#comment-13462851
 ] 

Cody Marcel commented on HBASE-5961:


Nice!

> New standard HBase code formatter
> -
>
> Key: HBASE-5961
> URL: https://issues.apache.org/jira/browse/HBASE-5961
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>Priority: Minor
> Attachments: HBase-Formmatter.xml
>
>
> There is currently no good way of passing out the formatter that is currently 
> the 'standard' in HBase. The standard Apache formatter is actually not very 
> close to what we consider 'good'/'pretty' code. Further, it's not trivial to 
> set up a good formatter.
> Proposing two things: 
> 1) Adding a formatter to the dev tools and calling out the formatter usage 
> in the docs
> 2) Moving to a 'better' formatter that is not the standard Apache formatter.



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462757#comment-13462757
 ] 

nkeywal commented on HBASE-4955:


Monthly update...
Surefire: the regression on elapsed time is fixed on 2.12.4 (not tested). Still 
waiting for #800. Maybe it will make it into 2.13. No date.
JUnit: no life there. Still, a release this quarter is likely...



> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
>
> We currently use private versions for Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need to move to official versions.
> Surefire 2.11 is just out, but, after some tests, it does not contain 
> everything we need.
> JUnit. Could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed 
> on our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on 
> trunk, fixed on our version
> 800 & 793 are the most important to monitor; they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Updated] (HBASE-6878) DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving

2012-09-25 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6878:
---

Description: 
The code in SplitLogManager# getDataSetWatchSuccess is:
{code}
if (slt.isDone()) {
  LOG.info("task " + path + " entered state: " + slt.toString());
  if (taskFinisher != null && !ZKSplitLog.isRescanNode(watcher, path)) {
    if (taskFinisher.finish(slt.getServerName(),
        ZKSplitLog.getFileName(path)) == Status.DONE) {
      setDone(path, SUCCESS);
    } else {
      resubmitOrFail(path, CHECK);
    }
  } else {
    setDone(path, SUCCESS);
  }
{code}

  resubmitOrFail(path, CHECK);

should be 
  resubmitOrFail(path, FORCE);

Without it, the task won't be resubmitted if the delay is not reached, and the 
task will be marked as failed.
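
The difference between the two directives can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyTask, ToyResubmitPolicy and the timeout parameter are hypothetical names, not the actual SplitLogManager code, which may apply further checks. It only shows why CHECK can refuse a resubmission that FORCE would allow.

```java
/** Hypothetical stand-in for the two resubmit directives discussed above. */
enum ResubmitDirective { CHECK, FORCE }

/** Toy model of a split-log task; not the real SplitLogManager task class. */
final class ToyTask {
  final long lastUpdateMillis;

  ToyTask(long lastUpdateMillis) {
    this.lastUpdateMillis = lastUpdateMillis;
  }
}

/** Toy policy: CHECK honours a resubmit delay, FORCE bypasses it. */
final class ToyResubmitPolicy {
  private final long timeoutMillis; // assumed resubmit delay

  ToyResubmitPolicy(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /** Returns true when the task may be resubmitted right now. */
  boolean shouldResubmit(ToyTask task, ResubmitDirective directive, long nowMillis) {
    if (directive == ResubmitDirective.FORCE) {
      return true; // no delay check at all
    }
    // CHECK: only resubmit once the delay since the last update has elapsed.
    return nowMillis - task.lastUpdateMillis >= timeoutMillis;
  }
}
```

In this model, a task whose archiving failed one second after its last update is rejected under CHECK with a 25 second delay, which matches the failure described in this issue, while FORCE resubmits it immediately.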



> DistributerLogSplit can fail to resubmit a task done if there is an exception 
> during the log archiving
> --
>
> Key: HBASE-6878
> URL: https://issues.apache.org/jira/browse/HBASE-6878
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: nkeywal
>Priority: Minor
>
> The code in SplitLogManager# getDataSetWatchSuccess is:
> {code}
> if (slt.isDone()) {
>   LOG.info("task " + path + " entered state: " + slt.toString());
>   if (taskFinisher != null && !ZKSplitLog.isRescanNode(watcher, path)) {
>     if (taskFinisher.finish(slt.getServerName(),
>         ZKSplitLog.getFileName(path)) == Status.DONE) {
>       setDone(path, SUCCESS);
>     } else {
>       resubmitOrFail(path, CHECK);
>     }
>   } else {
>     setDone(path, SUCCESS);
>   }
> {code}
>   resubmitOrFail(path, CHECK);
> should be 
>   resubmitOrFail(path, FORCE);
> Without it, the task won't be resubmitted if the delay is not reached, and 
> the task will be marked as failed.



[jira] [Created] (HBASE-6878) DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving

2012-09-25 Thread nkeywal (JIRA)
nkeywal created HBASE-6878:
--

 Summary: DistributerLogSplit can fail to resubmit a task done if 
there is an exception during the log archiving
 Key: HBASE-6878
 URL: https://issues.apache.org/jira/browse/HBASE-6878
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: nkeywal
Priority: Minor






[jira] [Commented] (HBASE-6309) [MTTR] Do NN operations outside of the ZK EventThread in SplitLogManager

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462745#comment-13462745
 ] 

nkeywal commented on HBASE-6309:


I was having a look at this. Could we have the log archiving done by the 
regionserver instead of the master? This would lower the work done in the event 
thread. The only remaining work would be the renaming of the region log dir at 
the end. 

I see one impact: if the same log was processed simultaneously by multiple 
region servers, this archiving could occur in parallel on two different region 
servers. Manageable I think...
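
For comparison, another way to lighten the event thread is to hand the blocking filesystem work to a separate worker. The sketch below is an assumption-laden illustration, not a proposed patch: OffloadingHandler, onTaskDone and the "finish-worker" thread name are hypothetical, and the filesystem calls are replaced by a placeholder.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical handler; none of these names are HBase or ZooKeeper APIs. */
final class OffloadingHandler {
  // A single worker keeps completions in order while freeing the event thread.
  private final ExecutorService worker =
      Executors.newSingleThreadExecutor(r -> new Thread(r, "finish-worker"));
  private final List<String> completed = new CopyOnWriteArrayList<>();

  /** Called on the event thread: queue the slow work and return at once. */
  void onTaskDone(String path) {
    worker.execute(() -> {
      // Placeholder for the blocking filesystem calls (rename, archive).
      completed.add(path + " finished on " + Thread.currentThread().getName());
    });
  }

  /** Stops accepting work and waits for queued work; true if all of it ran. */
  boolean drain(long timeoutMillis) {
    worker.shutdown();
    try {
      return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Records of completed work, in completion order. */
  List<String> completed() {
    return completed;
  }
}
```

The single-threaded executor is a deliberate choice in this sketch: it preserves the ordering the event thread would have given, at the cost of not parallelizing the filesystem calls.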

> [MTTR] Do NN operations outside of the ZK EventThread in SplitLogManager
> 
>
> Key: HBASE-6309
> URL: https://issues.apache.org/jira/browse/HBASE-6309
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.96.0
>
>
> We found this issue during the leap second cataclysm which prompted a 
> distributed splitting of all our logs.
> I saw that none of the RS were splitting after some time while the master was 
> showing that it wasn't even 30% done. jstack'ing I saw this:
> {noformat}
> "main-EventThread" daemon prio=10 tid=0x7f6ce46d8800 nid=0x5376 in
> Object.wait() [0x7f6ce2ecb000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at org.apache.hadoop.ipc.Client.call(Client.java:1093)
> - locked <0x0005fdd661a0> (a org.apache.hadoop.ipc.Client$Call)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
> at $Proxy9.rename(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> at $Proxy9.rename(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:759)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:253)
> at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.moveRecoveredEditsFromTemp(HLogSplitter.java:553)
> at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.moveRecoveredEditsFromTemp(HLogSplitter.java:519)
> at 
> org.apache.hadoop.hbase.master.SplitLogManager$1.finish(SplitLogManager.java:138)
> at 
> org.apache.hadoop.hbase.master.SplitLogManager.getDataSetWatchSuccess(SplitLogManager.java:431)
> at 
> org.apache.hadoop.hbase.master.SplitLogManager.access$1200(SplitLogManager.java:95)
> at 
> org.apache.hadoop.hbase.master.SplitLogManager$GetDataAsyncCallback.processResult(SplitLogManager.java:1011)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:571)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
> {noformat}
> We are effectively bottlenecking on doing NN operations and whatever else is 
> happening in GetDataAsyncCallback. It was so bad that on our 100 offline 
> cluster it took a few hours for the master to process all the incoming ZK 
> events while the actual splitting took a fraction of that time.
> I'm marking this as critical and against 0.96 but depending on how involved 
> the fix is we might want to backport.



[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462726#comment-13462726
 ] 

nkeywal commented on HBASE-6702:


Seems ok...

> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 6702.v1.patch, 6702.v4.patch
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
> issues, but has since not been widely utilized. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that are moved into the 
> hbase-common module because bringing the ResourceChecker up to hbase-common 
> would involve bringing all its dependencies (which are quite far reaching).
> The question then is, what should we do with it? Get rid of it? Refactor and 
> reuse? 



[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462724#comment-13462724
 ] 

Hadoop QA commented on HBASE-6702:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546497/6702.v4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 858 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.coprocessor.TestRowProcessorEndpoint

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2930//console

This message is automatically generated.

> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 6702.v1.patch, 6702.v4.patch
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
> issues, but has since not been widely utilized. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that are moved into the 
> hbase-common module because bringing the ResourceChecker up to hbase-common 
> would involve bringing all its dependencies (which are quite far reaching).
> The question then is, what should we do with it? Get rid of it? Refactor and 
> reuse? 



[jira] [Commented] (HBASE-6737) NullPointerException at regionserver.wal.SequenceFileLogWriter.append

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462701#comment-13462701
 ] 

nkeywal commented on HBASE-6737:


Stack 1: It seems to be an expected case, from the code:
{code}
  @Override
  public void append(HLog.Entry entry) throws IOException {
    entry.setCompressionContext(compressionContext);
    try {
      this.writer.append(entry.getKey(), entry.getEdit());
    } catch (NullPointerException npe) {
      // Concurrent close...
      throw new IOException(npe);
    }
  }
{code}
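
For illustration only, the same "concurrent close" situation can be handled without relying on a NullPointerException by reading the field once and checking it. This is a toy sketch: ToyLogWriter is hypothetical, and it assumes the underlying writer reference is set to null on close, which may not match how SequenceFileLogWriter actually behaves.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Toy writer, not the real SequenceFileLogWriter. */
final class ToyLogWriter {
  // volatile so that a close() on another thread becomes visible to append().
  private volatile List<String> sink = new ArrayList<>();

  /** Appends an entry, or fails with IOException if the writer was closed. */
  void append(String entry) throws IOException {
    List<String> local = sink; // read the field exactly once
    if (local == null) {
      throw new IOException("writer already closed");
    }
    // A close() racing with this line cannot cause an NPE, because 'local'
    // still refers to the old sink. This toy does not otherwise synchronize
    // writers, so it is only safe with a single appending thread.
    local.add(entry);
  }

  /** Marks the writer closed; later appends fail with IOException. */
  void close() {
    sink = null;
  }
}
```

Either way the caller sees an IOException; the difference is only whether the closed state is detected explicitly or inferred from the NPE.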


> NullPointerException at regionserver.wal.SequenceFileLogWriter.append
> -
>
> Key: HBASE-6737
> URL: https://issues.apache.org/jira/browse/HBASE-6737
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Priority: Critical
>
> Real cluster, scenario in HBASE-5843.
> There are two exceptions, I create a single JIRA with both of them.
> 2012-09-04 18:14:49,264 FATAL 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got 
> while writing log entry to log
> java.io.IOException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:229)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:949)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:891)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:226)
>   ... 3 more
> 2012-09-04 18:15:52,546 ERROR 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Error in log splitting 
> write thread
> java.lang.reflect.UndeclaredThrowableException
>   at $Proxy7.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:875)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:768)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getRegionSplitEditsPath(HLogSplitter.java:559)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWAP(HLogSplitter.java:974)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.access$800(HLogSplitter.java:82)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.getWriterAndPath(HLogSplitter.java:1309)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:942)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:919)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:891)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:261)
>   ... 11 more
> Caused by: java.io.IOException: Call to BOX1/192.168.15.5:9000 failed on 
> local exception: java.nio.channels.ClosedByInterruptException
>   at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1075)
>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>   at $Proxy7.getFileInfo(Unknown Source)
>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>   at $Proxy7.getFileInfo(Unknown Source)
>   ... 15 more
> Caused by: java.nio.channels.ClosedByInterruptException
>   at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>   at sun.nio.

[jira] [Updated] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6702:
---

Attachment: 6702.v4.patch

> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 6702.v1.patch, 6702.v4.patch
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
> issues, but has since not been widely utilized. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that are moved into the 
> hbase-common module because bringing the ResourceChecker up to hbase-common 
> would involve bringing all its dependencies (which are quite far reaching).
> The question then is, what should we do with it? Get rid of it? Refactor and 
> reuse? 



[jira] [Updated] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6702:
---

Status: Patch Available  (was: Open)

> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 6702.v1.patch, 6702.v4.patch
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
> issues, but has since not been widely utilized. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that are moved into the 
> hbase-common module because bringing the ResourceChecker up to hbase-common 
> would involve bringing all its dependencies (which are quite far reaching).
> The question then is, what should we do with it? Get rid of it? Refactor and 
> reuse? 



[jira] [Commented] (HBASE-5835) [hbck] Catch and handle NotServingRegionException when close region attempt fails

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462665#comment-13462665
 ] 

Hadoop QA commented on HBASE-5835:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12542730/HBASE-5835.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestLogRolling

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2929//console

This message is automatically generated.

> [hbck] Catch and handle NotServingRegionException when close region attempt 
> fails
> -
>
> Key: HBASE-5835
> URL: https://issues.apache.org/jira/browse/HBASE-5835
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>Reporter: Jonathan Hsieh
> Attachments: HBASE-5835.patch
>
>
> Currently, if hbck attempts to close a region and catches a 
> NotServingRegionException, hbck may hang outputting a stack trace.  Since the 
> goal is to close the region at a particular server, and since that server is 
> not serving the region, the region is effectively closed there, and we should 
> just warn and eat this exception.
> {code}
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hbase.NotServingRegionException: Received close for 
>  but we are not serving it
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2162)
> at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
> at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> at $Proxy5.closeRegion(Unknown Source)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:165)
> at org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1185)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1302)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1065)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:351)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:370)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3001)
> {code}
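
The "warn and eat" behaviour described above could look roughly like the following. This is a self-contained toy, not the attached patch: ToyNotServingRegionException, ToyRegionServer and ToyHbckRepair are hypothetical stand-ins, and in the real trace the exception arrives wrapped in a RemoteException, which this sketch does not model.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical stand-in for NotServingRegionException. */
final class ToyNotServingRegionException extends IOException {
  ToyNotServingRegionException(String message) {
    super(message);
  }
}

/** Hypothetical view of the server-side close call. */
interface ToyRegionServer {
  void closeRegion(String regionName) throws IOException;
}

final class ToyHbckRepair {
  /**
   * Asks the server to close the region. If the server reports that it is
   * not serving the region, the goal is already met, so only a warning is
   * recorded. Any other IOException still propagates to the caller.
   */
  static void closeRegionTolerantly(ToyRegionServer server, String regionName,
      List<String> warnings) throws IOException {
    try {
      server.closeRegion(regionName);
    } catch (ToyNotServingRegionException e) {
      warnings.add("Region " + regionName + " not served, treating as closed: "
          + e.getMessage());
    }
  }
}
```

Catching only the specific exception type keeps genuine I/O failures visible, which seems closer to the intent stated in the description than swallowing every IOException.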



[jira] [Updated] (HBASE-5835) [hbck] Catch and handle NotServingRegionException when close region attempt fails

2012-09-25 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-5835:
-

Status: Patch Available  (was: Open)

Seems I forgot to click "submit patch"...

> [hbck] Catch and handle NotServingRegionException when close region attempt 
> fails
> -
>
> Key: HBASE-5835
> URL: https://issues.apache.org/jira/browse/HBASE-5835
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.0, 0.90.7, 0.92.2, 0.96.0
>Reporter: Jonathan Hsieh
> Attachments: HBASE-5835.patch
>
>
> Currently, if hbck attempts to close a region and catches a 
> NotServingRegionException, hbck may hang outputting a stack trace.  Since the 
> goal is to close the region at a particular server, and since that server is 
> not serving the region, the region is effectively closed there, and we should 
> just warn and eat this exception.
> {code}
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hbase.NotServingRegionException: Received close for 
>  but we are not serving it
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2162)
> at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
> at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
> at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> at $Proxy5.closeRegion(Unknown Source)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:165)
> at org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1185)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1302)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1065)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:351)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:370)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3001)
> {code}



[jira] [Commented] (HBASE-5954) Allow proper fsync support for HBase

2012-09-25 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462584#comment-13462584
 ] 

Luke Lu commented on HBASE-5954:


Hi Lars,

We just noticed that HDFS-744 did not implement the correct hsync semantics 
(mostly due to HDFS-265) so that the hsync is slower AND (arguably) less 
durable than hflush in Hadoop 1.x.

> Allow proper fsync support for HBase
> 
>
> Key: HBASE-5954
> URL: https://issues.apache.org/jira/browse/HBASE-5954
> Project: HBase
>  Issue Type: Improvement
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 5954-trunk-hdfs-trunk.txt, 5954-trunk-hdfs-trunk-v2.txt, 
> 5954-trunk-hdfs-trunk-v3.txt, 5954-trunk-hdfs-trunk-v4.txt, 
> 5954-trunk-hdfs-trunk-v5.txt, 5954-trunk-hdfs-trunk-v6.txt, hbase-hdfs-744.txt
>
>




[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-25 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462526#comment-13462526
 ] 

nkeywal commented on HBASE-6702:


bq. What is this change?
I've changed the interface of the resource checker, but not yet removed 
ResourceCheckerJUnitRule, so I've just commented the removed methods.

bq. Whats this mean 'migrate the localTests to a newer version of surefire'?
The log lines don't show up with surefire 2.10. They do with my patched 
version. But the localTests profile uses 2.10. It's historical: I've done 
it this way because we use neither categories nor parallelization for localTests.

The v2 should be "ready for commit" and will include your comments. Thanks for 
the review!

> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 6702.v1.patch
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some hadoop qa 
> issues, but has since not been widely utilized. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that are moved into the 
> hbase-common module because bringing the ResourceChecker up to hbase-common 
> would involve bringing all its dependencies (which are quite far reaching).
> The question then is, what should we do with it? Get rid of it? Refactor and 
> reuse? 
