date:20110801


 [ 
https://issues.apache.org/jira/browse/HBASE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-3810:
-

Attachment: HBASE-3810_final.patch

Final patch from review committed to trunk.

> Registering a Coprocessor at HTableDescriptor should be less strict
> ---
>
> Key: HBASE-3810
> URL: https://issues.apache.org/jira/browse/HBASE-3810
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.0
> Environment: all
>Reporter: Joerg Schad
>Assignee: Mingjie Lai
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: HBASE-3810_final.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Registering a Coprocessor in the following way will fail because the "Coprocessor$1" 
> key is case sensitive (COPROCESSOR$1 works fine). Removing this 
> restriction would improve usability.
> HTableDescriptor desc = new HTableDescriptor(tName);
> desc.setValue("Coprocessor$1",
>     path.toString() + ":" + full_class_name +
>     ":" + Coprocessor.Priority.USER);

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
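
A minimal sketch of the relaxed key check the issue asks for, assuming the attribute key is the word "coprocessor", a "$", and a sequence number. The class name, the pattern, and the nine-digit limit are illustrative assumptions, not the code in HBASE-3810_final.patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CoprocessorKeySketch {
    // Hypothetical pattern: "coprocessor", a literal '$', then 1-9 digits.
    // CASE_INSENSITIVE lets "Coprocessor$1" and "COPROCESSOR$1" both match.
    private static final Pattern KEY =
        Pattern.compile("^coprocessor\\$([0-9]{1,9})$", Pattern.CASE_INSENSITIVE);

    /** Returns the sequence number in the key, or -1 if the key is not a coprocessor attribute. */
    static int sequenceOf(String key) {
        Matcher m = KEY.matcher(key.trim());
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }
}
```

With a check like this, the mixed-case key from the description would be accepted instead of being silently ignored.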

[jira] [Resolved] (HBASE-3810) Registering a Coprocessor at HTableDescriptor should be less strict


 [ 
https://issues.apache.org/jira/browse/HBASE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling resolved HBASE-3810.
--

Resolution: Fixed

Committed to trunk.  Thanks for the patch Mingjie.

> Registering a Coprocessor at HTableDescriptor should be less strict
> ---
>
> Key: HBASE-3810
> URL: https://issues.apache.org/jira/browse/HBASE-3810
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.0
> Environment: all
>Reporter: Joerg Schad
>Assignee: Mingjie Lai
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: HBASE-3810_final.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Registering a Coprocessor in the following way will fail because the "Coprocessor$1" 
> key is case sensitive (COPROCESSOR$1 works fine). Removing this 
> restriction would improve usability.
> HTableDescriptor desc = new HTableDescriptor(tName);
> desc.setValue("Coprocessor$1",
>     path.toString() + ":" + full_class_name +
>     ":" + Coprocessor.Priority.USER);


[jira] [Commented] (HBASE-4097) troubleshooting.xml - adding entry for client errors about can't connect to zookeeper

2011-08-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076055#comment-13076055
 ] 

Hudson commented on HBASE-4097:
---

Integrated in HBase-TRUNK #2068 (See 
[https://builds.apache.org/job/HBase-TRUNK/2068/])
HBASE-4097 troubleshooting - adding entry for client errors about can't 
connect to zookeeper

dmeil : 
Files : 
* /hbase/trunk/src/docbkx/troubleshooting.xml


> troubleshooting.xml - adding entry for client errors about can't connect to 
> zookeeper
> -
>
> Key: HBASE-4097
> URL: https://issues.apache.org/jira/browse/HBASE-4097
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: troubleshooting_HBASE_4097.xml.patch
>
>
> There is a specific stack trace that comes up from time to time on the 
> dist-list and it's either because zookeeper is down or unreachable.


[jira] [Commented] (HBASE-4096) book.xml - adding copytables entry in tools-appendix

2011-08-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076056#comment-13076056
 ] 

Hudson commented on HBASE-4096:
---

Integrated in HBase-TRUNK #2068 (See 
[https://builds.apache.org/job/HBase-TRUNK/2068/])
HBASE-4096

dmeil : 
Files : 
* /hbase/trunk/src/docbkx/book.xml


> book.xml - adding copytables entry in tools-appendix
> 
>
> Key: HBASE-4096
> URL: https://issues.apache.org/jira/browse/HBASE-4096
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: book_HBASE_4096.xml.patch
>
>
> This came up on the dist-list recently.  Adding copytables entry to tools 
> appendix.


[jira] [Commented] (HBASE-3810) Registering a Coprocessor at HTableDescriptor should be less strict

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076054#comment-13076054
 ] 

jirapos...@reviews.apache.org commented on HBASE-3810:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1051/#review1262
---

Ship it!

Looks good to me.

+1 if tests pass.

src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java

Nice methods for adding and checking existence.

- Gary

On 2011-07-30 00:22:53, Mingjie Lai wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1051/
bq.  ---
bq.  
bq.  (Updated 2011-07-30 00:22:53)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Registering a Coprocessor at HTableDescriptor should be less strict
bq.  - fix regex to account for whitespace around ":" separators
bq.  - make path portion optional – we already skip the path handling if the class can be loaded by the classloader
bq.  - make priority optional and default to "USER"
bq.  
bq.  At revision 3, added HTableDescriptor.addCoprocessor() for loading a table-level coprocessor.
bq.  
bq.  
bq.  This addresses bug HBase-3810.
bq.  https://issues.apache.org/jira/browse/HBase-3810
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 0641f52 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c2b3558 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java e0bde92 
bq.  
bq.  Diff: https://reviews.apache.org/r/1051/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Tests passed locally. 
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Mingjie
bq.  
bq.
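
The three relaxations in the review summary (whitespace around ":", optional path, optional priority defaulting to "USER") can be sketched as a small parser. This is an illustration under assumptions, not the regex from the patch: the class name is hypothetical, and treating "SYSTEM" or a bare integer as a priority token is assumed here. It splits on ":" and reads fields from the right, so a path that itself contains ":" (a URL, for example) stays intact.

```java
import java.util.Arrays;
import java.util.Locale;

final class CoprocessorSpecSketch {
    final String path;       // null when the value carries no jar path
    final String className;
    final String priority;   // "USER" when the value carries no priority

    private CoprocessorSpecSketch(String path, String className, String priority) {
        this.path = path;
        this.className = className;
        this.priority = priority;
    }

    // Assumption: priorities are the names USER / SYSTEM or a plain integer.
    private static boolean looksLikePriority(String token) {
        String upper = token.toUpperCase(Locale.ROOT);
        return upper.equals("USER") || upper.equals("SYSTEM") || upper.matches("-?[0-9]+");
    }

    static CoprocessorSpecSketch parse(String value) {
        String[] parts = value.split(":", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();          // tolerate whitespace around ':'
        }
        int end = parts.length;
        String priority = "USER";                // default when omitted
        if (end > 1 && looksLikePriority(parts[end - 1])) {
            priority = parts[end - 1].toUpperCase(Locale.ROOT);
            end--;
        }
        String className = parts[end - 1];
        if (className.isEmpty()) {
            throw new IllegalArgumentException("missing class name in: " + value);
        }
        // Everything left of the class name is the (optional) path.
        String path = String.join(":", Arrays.copyOfRange(parts, 0, end - 1));
        return new CoprocessorSpecSketch(path.isEmpty() ? null : path, className, priority);
    }
}
```

Reading from the right is a design choice of this sketch only; it avoids guessing where a path ends when the path contains the separator.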

> Registering a Coprocessor at HTableDescriptor should be less strict
> ---
>
> Key: HBASE-3810
> URL: https://issues.apache.org/jira/browse/HBASE-3810
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.0
> Environment: all
>Reporter: Joerg Schad
>Assignee: Mingjie Lai
>Priority: Minor
> Fix For: 0.92.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Registering a Coprocessor in the following way will fail because the "Coprocessor$1" 
> key is case sensitive (COPROCESSOR$1 works fine). Removing this 
> restriction would improve usability.
> HTableDescriptor desc = new HTableDescriptor(tName);
> desc.setValue("Coprocessor$1",
>     path.toString() + ":" + full_class_name +
>     ":" + Coprocessor.Priority.USER);


[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076045#comment-13076045
]

jirapos...@reviews.apache.org commented on HBASE-4014:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1261
---

Sorry for the belated review.

The additional explicit error handling around coprocessor invocations here is
good, but I think this is still missing part of the intent of the original bug
description. I'm concerned about aborts that happen not just resulting
directly from unhandled exceptions in coprocessor code, but from bad
coprocessor behavior that eventually triggers an abort from within core _HBase_
code. This could be a memory leak in the coprocessor that eventually triggers
an OOME that only shows up in reading in the next RPC request, or something
that corrupts internal state or data structures. As a result, I think it's
equally important that the loaded coprocessor set be logged _within_
HMaster.abort() and HRegionServer.abort(), in all cases. So I don't think
logging from the CoprocessorHost implementations is sufficient.

For HRegionServer at least, since RegionCoprocessorHost is associated at the
HRegion level, this would need a distinct set of all coprocessor classes that
have been loaded in that region server's lifetime, accessible at the top
(server) level. (I thought we discussed this at some point, but can't find
those comments -- maybe I'm hearing voices?) I think a simple singleton would
work, with something like an internal HashSet and an add(String
coprocessorClass) method that's called on coprocessor load. The same would
work on HMaster, though it's not strictly necessary since only a single
MasterCoprocessorHost instance is created at the HMaster level. But
consistency in implementation here is probably good. The number of
CoprocessorHost types we have may increase if new Observer types are added.

I think what's here is good, with only some minor nits on naming. But I think
the above enhancement is pretty critical to add.
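
One way to read the singleton suggestion above, as a sketch only. The class and method names are hypothetical. A ConcurrentSkipListSet is swapped in for the HashSet mentioned in the review so that concurrent loads are safe and the output order is stable, and the description is built with a StringBuilder, as requested for the toString() case.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

final class LoadedCoprocessorRegistry {
    private static final LoadedCoprocessorRegistry INSTANCE = new LoadedCoprocessorRegistry();

    // Every coprocessor class name seen during this server's lifetime.
    private final Set<String> classNames = new ConcurrentSkipListSet<>();

    private LoadedCoprocessorRegistry() {}

    static LoadedCoprocessorRegistry getInstance() {
        return INSTANCE;
    }

    /** Called on coprocessor load; returns false if the class was already recorded. */
    boolean add(String coprocessorClass) {
        return classNames.add(coprocessorClass);
    }

    /** Text suitable for inclusion in an abort message. */
    String describe() {
        StringBuilder sb = new StringBuilder("[");
        for (String name : classNames) {
            if (sb.length() > 1) {
                sb.append(", ");
            }
            sb.append(name);
        }
        return sb.append(']').toString();
    }
}
```

Under this sketch an abort() implementation at the server level could append describe() to its message regardless of which code path triggered the abort.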

src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java

I saw in previous comments that this is needed because SortedCopyOnWriteSet
doesn't implement toString(). So why not make SortedCopyOnWriteSet implement
toString()? Seems cleaner to me and more generic/reusable.

Wherever this is implemented, use a StringBuilder to create the string to
return, not repeated string concatenation.

src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java

Awfully long name, maybe just abortServer()?

src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java

This and the following method names are awfully long. It's just personal
preference, but I like to keep things shorter.

src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java

Minor nit, but why have this as a separate top-level class? Seems like it
could just be an inner class in TestRegionServerCoprocessorException same way
that the BuggyMasterObserver is an inner class in
TestMasterCoprocessorException.

- Gary

On 2011-08-01 22:08:27, Eugene Koontz wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/969/
bq. ---
bq.
bq. (Updated 2011-08-01 22:08:27)
bq.
bq.
bq. Review request for hbase, Gary Helmling and Mingjie Lai.
bq.
bq.
bq. Summary
bq. ---
bq.
bq. https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions
bq.
bq. The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor call inside a
bq.
bq. "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.
bq. block.
bq.
bq. handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).
bq.
bq. The abort message contains a list of the loaded coprocessors for crash analysis.
bq.
bq.
bq. This addresses bug HBASE-4014.
bq. https://issues.apache.org/jira/browse/HBASE-4014
bq.
bq.
bq. Diffs
bq. -
bq.
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
18ba6e7
bq.src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
aa930f5
bq.
src/main/j

[jira] [Commented] (HBASE-3065) Retry all 'retryable' zk operations; e.g. connection loss


[ 
https://issues.apache.org/jira/browse/HBASE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076032#comment-13076032
 ] 

Ted Yu commented on HBASE-3065:
---

In RecoverableZooKeeper, should we make the handling of zero length data in 
appendMetaData() and removeMetaData() symmetrical ?
I mean this change:
{code}
   private byte[] appendMetaData(byte[] data) {
-    if(data == null){
+    if(data == null || data.length == 0){
       return null;
     }
{code}

> Retry all 'retryable' zk operations; e.g. connection loss
> -
>
> Key: HBASE-3065
> URL: https://issues.apache.org/jira/browse/HBASE-3065
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Liyin Tang
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 3065-v3.txt, 3065-v4.txt, HBASE-3065-addendum.patch, 
> HBase-3065[r1088475]_1.patch, hbase3065_2.patch
>
>
> The 'new' master refactored our zk code, tidying up all zk accesses and 
> corralling them behind nice zk utility classes.  One improvement was letting 
> out all KeeperExceptions, letting the client deal.  That's good generally 
> because in the old days, we'd suppress important zk state changes.  But 
> there is at least one case the new zk utility could handle for the 
> application, and that's the class of retryable KeeperExceptions.  The one that 
> comes to mind is connection loss.  On connection loss we should retry the 
> just-failed operation.  Usually the retry will just work.  At worst, on 
> reconnect, we'll pick up the expired session event. 
> Adding in this change shouldn't be too bad given the refactor of zk corralled 
> all zk access into one or two classes only.
> One thing to consider though is how much we should retry.  We could retry on 
> a timer or we could retry forever as long as the Stoppable interface is 
> passed, so if another thread has stopped or aborted the hosting service, we'll 
> notice and give up trying.  Doing the latter is probably better than some 
> kind of timeout.
> HBASE-3062 adds a timed retry on the first zk operation.  This issue is about 
> generalizing what is over there across all zk access.
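
The "retry forever unless stopped" option from the description could look roughly like the following. It is a sketch under assumptions: RetryableException stands in for a retryable KeeperException such as connection loss, the nested Stoppable declares only the isStopped() check the loop needs, and the fixed sleep between attempts is an arbitrary choice.

```java
import java.util.concurrent.Callable;

final class ZkRetrySketch {
    /** Hypothetical stand-in for a retryable ZooKeeper failure such as connection loss. */
    static final class RetryableException extends Exception {
        RetryableException(String message) {
            super(message);
        }
    }

    /** Stand-in for the Stoppable interface named above; only isStopped() is assumed. */
    interface Stoppable {
        boolean isStopped();
    }

    /** Retries op until it succeeds or the hosting service reports that it has stopped. */
    static <T> T retry(Callable<T> op, Stoppable stopper, long sleepMillis) throws Exception {
        RetryableException last = null;
        while (!stopper.isStopped()) {
            try {
                return op.call();
            } catch (RetryableException e) {
                last = e;                    // remember the failure and try again
                Thread.sleep(sleepMillis);
            }
        }
        throw new IllegalStateException("service stopped before the operation succeeded", last);
    }
}
```

Non-retryable exceptions propagate immediately, which matches the idea of still letting the client deal with everything outside the retryable class.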


[jira] [Issue Comment Edited] (HBASE-3065) Retry all 'retryable' zk operations; e.g. connection loss


[ 
https://issues.apache.org/jira/browse/HBASE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076032#comment-13076032
 ] 

Ted Yu edited comment on HBASE-3065 at 8/2/11 4:52 AM:
---

In RecoverableZooKeeper, should we make the handling of zero length data in 
appendMetaData() and removeMetaData() symmetrical ?
I mean this change:
{code}
   private byte[] appendMetaData(byte[] data) {
-    if(data == null){
+    if(data == null || data.length == 0){
       return data;
     }
{code}
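
To illustrate what "symmetrical" means here, a self-contained sketch follows. The byte layout (one magic byte, a four-byte id length, the id, then the payload) and the MAGIC value are assumptions made for the example and are not taken from RecoverableZooKeeper; the point is only that both directions return null or zero-length input unchanged.

```java
import java.nio.ByteBuffer;

final class MetaDataSketch {
    private static final byte MAGIC = (byte) 0xFF;   // hypothetical marker byte
    private final byte[] id;

    MetaDataSketch(byte[] id) {
        this.id = id.clone();
    }

    byte[] appendMetaData(byte[] data) {
        if (data == null || data.length == 0) {
            return data;                             // nothing to wrap
        }
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + id.length + data.length);
        buf.put(MAGIC).putInt(id.length).put(id).put(data);
        return buf.array();
    }

    byte[] removeMetaData(byte[] data) {
        if (data == null || data.length == 0) {
            return data;                             // same guard as appendMetaData
        }
        if (data[0] != MAGIC) {
            return data;                             // written without metadata
        }
        ByteBuffer buf = ByteBuffer.wrap(data);
        buf.get();                                   // skip the magic byte
        int idLength = buf.getInt();
        byte[] payload = new byte[data.length - 1 - 4 - idLength];
        System.arraycopy(data, 1 + 4 + idLength, payload, 0, payload.length);
        return payload;
    }
}
```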

  was (Author: yuzhih...@gmail.com):
In RecoverableZooKeeper, should we make the handling of zero length data in 
appendMetaData() and removeMetaData() symmetrical ?
I mean this change:
{code}
   private byte[] appendMetaData(byte[] data) {
-if(data == null){
+if(data == null || data.length == 0){
   return null;
 }
{code}
  
> Retry all 'retryable' zk operations; e.g. connection loss
> -
>
> Key: HBASE-3065
> URL: https://issues.apache.org/jira/browse/HBASE-3065
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Liyin Tang
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 3065-v3.txt, 3065-v4.txt, HBASE-3065-addendum.patch, 
> HBase-3065[r1088475]_1.patch, hbase3065_2.patch
>
>
> The 'new' master refactored our zk code, tidying up all zk accesses and 
> corralling them behind nice zk utility classes.  One improvement was letting 
> out all KeeperExceptions, letting the client deal.  That's good generally 
> because in the old days, we'd suppress important zk state changes.  But 
> there is at least one case the new zk utility could handle for the 
> application, and that's the class of retryable KeeperExceptions.  The one that 
> comes to mind is connection loss.  On connection loss we should retry the 
> just-failed operation.  Usually the retry will just work.  At worst, on 
> reconnect, we'll pick up the expired session event. 
> Adding in this change shouldn't be too bad given the refactor of zk corralled 
> all zk access into one or two classes only.
> One thing to consider though is how much we should retry.  We could retry on 
> a timer or we could retry forever as long as the Stoppable interface is 
> passed, so if another thread has stopped or aborted the hosting service, we'll 
> notice and give up trying.  Doing the latter is probably better than some 
> kind of timeout.
> HBASE-3062 adds a timed retry on the first zk operation.  This issue is about 
> generalizing what is over there across all zk access.


[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076031#comment-13076031
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1260
---

src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java

Would one such config option be enough instead of two ? My reasoning is 
that debug mode is a cluster wide mode.

We shouldn't mix dot and underscore in the name of config. How about 
hbase.coproc.error.aborts.server ?

When the option is false, we need to avoid flooding the log file with 
LOG.error() calls.

We have two choices for that scenario:

1. use some counter to reduce the frequency of LOG.error()
2. remove the offending coprocessor

The second choice implies a new config option, though, so I would choose 
option 1.

- Ted
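
Option 1 above (a counter that reduces how often the error is logged) might be sketched as follows. The class name and the "log every Nth occurrence" policy are assumptions for illustration; the review does not specify a policy.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ThrottledErrorLog {
    private final long every;
    private final AtomicLong seen = new AtomicLong();

    ThrottledErrorLog(long every) {
        if (every < 1) {
            throw new IllegalArgumentException("every must be >= 1");
        }
        this.every = every;
    }

    /** Returns true for the 1st, (every+1)th, (2*every+1)th, ... occurrence. */
    boolean shouldLog() {
        return seen.getAndIncrement() % every == 0;
    }

    /** Total number of occurrences seen, logged or not. */
    long seen() {
        return seen.get();
    }
}
```

A caller would guard its LOG.error() call with shouldLog(), and could include seen() in the message so suppressed occurrences remain visible.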

On 2011-08-01 22:08:27, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-08-01 22:08:27)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.  https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7 
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c2b3558 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.

> Coprocessors: Flag the presence of coprocessors in logged exceptions
> 
>
> Key: HBASE-4014
> URL: https://issues.apache.org/jira/browse/HBASE-4014
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Reporter: Andrew Purtell
>Assignee: Eugene Koontz
> Fix For: 0.92.0
>
> Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, 
> HBASE-4014.patch
>
>
> For some initial triage of bug reports for core versus for deployments with 
> loaded coprocessors, we need something like the Linux kernel's taint flag, 
> and list of linked in modules that show up in the output of every OOPS, to 
> appear above or below exceptions that appear in the logs.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-08-01 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076029#comment-13076029
 ] 

Jonathan Hsieh commented on HBASE-4148:
---

Sorry about previous post; posted to wrong jira.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.
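
Why the metadata matters can be shown with a small sketch of the culling check. This is not HBase's implementation; the class name is hypothetical and the scan range is assumed to be half-open, [scanMin, scanMax). A writer would call include() for every cell timestamp and persist the resulting range with the file; a scanner can then skip files whose range cannot overlap the requested one.

```java
final class TimeRangeSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Widens the tracked range to cover one more cell timestamp. */
    void include(long timestamp) {
        min = Math.min(min, timestamp);
        max = Math.max(max, timestamp);
    }

    /** True if a file with this range could hold cells in [scanMin, scanMax). */
    boolean mayContain(long scanMin, long scanMax) {
        if (min > max) {
            return false;            // no timestamps were ever recorded
        }
        return min < scanMax && max >= scanMin;
    }
}
```

A file that lacks the recorded range gives the scanner nothing to test, so it cannot be culled this way; that is the gap the issue describes for HFileOutputFormat output.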


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-08-01 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076027#comment-13076027
 ] 

Jonathan Hsieh commented on HBASE-4148:
---

Patrick Hunt suggested using 'apt-get purge' instead of 'apt-get remove'.  This 
seems to have worked. The difference between 'purge' (remove binary + configs) 
and 'remove' (just remove binary) wasn't clear to me until I looked it up. 

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.


[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076022#comment-13076022
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--

bq.  On 2011-08-01 22:52:20, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, 
line 112
bq.  > 
bq.  >
bq.  > Reading description of this JIRA, I don't see abortion as part of 
the requirement.
bq.  > Shall we introduce debug mode for coprocessors. When debug mode is 
enabled, we abort.

Hi Ted,
A configuration option sounds like a good idea to me - what shall we call it?

something like (2 of them, one for master, one for RS) :

hbase.master.abort_on_cp_error
hbase.regionserver.abort_on_cp_error

if true, abort (as in this patch currently)
if false, print LOG.error() with the identical message that's printed by the 
abort.

- Eugene
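
Combining the handler described in the review request with the proposed switch gives roughly the following. It is a sketch, not the patch: the Abortable and ErrorLog interfaces are local stand-ins, and the boolean would come from whichever configuration key is finally chosen.

```java
import java.io.IOException;

final class CoprocessorThrowableHandlerSketch {
    /** Stand-in for the server's abort hook. */
    interface Abortable {
        void abort(String why, Throwable cause);
    }

    /** Stand-in for LOG.error(). */
    interface ErrorLog {
        void error(String message, Throwable cause);
    }

    private final boolean abortOnError;
    private final Abortable server;
    private final ErrorLog log;

    CoprocessorThrowableHandlerSketch(boolean abortOnError, Abortable server, ErrorLog log) {
        this.abortOnError = abortOnError;
        this.server = server;
        this.log = log;
    }

    void handle(Throwable e, String loadedCoprocessors) throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;               // pass along to the client
        }
        String message = "Coprocessor threw " + e.getClass().getName()
            + "; loaded coprocessors: " + loadedCoprocessors;
        if (abortOnError) {
            server.abort(message, e);            // behaviour of the current patch
        } else {
            log.error(message, e);               // identical message, no abort
        }
    }
}
```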

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1256
---

On 2011-08-01 22:08:27, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-08-01 22:08:27)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.  https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7 
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c2b3558 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.

> Coprocessors: Flag the presence of coprocessors in logged exceptions
> 
>
> Key: HBASE-4014
> URL: https://issues.apache.org/jira/browse/HBASE-4014
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Reporter: Andrew Purtell
>Assignee: Eugene Koontz
> Fix For: 0.92.0
>
> Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, 
> HBASE-4014.patch
>
>
> For some initial triage of bug reports for core versus for deployments with 
> loaded coprocessors, we need something like the Linux kernel's taint flag, 
> and list of linked in modules that show up in the output of every OOPS, to 
> appear above or below exceptions that appear in the logs.


[jira] [Commented] (HBASE-3857) Change the HFile Format


[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076009#comment-13076009
 ] 

Ted Yu commented on HBASE-3857:
---

@Mikhail:
If the 2011-08-01_11_34_28 test failures match those in TRUNK build 2067 (18 
failures), that would give us confidence that the patch introduces no regression.

> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, 
> hfile_format_v2_design_draft_0.4.odt
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 


[jira] [Commented] (HBASE-4089) blockCache contents report

2011-08-01 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076004#comment-13076004
 ] 

Andrew Purtell commented on HBASE-4089:
---

bq. Does that mean that BlockCacheSummary returned from BlockCache should 
implement Writable, or is there another class that represents the 
BlockCacheSummary that implements Writable and has the same information?

Objects sent over RPC implement Writable directly, so by that convention 
BlockCacheSummary should implement Writable.
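
As a rough illustration of that convention, the sketch below gives a summary row the two methods that Hadoop's Writable interface declares, write(DataOutput) and readFields(DataInput), without depending on Hadoop so that it stays self-contained. The class name and the field set (table, column family, block count, total bytes) are assumptions drawn from the example report in the issue, not from any attached patch.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

final class BlockCacheSummarySketch {
    String table = "";
    String columnFamily = "";
    int blocks;
    long totalBytes;

    /** Same shape as Writable.write(DataOutput). */
    void write(DataOutput out) throws IOException {
        out.writeUTF(table);
        out.writeUTF(columnFamily);
        out.writeInt(blocks);
        out.writeLong(totalBytes);
    }

    /** Same shape as Writable.readFields(DataInput); fields are read in write order. */
    void readFields(DataInput in) throws IOException {
        table = in.readUTF();
        columnFamily = in.readUTF();
        blocks = in.readInt();
        totalBytes = in.readLong();
    }
}
```

In real code the class would declare that it implements org.apache.hadoop.io.Writable; the method bodies would stay the same.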


> blockCache contents report
> --
>
> Key: HBASE-4089
> URL: https://issues.apache.org/jira/browse/HBASE-4089
> Project: HBase
>  Issue Type: New Feature
>Reporter: Doug Meil
> Attachments: hbase_4089_blockcachereport.pdf
>
>
> Summarized block-cache report for a RegionServer would be helpful.  For 
> example ...
> table1
>   cf1   100 blocks, totalBytes=y, averageTimeInCache= hours
>   cf2   200 blocks, totalBytes=z, averageTimeInCache= hours
> table2
>   cf1  75 blocks, totalBytes=y, averageTimeInCache= hours
>   cf2 150 blocks, totalBytes=z, averageTimeInCache= hours
> ... Etc.
> The current metrics list blockCacheSize and blockCacheFree, but there is no 
> way to know what's in there.  Any single block isn't really important, but 
> the patterns of what CF/Table they came from, how big are they, and how long 
> (on average) they've been in the cache, are important.
> No such interface exists in HRegionInterface.  But I think it would be 
> helpful from an operational perspective.
> Updated (7-29):  Removing suggestion for UI.  I would be happy just to get 
> this report on a configured interval dumped to a log file.


[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of "Region has been PENDING_CLOSE for too long..."

2011-08-01 Thread gaojinchao (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076000#comment-13076000
 ] 

gaojinchao commented on HBASE-4064:
---

Do we need to fix this issue? If so, I will test it; otherwise I will close the 
issue.

> Two concurrent unassigning of the same region caused the endless loop of 
> "Region has been PENDING_CLOSE for too long..."
> 
>
> Key: HBASE-4064
> URL: https://issues.apache.org/jira/browse/HBASE-4064
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.3
>Reporter: Jieshan Bean
> Fix For: 0.90.5
>
> Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, 
> disableflow.png
>
>
> 1. If there is a "rubbish" RegionState object with "PENDING_CLOSE" in 
> regionsInTransition (a RegionState that some exception left behind and that 
> should have been removed, which is why I call it a "rubbish" object), but the 
> region is not currently assigned anywhere, TimeoutMonitor will fall into an 
> endless loop:
> 2011-06-27 10:32:21,326 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> state=PENDING_CLOSE, ts=1309141555301
> 2011-06-27 10:32:21,326 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
> 2011-06-27 10:32:21,438 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> (offlining)
> 2011-06-27 10:32:21,441 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
> not currently assigned anywhere
> 2011-06-27 10:32:31,207 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> state=PENDING_CLOSE, ts=1309141555301
> 2011-06-27 10:32:31,207 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
> 2011-06-27 10:32:31,215 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> (offlining)
> 2011-06-27 10:32:31,215 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
> not currently assigned anywhere
> 2011-06-27 10:32:41,164 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> state=PENDING_CLOSE, ts=1309141555301
> 2011-06-27 10:32:41,164 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
> 2011-06-27 10:32:41,172 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> (offlining)
> 2011-06-27 10:32:41,172 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
> not currently assigned anywhere
> .
> 2. In the following scenario, two concurrent unassign calls on the same 
> region may lead to the above problem:
> The first unassign call sends its RPC successfully. The master sees the 
> "RS_ZK_REGION_CLOSED" event and, while processing it, creates a 
> ClosedRegionHandler to remove the state of the region in the master.
> While ClosedRegionHandler is running in a 
> "hbase.master.executor.closeregion.threads" thread (A), another unassign call 
> on the same region runs in another thread (B).
> When thread B evaluates "if (!regions.containsKey(region))", this.regions 
> still has the region info; the CPU then switches to thread A.
> Thread A removes the region from both "this.regions" and 
> "regionsInTransition", then control switches back to thread B. Thread B 
> continues and throws an exception with the message "Server null returned 
> java.lang.NullPointerException: Passed server is null for 
> 9a6e26d40293663a79523c58315b930f", but without removing the newly added 
> RegionState from "regionsInTransition", so it can never be removed.
>  public void unas
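
The race described above is a check-then-act problem: thread B checks "regions", thread A then clears both maps, and B's failure leaves a stale entry behind. A minimal sketch of one way to avoid it, assuming both paths can share a lock and that a failed unassign rolls back the state it added. This is not the attached patch; UnassignRaceSketch and its methods are hypothetical stand-ins for the AssignmentManager logic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the master's assignment bookkeeping.
class UnassignRaceSketch {
    private final Map<String, String> regions = new HashMap<String, String>();             // region -> server
    private final Map<String, String> regionsInTransition = new HashMap<String, String>(); // region -> state

    synchronized void assign(String region, String server) {
        regions.put(region, server);
    }

    /** Thread A in the report: the close handler drops all state for the region. */
    synchronized void regionClosed(String region) {
        regions.remove(region);
        regionsInTransition.remove(region);
    }

    /** Thread B in the report: the check and the act happen under one lock. */
    synchronized void unassign(String region) {
        if (!regions.containsKey(region)) {
            return; // not assigned anywhere, so there is nothing to close
        }
        boolean added = !regionsInTransition.containsKey(region);
        if (added) {
            regionsInTransition.put(region, "PENDING_CLOSE");
        }
        try {
            sendClose(regions.get(region), region);
        } catch (RuntimeException e) {
            // Roll back so no "rubbish" PENDING_CLOSE entry survives the failure.
            if (added) {
                regionsInTransition.remove(region);
            }
            throw e;
        }
    }

    synchronized boolean inTransition(String region) {
        return regionsInTransition.containsKey(region);
    }

    /** Placeholder for the close RPC; fails the same way the report describes. */
    private void sendClose(String server, String region) {
        if (server == null) {
            throw new NullPointerException("Passed server is null for " + region);
        }
    }
}
```

The lock alone prevents the interleaving in the report; the rollback in the catch block covers any other failure between adding the RegionState and completing the close request.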

[jira] [Commented] (HBASE-4097) troubleshooting.xml - adding entry for client errors about can't connect to zookeeper


[ 
https://issues.apache.org/jira/browse/HBASE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075999#comment-13075999
 ] 

Doug Meil commented on HBASE-4097:
--

Integrated to trunk.
*1152994   8/1/11 9:59 PM   1   dmeil


> troubleshooting.xml - adding entry for client errors about can't connect to 
> zookeeper
> -
>
> Key: HBASE-4097
> URL: https://issues.apache.org/jira/browse/HBASE-4097
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: troubleshooting_HBASE_4097.xml.patch
>
>
> There is a specific stack trace that comes up from time to time on the 
> dist-list, and it occurs because ZooKeeper is either down or unreachable.


[jira] [Updated] (HBASE-4097) troubleshooting.xml - adding entry for client errors about can't connect to zookeeper


 [ 
https://issues.apache.org/jira/browse/HBASE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4097:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> troubleshooting.xml - adding entry for client errors about can't connect to 
> zookeeper
> -
>
> Key: HBASE-4097
> URL: https://issues.apache.org/jira/browse/HBASE-4097
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: troubleshooting_HBASE_4097.xml.patch
>
>
> There is a specific stack trace that comes up from time to time on the 
> dist-list, and it occurs because ZooKeeper is either down or unreachable.


[jira] [Commented] (HBASE-4096) book.xml - adding copytables entry in tools-appendix


[ 
https://issues.apache.org/jira/browse/HBASE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075997#comment-13075997
 ] 

Doug Meil commented on HBASE-4096:
--

Committed to trunk.
*1152993   8/1/11 9:54 PM   1   dmeil   HBASE-4096


> book.xml - adding copytables entry in tools-appendix
> 
>
> Key: HBASE-4096
> URL: https://issues.apache.org/jira/browse/HBASE-4096
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: book_HBASE_4096.xml.patch
>
>
> This came up on the dist-list recently.  Adding copytables entry to tools 
> appendix.


[jira] [Updated] (HBASE-4096) book.xml - adding copytables entry in tools-appendix


 [ 
https://issues.apache.org/jira/browse/HBASE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4096:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> book.xml - adding copytables entry in tools-appendix
> 
>
> Key: HBASE-4096
> URL: https://issues.apache.org/jira/browse/HBASE-4096
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: book_HBASE_4096.xml.patch
>
>
> This came up on the dist-list recently.  Adding copytables entry to tools 
> appendix.


[jira] [Issue Comment Edited] (HBASE-3842) Refactor Coprocessor Compaction API


[ 
https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075985#comment-13075985
 ] 

Gary Helmling edited comment on HBASE-3842 at 8/2/11 12:57 AM:
---

I think the stacking issue is key here:  are we expecting the common case to be 
loading a single "CompactionObserver" that overrides the compaction 
implementation, or loading multiple that each override/customize compaction 
policy but not necessarily behavior?

I agree on the one hand that having a {{KeyValue}} oriented interface for 
{{preCompactWrite()}} and {{postCompactWrite()}} may not be sufficient.  At the 
same time, I don't think we want to force the implementations to write their 
own {{StoreFiles}} though, as that will be massively inefficient -- for N 
loaded coprocessors this becomes N compactions being written (assuming we 
bypass the core compaction code at the end of chaining).

One alternative would be to have {{preCompact}} take the scanner to be used as 
a parameter, as suggested, and return a scanner instance that would allow 
overriding policy and mutating KVs, while still relying on the core writer 
functionality.  This would allow wrapping the store scanner with a custom 
scanner that inspects and emits KVs as needed on the fly.  In this case, 
{{preCompact}} would look like:

{code}
StoreScanner preCompact(ObserverContext<~> context, Store store, StoreScanner 
scanner);
{code}

Wrapping the scanner seems much easier for chaining multiple observers.  On the 
other hand we lose the clean {{boolean}} return to indicate that core 
compaction processing should be skipped.  Are there cases that would still want 
to handle the store file writing portion of the implementation entirely in 
the coprocessor?  If so, can we still emit a flag to skip normal processing 
another way?  We could skip normal processing if {{null}} is returned.  Seems a 
little clunky, but it could work with appropriate documentation.
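
A minimal sketch of the scanner-wrapping idea, assuming a simplified scanner that yields strings instead of KeyValues. KvScanner, CompactionObserver, and dropPrefix are hypothetical stand-ins, not the HBase StoreScanner or observer interfaces; the null return models the "skip normal processing" convention floated above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class CompactionChainSketch {

    /** Stand-in for a store scanner: a stream of key/value strings. */
    interface KvScanner extends Iterator<String> {}

    /** Stand-in for the observer hook; returning null asks the host to skip core compaction. */
    interface CompactionObserver {
        KvScanner preCompact(KvScanner scanner);
    }

    static KvScanner scannerOver(List<String> kvs) {
        final Iterator<String> it = kvs.iterator();
        return new KvScanner() {
            public boolean hasNext() { return it.hasNext(); }
            public String next() { return it.next(); }
        };
    }

    /** Observer that drops KVs with the given prefix by wrapping the incoming scanner. */
    static CompactionObserver dropPrefix(final String prefix) {
        return new CompactionObserver() {
            public KvScanner preCompact(final KvScanner inner) {
                return new KvScanner() {
                    private String pending = advance();

                    private String advance() {
                        while (inner.hasNext()) {
                            String kv = inner.next();
                            if (!kv.startsWith(prefix)) {
                                return kv;
                            }
                        }
                        return null;
                    }

                    public boolean hasNext() { return pending != null; }

                    public String next() {
                        if (pending == null) {
                            throw new NoSuchElementException();
                        }
                        String out = pending;
                        pending = advance();
                        return out;
                    }
                };
            }
        };
    }

    /** Host side: thread the scanner through every observer, then write once. */
    static List<String> compact(List<String> kvs, List<CompactionObserver> observers) {
        KvScanner scanner = scannerOver(kvs);
        for (CompactionObserver observer : observers) {
            scanner = observer.preCompact(scanner);
            if (scanner == null) {
                return null; // an observer took over; skip the core writer
            }
        }
        List<String> written = new ArrayList<String>();
        while (scanner.hasNext()) {
            written.add(scanner.next()); // the single "core writer" pass
        }
        return written;
    }
}
```

With N observers chained this way the data is still read and written once, which is the efficiency argument made above for wrapping rather than having each coprocessor write its own store files.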

  was (Author: ghelmling):
I think the stacking issue is key here:  are we expecting the common case 
to be loading a single "CompactionObserver" that overrides the compaction 
implementation, or loading multiple that each override/customize compaction 
policy but not necessarily behavior?

I agree on the one hand that having a {{KeyValue}} oriented interface for 
{{preCompactWrite()}} and {{postCompactWrite()}} may not be sufficient.  At the 
same time, I don't think we want to force the implementations to write their 
own {{StoreFiles}} though, as that will be massively inefficient -- for N 
loaded coprocessors this becomes N compactions being written (assuming we 
bypass the core compaction code at the end of chaining).

One alternative would be to have {{preCompact}} take the scanner to be used as 
a parameter, as suggested, and return a scanner instance that would allow 
overriding policy and mutating KVs, while still relying on the core writer 
functionality.  This would allow wrapping the store scanner with a custom 
scanner that inspects and emits KVs as needed on the fly.  In this case, 
{{preCompact}} would look like:

{{code}}
StoreScanner preCompact(ObserverContext<~> context, Store store, StoreScanner 
scanner);
{{code}}

Wrapping the scanner seems much easier for chaining multiple observers.  On the 
other hand we lose the clean {{boolean}} return to indicate that core 
compaction processing should be skipped.  Are there cases that would still want 
to handle the store file writing portion of the implementation entirely in 
the coprocessor?  If so, can we still emit a flag to skip normal processing 
another way?  We could skip normal processing if {{null}} is returned.  Seems a 
little clunky, but it could work with appropriate documentation.
  
> Refactor Coprocessor Compaction API
> ---
>
> Key: HBASE-3842
> URL: https://issues.apache.org/jira/browse/HBASE-3842
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors, regionserver
>Affects Versions: 0.92.0
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: compaction
> Fix For: 0.92.0
>
>
> After HBASE-3797, the compaction logic flow has been significantly altered.  
> Because of this, the current compaction coprocessor API is insufficient for 
> gaining full insight into compaction requests/results.  Refactor coprocessor 
> API after HBASE-3797 is committed to be more extensible and increase 
> visibility.


[jira] [Commented] (HBASE-3842) Refactor Coprocessor Compaction API


[ 
https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075985#comment-13075985
 ] 

Gary Helmling commented on HBASE-3842:
--

I think the stacking issue is key here:  are we expecting the common case to be 
loading a single "CompactionObserver" that overrides the compaction 
implementation, or loading multiple that each override/customize compaction 
policy but not necessarily behavior?

I agree on the one hand that having a {{KeyValue}} oriented interface for 
{{preCompactWrite()}} and {{postCompactWrite()}} may not be sufficient.  At the 
same time, I don't think we want to force the implementations to write their 
own {{StoreFiles}} though, as that will be massively inefficient -- for N 
loaded coprocessors this becomes N compactions being written (assuming we 
bypass the core compaction code at the end of chaining).

One alternative would be to have {{preCompact}} take the scanner to be used as 
a parameter, as suggested, and return a scanner instance that would allow 
overriding policy and mutating KVs, while still relying on the core writer 
functionality.  This would allow wrapping the store scanner with a custom 
scanner that inspects and emits KVs as needed on the fly.  In this case, 
{{preCompact}} would look like:

{{code}}
StoreScanner preCompact(ObserverContext<~> context, Store store, StoreScanner 
scanner);
{{code}}

Wrapping the scanner seems much easier for chaining multiple observers.  On the 
other hand we lose the clean {{boolean}} return to indicate that core 
compaction processing should be skipped.  Are there cases that would still want 
to handle the store file writing portion of the implementation entirely in 
the coprocessor?  If so, can we still emit a flag to skip normal processing 
another way?  We could skip normal processing if {{null}} is returned.  Seems a 
little clunky, but it could work with appropriate documentation.

> Refactor Coprocessor Compaction API
> ---
>
> Key: HBASE-3842
> URL: https://issues.apache.org/jira/browse/HBASE-3842
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors, regionserver
>Affects Versions: 0.92.0
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: compaction
> Fix For: 0.92.0
>
>
> After HBASE-3797, the compaction logic flow has been significantly altered.  
> Because of this, the current compaction coprocessor API is insufficient for 
> gaining full insight into compaction requests/results.  Refactor coprocessor 
> API after HBASE-3797 is committed to be more extensible and increase 
> visibility.


[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075983#comment-13075983
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/
---

(Updated 2011-08-02 00:46:57.567666)


Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, 
and Li Pi.


Changes
---

configuration changed to percentage of MaxDirectMemorySize rather than 
specifying size of offHeapCache.
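
A minimal sketch of sizing a cache as a fraction of MaxDirectMemorySize, assuming the limit is read from the -XX:MaxDirectMemorySize JVM argument. The class and method names are hypothetical; this is not the DirectMemoryUtils code in the diff, and the patch's actual configuration key is not shown here.

```java
import java.util.List;
import java.util.Locale;

class OffHeapSizingSketch {

    /** Parses -XX:MaxDirectMemorySize=<n>[k|m|g] from JVM arguments; returns -1 if absent. */
    static long maxDirectMemoryBytes(List<String> jvmArgs) {
        final String flag = "-XX:MaxDirectMemorySize=";
        for (String arg : jvmArgs) {
            if (!arg.startsWith(flag)) {
                continue;
            }
            String value = arg.substring(flag.length()).toLowerCase(Locale.ROOT);
            if (value.isEmpty()) {
                return -1L;
            }
            long multiplier = 1L;
            char unit = value.charAt(value.length() - 1);
            if (unit == 'k') {
                multiplier = 1024L;
            } else if (unit == 'm') {
                multiplier = 1024L * 1024L;
            } else if (unit == 'g') {
                multiplier = 1024L * 1024L * 1024L;
            }
            if (multiplier != 1L) {
                value = value.substring(0, value.length() - 1);
            }
            return Long.parseLong(value) * multiplier;
        }
        return -1L;
    }

    /** Cache size as a fraction of the direct-memory limit; 0 means "off-heap cache disabled". */
    static long offHeapCacheBytes(long maxDirectBytes, float fraction) {
        if (maxDirectBytes <= 0L || fraction <= 0f || fraction > 1f) {
            return 0L;
        }
        return (long) (maxDirectBytes * (double) fraction);
    }
}
```

In a running JVM the argument list can be obtained from ManagementFactory.getRuntimeMXBean().getInputArguments(). Expressing the setting as a fraction keeps the cache size consistent with whatever direct-memory limit the JVM was started with.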


Summary
---

Review request - I apparently can't edit tlipcon's earlier posting of my diff, 
so creating a new one.


This addresses bug HBase-4027.
https://issues.apache.org/jira/browse/HBase-4027


Diffs (updated)
-

  conf/hbase-env.sh 2d55d27 
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 509121d 
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheTestUtils.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ecab7ca 
  src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 150f54f 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java 
PRE-CREATION 
  
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java
 PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e87eb3e 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 9cc75bb 
  src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSlabCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java c4c66e1 

Diff: https://reviews.apache.org/r/1214/diff


Testing
---

Ran benchmarks against it in HBase standalone mode. Wrote test cases for all 
classes, multithreaded test cases exist for the cache.


Thanks,

Li



> Enable direct byte buffers LruBlockCache
> 
>
> Key: HBASE-4027
> URL: https://issues.apache.org/jira/browse/HBASE-4027
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jason Rutherglen
>Assignee: Li Pi
>Priority: Minor
> Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
> HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027v6.diff, 
> slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, 
> slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, 
> slabcachepatchv4.diff
>
>
> Java offers the creation of direct byte buffers which are allocated outside 
> of the heap.
> They need to be manually freed, which can be accomplished using an 
> undocumented {{clean}} method.
> The feature will be optional.  After implementing, we can benchmark for 
> differences in speed and garbage collection observances.
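
A minimal sketch of one way to use direct byte buffers for block storage: allocate a single off-heap slab up front and hand out fixed-size slices, recycling them instead of freeing them. DirectSlabSketch is hypothetical and is not the Slab or SlabCache class from the attached patches.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

class DirectSlabSketch {
    private final Deque<ByteBuffer> free = new ArrayDeque<ByteBuffer>();

    /** Allocates blockSize * numBlocks bytes outside the heap and slices it into blocks. */
    DirectSlabSketch(int blockSize, int numBlocks) {
        ByteBuffer slab = ByteBuffer.allocateDirect(blockSize * numBlocks);
        for (int i = 0; i < numBlocks; i++) {
            slab.limit((i + 1) * blockSize); // raise the limit first, then move the position
            slab.position(i * blockSize);
            free.push(slab.slice());         // a view over bytes [position, limit)
        }
    }

    /** Returns a free block, or null when the slab is exhausted. */
    ByteBuffer alloc() {
        return free.poll();
    }

    /** Returns a block to the pool so it can be reused without a new allocation. */
    void release(ByteBuffer block) {
        block.clear();
        free.push(block);
    }

    int available() {
        return free.size();
    }
}
```

Because blocks are recycled for the lifetime of the slab, this approach sidesteps per-block freeing; releasing the slab itself would still rely on garbage collection or the {{clean}} method mentioned above.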


[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-01 Thread Li Pi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Pi updated HBASE-4027:
-

Attachment: HBase4027v9.diff

configuration changed to percentage of MaxDirectMemorySize.

> Enable direct byte buffers LruBlockCache
> 
>
> Key: HBASE-4027
> URL: https://issues.apache.org/jira/browse/HBASE-4027
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jason Rutherglen
>Assignee: Li Pi
>Priority: Minor
> Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
> HBase-4027.pdf, HBase4027v8.diff, HBase4027v9.diff, hbase-4027v6.diff, 
> slabcachepatch.diff, slabcachepatchv2.diff, slabcachepatchv3.1.diff, 
> slabcachepatchv3.2.diff, slabcachepatchv3.diff, slabcachepatchv4.5.diff, 
> slabcachepatchv4.diff
>
>
> Java offers the creation of direct byte buffers which are allocated outside 
> of the heap.
> They need to be manually freed, which can be accomplished using an 
> undocumented {{clean}} method.
> The feature will be optional.  After implementing, we can benchmark for 
> differences in speed and garbage collection observances.


[jira] [Created] (HBASE-4154) LoadIncrementalHFiles.doBulkLoad should validate the data being loaded

2011-08-01 Thread David Capwell (JIRA)

LoadIncrementalHFiles.doBulkLoad should validate the data being loaded
--

 Key: HBASE-4154
 URL: https://issues.apache.org/jira/browse/HBASE-4154
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Affects Versions: 0.90.3
Reporter: David Capwell


LoadIncrementalHFiles.doBulkLoad currently checks if the HDFS Path matches the 
table you are uploading to but it doesn't validate that the contents of the 
data belong to the table.

This can be an issue if the HFile contains multiple families or invalid 
families.  
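
A minimal sketch of the proposed check, assuming the set of families found in an HFile and the set defined on the table are already known. BulkLoadValidationSketch and its method names are hypothetical; this does not use the LoadIncrementalHFiles API.

```java
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

class BulkLoadValidationSketch {

    /** Families that appear in the file but are not defined on the table, sorted. */
    static SortedSet<String> unknownFamilies(Collection<String> tableFamilies,
                                             Collection<String> familiesInFile) {
        SortedSet<String> unknown = new TreeSet<String>(familiesInFile);
        unknown.removeAll(tableFamilies);
        return unknown;
    }

    /** Rejects a file that mixes families or names a family the table does not have. */
    static void validate(String hfileName, Collection<String> tableFamilies,
                         Collection<String> familiesInFile) {
        if (familiesInFile.size() > 1) {
            throw new IllegalArgumentException(
                hfileName + " contains multiple families: " + new TreeSet<String>(familiesInFile));
        }
        SortedSet<String> unknown = unknownFamilies(tableFamilies, familiesInFile);
        if (!unknown.isEmpty()) {
            throw new IllegalArgumentException(
                hfileName + " contains families not in the table: " + unknown);
        }
    }
}
```

Running a check like this before any file is moved into place would let a bad bulk load fail up front rather than after the data is already in the table.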


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-08-01 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075975#comment-13075975
 ] 

Hudson commented on HBASE-4148:
---

Integrated in HBase-TRUNK #2066 (See 
[https://builds.apache.org/job/HBase-TRUNK/2066/])
HBASE-4148  HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata 
(Jonathan Hsieh)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java


> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.
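
A minimal sketch of why the time-range metadata matters, assuming a writer tracks the minimum and maximum timestamp it sees and a scan covers the half-open range [scanMin, scanMax). TimeRangeSketch is hypothetical and is not the HBase implementation behind TIMERANGE_KEY.

```java
class TimeRangeSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Called by the writer for every cell timestamp it appends. */
    void include(long timestamp) {
        min = Math.min(min, timestamp);
        max = Math.max(max, timestamp);
    }

    /** True if the tracked range overlaps the scan range [scanMin, scanMax). */
    boolean mayContain(long scanMin, long scanMax) {
        if (min > max) {
            return false; // nothing was ever tracked, so the file is empty
        }
        return min < scanMax && max >= scanMin;
    }

    /** A file without metadata cannot be culled and must always be read. */
    static boolean shouldRead(TimeRangeSketch metadataOrNull, long scanMin, long scanMax) {
        return metadataOrNull == null || metadataOrNull.mayContain(scanMin, scanMax);
    }
}
```

The null branch in shouldRead corresponds to files written without the metadata: a time-restricted scan has to open them even when they hold nothing in range.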


[jira] [Updated] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-01 Thread Li Pi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Pi updated HBASE-4027:
-

Attachment: HBase4027v8.diff

Fixed build error, addressed most of Ted Yu/JGray's review. 

> Enable direct byte buffers LruBlockCache
> 
>
> Key: HBASE-4027
> URL: https://issues.apache.org/jira/browse/HBASE-4027
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jason Rutherglen
>Assignee: Li Pi
>Priority: Minor
> Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
> HBase-4027.pdf, HBase4027v8.diff, hbase-4027v6.diff, slabcachepatch.diff, 
> slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, 
> slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff
>
>
> Java offers the creation of direct byte buffers which are allocated outside 
> of the heap.
> They need to be manually freed, which can be accomplished using an 
> undocumented {{clean}} method.
> The feature will be optional.  After implementing, we can benchmark for 
> differences in speed and garbage collection observances.


[jira] [Commented] (HBASE-4027) Enable direct byte buffers LruBlockCache

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075966#comment-13075966
 ] 

jirapos...@reviews.apache.org commented on HBASE-4027:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1214/
---

(Updated 2011-08-01 23:54:08.232314)


Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, Jonathan Gray, 
and Li Pi.


Changes
---

Addressed most of Ted Yu/JGray's comments. Will do another patch later. Fixed 
build errors. Tested patch/build on fresh pull from trunk.


Summary
---

Review request - I apparently can't edit tlipcon's earlier posting of my diff, 
so creating a new one.


This addresses bug HBase-4027.
https://issues.apache.org/jira/browse/HBase-4027


Diffs (updated)
-

  conf/hbase-env.sh 2d55d27 
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java 509121d 
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheTestUtils.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheStats.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/DoubleBlockCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java ecab7ca 
  src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java 150f54f 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SingleSizeCache.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/Slab.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java 
PRE-CREATION 
  
src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabItemEvictionWatcher.java
 PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e87eb3e 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 9cc75bb 
  src/main/java/org/apache/hadoop/hbase/util/DirectMemoryUtils.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSingleSlabCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlab.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/io/hfile/slab/TestSlabCache.java 
PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java c4c66e1 

Diff: https://reviews.apache.org/r/1214/diff


Testing
---

Ran benchmarks against it in HBase standalone mode. Wrote test cases for all 
classes, multithreaded test cases exist for the cache.


Thanks,

Li



> Enable direct byte buffers LruBlockCache
> 
>
> Key: HBASE-4027
> URL: https://issues.apache.org/jira/browse/HBASE-4027
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jason Rutherglen
>Assignee: Li Pi
>Priority: Minor
> Attachments: 4027-v5.diff, 4027v7.diff, HBase-4027 (1).pdf, 
> HBase-4027.pdf, HBase4027v8.diff, hbase-4027v6.diff, slabcachepatch.diff, 
> slabcachepatchv2.diff, slabcachepatchv3.1.diff, slabcachepatchv3.2.diff, 
> slabcachepatchv3.diff, slabcachepatchv4.5.diff, slabcachepatchv4.diff
>
>
> Java offers the creation of direct byte buffers which are allocated outside 
> of the heap.
> They need to be manually freed, which can be accomplished using an 
> undocumented {{clean}} method.
> The feature will be optional.  After implementing, we can benchmark for 
> differences in speed and garbage collection observances.


[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073981#comment-13073981
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1256
---

src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java

Reading the description of this JIRA, I don't see aborting as part of the 
requirement.
Shall we introduce a debug mode for coprocessors? When debug mode is enabled, 
we abort.

- Ted

On 2011-08-01 22:08:27, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-08-01 22:08:27)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the 
presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of 
{Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along 
to the client (if 'e' is an IOException) or, otherwise, aborting the service 
(Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash 
analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.  https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
18ba6e7 
bq.src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 
aa930f5 
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
c2b3558 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java 
PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java
 PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.
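
A minimal sketch of the wrapping pattern described in the quoted summary, assuming the host records an abort reason instead of actually stopping the process. CoprocessorHostSketch and its Hook interface are hypothetical; only the handleCoprocessorThrowable name comes from the review request.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class CoprocessorHostSketch {
    private final List<String> loaded = new ArrayList<String>();
    private String abortReason; // non-null once the host has "aborted"

    /** Stand-in for a coprocessor callback. */
    interface Hook {
        void call() throws Exception;
    }

    void load(String coprocessorName) {
        loaded.add(coprocessorName);
    }

    String abortReason() {
        return abortReason;
    }

    /** Every coprocessor call goes through the same try/catch wrapper. */
    void invoke(Hook hook) throws IOException {
        try {
            hook.call();
        } catch (Throwable t) {
            handleCoprocessorThrowable(t);
        }
    }

    /** IOExceptions go back to the client; anything else aborts and names the loaded coprocessors. */
    void handleCoprocessorThrowable(Throwable t) throws IOException {
        if (t instanceof IOException) {
            throw (IOException) t;
        }
        abortReason = "Aborting on " + t + "; loaded coprocessors: " + loaded;
    }
}
```

The debug-mode idea raised in the review comment could be modeled as a boolean that decides whether the non-IOException branch aborts or only logs; that variant is not shown here.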

> Coprocessors: Flag the presence of coprocessors in logged exceptions
> 
>
> Key: HBASE-4014
> URL: https://issues.apache.org/jira/browse/HBASE-4014
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Reporter: Andrew Purtell
>Assignee: Eugene Koontz
> Fix For: 0.92.0
>
> Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, 
> HBASE-4014.patch
>
>
> For some initial triage of bug reports for core versus for deployments with 
> loaded coprocessors, we need something like the Linux kernel's taint flag, 
> and list of linked in modules that show up in the output of every OOPS, to 
> appear above or below exceptions that appear in the logs.


[jira] [Commented] (HBASE-3857) Change the HFile Format


[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073982#comment-13073982
 ] 

Jean-Daniel Cryans commented on HBASE-3857:
---

TestReplication sometimes fails because of HBASE-3515; if you see that stack 
trace, you can disregard the failure. If not, that's a new bug.

> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, 
> hfile_format_v2_design_draft_0.4.odt
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 


[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-08-01 Thread Mikhail Bautin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073880#comment-13073880
 ] 

Mikhail Bautin commented on HBASE-3857:
---

@Ted:

Yes, you are right, TestReplication also failed with the patch applied while it 
did not fail without the patch. The failure message was as follows:

java.lang.AssertionError: Waited too much time for queueFailover replication

Looking at my archive of unit test results for the trunk, TestReplication 
failure shows up in 67 cases out of 291 total runs of the test suite, so I 
suspect it is a highly problematic unit test. I will look into this a bit more.

However, my question is the following: with the extremely agile development 
process and a non-trivial number of routine unit test failures in the trunk, 
what is the accepted approach to testing new patches? How do people select the 
right revision to test their patch against? Is it always the trunk, with the 
existing unit test failures in the trunk ignored (which is bad because this may 
mask new bugs), or is there a different approach?



[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073822#comment-13073822
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--

bq.  On 2011-06-30 04:23:35, Ted Yu wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java, line 246
bq.  >
bq.  > Please add timeout parameter, e.g.
bq.  > -  @Test
bq.  > +  @Test (timeout=30)

Thanks Ted, added timeouts of 30 seconds (30000 ms) per test in both 
TestMasterCoprocessorException.java and 
TestRegionServerCoprocessorException.java. This amount is sufficient to pass 
all tests on my dev laptop.

- Eugene

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review933
---

On 2011-07-14 23:39:07, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-07-14 23:39:07)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.  https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java 54ccd6f 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c2b3558 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 8ffa086 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 4fa82c0 
bq.    src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 6af0ecf 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 7f19c72 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java 78e7d62 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java c571227 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java fc05e47 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java aa48c22 
bq.    src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java dd5dc3e 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.


[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073821#comment-13073821
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--



bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > Some comments below Eugene.  This thing looks useful and almost done.  Lets get it in!

Thank you Michael for looking at this!


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java, line 54
bq.  >
bq.  > Do we need to add this?  Doesn't every object inherit Object and so have a toString?

Thanks, Michael; you are right. Removed this unnecessary declaration.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java, line 578
bq.  >
bq.  > Do you think this needed Eugene?  Is coprocessors a List?  What if you toString'd it?  Maybe'll do right thing (with square bracket delimiters rather than curly's but that might be ok)

Hi Michael, coprocessors is an o.a.h.h.util.SortedCopyOnWriteSet, whose 
toString() returns:

"org.apache.hadoop.hbase.util.SortedCopyOnWriteSet@4d441b16" (or a similar 
hexadecimal hash code after the @ sign).

Thus the need for a string serialization method like 
"coprocessorSetAsString()". 

Perhaps this should be moved to an (overriding) SortedCopyOnWriteSet's 
toString()? I am happy to make a sub-JIRA to HBASE-4014 if you think so.
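
A minimal sketch of the kind of serialization discussed above, assuming only 
that the collection inherits Object.toString(). PlainSet and setAsString() are 
hypothetical stand-ins for illustration, not the HBase SortedCopyOnWriteSet or 
the patch's coprocessorSetAsString():

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class CoprocessorListing {
    // Joins the class of each loaded element into "{class a.B,class c.D}".
    static String setAsString(Iterable<?> loaded) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Object o : loaded) {
            if (!first) {
                sb.append(",");
            }
            sb.append(o.getClass());
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        PlainSet<Object> loaded = new PlainSet<>();
        loaded.add("a string");
        loaded.add(Integer.valueOf(1));
        System.out.println(loaded);              // inherited Object.toString()
        System.out.println(setAsString(loaded)); // readable listing
    }
}

// Hypothetical stand-in for a collection that does not override toString(),
// so printing it yields ClassName@hexHashCode.
class PlainSet<E> implements Iterable<E> {
    private final List<E> items = new ArrayList<>();

    void add(E item) {
        if (!items.contains(item)) {
            items.add(item);
        }
    }

    @Override
    public Iterator<E> iterator() {
        return items.iterator();
    }
}
```

Overriding toString() on the set itself, as suggested above, would make the 
helper unnecessary; the sketch only shows why the default output is not 
useful in an abort message.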


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 81
bq.  >
bq.  > Whats masterServices?  I think it subclasses Server?  If you do getServerName or something, that'll give you something better than 'master'.  It'll include port and startcode which could be important debugging (more important for the RS case than for Master but could be important if multimasters).

Correct; masterServices is an instance of MasterServices, which subclasses 
Server. 

I think you're right that the port and startcode are important. 

But note that abortServiceWithCoprocessorInfo() *does* show 
serverName.getServerName() in its output (so it shows the port and startcode, 
as you recommend).

On the other hand, getServerName() doesn't show what *role* the server plays: 
is it a master or a regionserver? It seems to me this should be part of the 
abort message, too.

So here's an example of an entire abort message on the master side:

Aborting service: master running on : 192.168.0.136,56238,1312228544878 because 
coprocessor: 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorException$BuggyMasterObserver@1658fe12
 threw an exception: java.lang.NullPointerException. Loaded coprocessors are: 
{class 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorException$BuggyMasterObserver}

How does that look to you?


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 87
bq.  >
bq.  > Nice comment

Thank you! :)


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 112
bq.  >
bq.  > Abort seems radical

Hmm..is it? The point of HBASE-4014 is to catch buggy coprocessors by aborting 
the coprocessor host (the master or regionserver) with as much information as 
possible so that the bug in the coprocessor can be fixed. For example, 
BuggyMasterObserver (in TestMasterCoprocessorException.java) tries to 
dereference a null pointer.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java, line 237
bq.  >
bq.  > The convention is to put } catch { on the one line rather than line break after the } (no biggie)

Thank you; fixed.


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java, line 26
bq.  >
bq.  > We are importing but we don't seem to use the imports, is that so?

You are right: removed unused "import o.a.h.h.coprocessor.Coprocessor;" as well 
as "import o.a.h.h.coprocessor.CoprocessorHost;".


bq.  On 2011-07-27 05:14:19, Michael Stack wrote:
bq.  > 
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCopr

[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073820#comment-13073820
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/
---

(Updated 2011-08-01 22:08:27.772297)


Review request for hbase, Gary Helmling and Mingjie Lai.


Changes
---

Addressed Michael Stack's comments.


Summary
---

https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the 
presence of coprocessors in logged exceptions

The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's 
coprocessor call inside a 

"try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"

block. 

handleCoprocessorThrowable() is responsible for either passing 'e' along to the 
client (if 'e' is an IOException) or, otherwise, aborting the service 
(Regionserver or Master).

The abort message contains a list of the loaded coprocessors for crash analysis.
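
The wrapping described in this summary can be sketched roughly as follows. 
This is an illustration under stated assumptions, not the patch itself: Hook 
and Service are hypothetical stand-ins for a coprocessor call and for the 
hosting master or regionserver, and the abort message wording is invented.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class CoprocessorCallSketch {
    // Hypothetical stand-in for one coprocessor hook invocation.
    interface Hook {
        void run() throws Exception;
    }

    // Hypothetical stand-in for the hosting service (master or regionserver).
    static class Service {
        final List<String> loadedCoprocessors = new ArrayList<>();
        String abortMessage; // stays null until abort() is called

        void abort(String why) {
            abortMessage = why;
        }
    }

    // Wraps a hook call in the try/catch described in the summary.
    static void call(Service service, Hook hook) throws IOException {
        try {
            hook.run();
        } catch (Throwable e) {
            handleCoprocessorThrowable(service, e);
        }
    }

    // IOExceptions are passed along to the caller; anything else aborts the
    // service, and the abort message lists the loaded coprocessors.
    static void handleCoprocessorThrowable(Service service, Throwable e)
            throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;
        }
        service.abort("coprocessor threw an exception: " + e
            + ". Loaded coprocessors are: " + service.loadedCoprocessors);
    }

    public static void main(String[] args) throws IOException {
        Service service = new Service();
        service.loadedCoprocessors.add("BuggyObserver");
        call(service, () -> { throw new NullPointerException(); });
        System.out.println(service.abortMessage);
    }
}
```

In the sketch abort() only records the message so the behavior can be 
inspected; a real host would stop the service at that point.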


This addresses bug HBASE-4014.
https://issues.apache.org/jira/browse/HBASE-4014


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 18ba6e7 
  src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c2b3558 
  src/test/java/org/apache/hadoop/hbase/coprocessor/BuggyRegionObserver.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION 

Diff: https://reviews.apache.org/r/969/diff


Testing
---

patch includes two tests:

TestMasterCoprocessorException.java
TestRegionServerCoprocessorException.java

both tests pass in my build environment.


Thanks,

Eugene




[jira] [Commented] (HBASE-3857) Change the HFile Format


[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073819#comment-13073819
 ] 

Ted Yu commented on HBASE-3857:
---

@Mikhail:
Thanks for your effort in bringing this closer to check-in.

Minor note: Replication also appears only in your second list of failed unit 
tests above.


[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-08-01 Thread Mikhail Bautin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073814#comment-13073814
 ] 

Mikhail Bautin commented on HBASE-3857:
---

The new version of the patch is successfully passing the randomized load test 
as well.

I am trying to compare unit test results between r1152122 and the same version 
with the patch applied.

Without the patch:
2011-08-01_10_55_23 commit: HBASE-4144  RS does not abort if th | tests: 812, 
fail: 4, err: 14, skip: 9, time: 4791.1, failed: FullLogReconstruction, 
DistributedLogSplitting, SplitTransactionOnCluster, ScannerTimeout, 
MasterFailover, MultiParallel

With the patch:
2011-08-01_11_34_28 commit: review_hfile-v2-r1152122-2011_08_01 | tests: 845, 
fail: 5, err: 14, skip: 9, time: 5426.5, failed: FullLogReconstruction, 
DistributedLogSplitting, ServerCustomProtocol, Replication, 
SplitTransactionOnCluster, ScannerTimeout, MasterFailover, MultiParallel

Looking at the ServerCustomProtocol (the only test that has failures with the 
patch but not without), I see that this is a quite frequent intermittent test 
failure in my automated runs of HBase trunk tests.
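
For reference, the comparison of the two "failed:" lists above can be done 
mechanically as a set difference. The sketch below is only an illustration; 
the class and method names are made up, and the test names are copied from 
the two runs quoted above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class NewFailures {
    // Returns the tests that failed with the patch but not on the baseline,
    // in the order they appear in the patched run.
    static Set<String> newFailures(List<String> baseline, List<String> patched) {
        Set<String> result = new LinkedHashSet<>(patched);
        result.removeAll(baseline);
        return result;
    }

    public static void main(String[] args) {
        List<String> withoutPatch = Arrays.asList(
            "FullLogReconstruction", "DistributedLogSplitting",
            "SplitTransactionOnCluster", "ScannerTimeout",
            "MasterFailover", "MultiParallel");
        List<String> withPatch = Arrays.asList(
            "FullLogReconstruction", "DistributedLogSplitting",
            "ServerCustomProtocol", "Replication",
            "SplitTransactionOnCluster", "ScannerTimeout",
            "MasterFailover", "MultiParallel");
        // prints [ServerCustomProtocol, Replication]
        System.out.println(newFailures(withoutPatch, withPatch));
    }
}
```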

Any advice about what version I should select as the baseline for my unit test 
run? The only trunk version that I managed to get a clean unit test run with 
was r1147350, and my patch against that passed all unit tests as well, but I 
think some changes were introduced since then that made the patch not apply 
cleanly, so I rebased the patch to a more recent version (and got all the test 
failures of that version). 

Notwithstanding all the above, I believe the patch is currently in a stable 
state and can be committed, and I will confirm that once a few more unit test 
runs complete.



[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata


[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073811#comment-13073811
 ] 

Ted Yu commented on HBASE-4148:
---

Integrated to branch and TRUNK.

Thanks for the patch Jonathan.
Thanks for the review, Todd and Li.

I ran TestHFileOutputFormat in 0.90 branch and it passed.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.
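
To illustrate why the time range metadata matters, here is a minimal sketch 
of tracking a per-file timestamp range and using it to skip files during a 
time-restricted scan. FileTimeRange and mayContain() are hypothetical names 
for illustration and are not the HBase implementation; the half-open scan 
range [scanMin, scanMax) is also an assumption.

```java
class TimeRangeSketch {
    // Hypothetical tracker for the smallest and largest timestamp in a file.
    static class FileTimeRange {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;

        // Called once per written cell.
        void include(long timestamp) {
            min = Math.min(min, timestamp);
            max = Math.max(max, timestamp);
        }
    }

    // True if the file's [min, max] range overlaps the scan's
    // [scanMin, scanMax) range; a file with no cells never overlaps.
    static boolean mayContain(FileTimeRange file, long scanMin, long scanMax) {
        return file.max >= scanMin && file.min < scanMax;
    }

    public static void main(String[] args) {
        FileTimeRange file = new FileTimeRange();
        file.include(2000L);
        file.include(1000L);
        System.out.println(mayContain(file, 1500L, 3000L)); // overlaps
        System.out.println(mayContain(file, 2001L, 3000L)); // can be skipped
    }
}
```

A file written without this metadata cannot be ruled out this way, so a 
time-restricted scan has to open it regardless.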


[jira] [Commented] (HBASE-3810) Registering a Coprocessor at HTableDescriptor should be less strict

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073799#comment-13073799
 ] 

jirapos...@reviews.apache.org commented on HBASE-3810:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1051/#review1255
---

Ship it!

Minor: the second line below should be removed.
{code}
+// validate parameter kvs
+//String kvString = "";
{code}

- Ted

On 2011-07-30 00:22:53, Mingjie Lai wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1051/
bq.  ---
bq.  
bq.  (Updated 2011-07-30 00:22:53)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Registering a Coprocessor at HTableDescriptor should be less strict
bq.  - fix regex to account for whitespace around ":" separators
bq.  - make path portion optional – we already skip the path handling if the class can be loaded by the classloader
bq.  - make priority optional and default to "USER"
bq.  
bq.  At revision 3, added HTableDescriptor.addCoprocessor() for loading a table-level coprocessor. 
bq.  
bq.  
bq.  This addresses bug HBase-3810.
bq.  https://issues.apache.org/jira/browse/HBase-3810
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 0641f52 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java c2b3558 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java e0bde92 
bq.  
bq.  Diff: https://reviews.apache.org/r/1051/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Tests passed locally. 
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Mingjie
bq.  
bq.

> Registering a Coprocessor at HTableDescriptor should be less strict
> ---
>
> Key: HBASE-3810
> URL: https://issues.apache.org/jira/browse/HBASE-3810
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.0
> Environment: all
>Reporter: Joerg Schad
>Assignee: Mingjie Lai
>Priority: Minor
> Fix For: 0.92.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Registering a Coprocessor in the following way will fail as the "Coprocessor$1" 
> keyword is case sensitive (COPROCESSOR$1 works fine instead). Removing this 
> restriction would improve usability.
> HTableDescriptor desc = new HTableDescriptor(tName);
> desc.setValue("Coprocessor$1",
>path.toString() + ":" + full_class_name +
>  ":" + Coprocessor.Priority.USER);
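
One way to relax the case sensitivity described above is to match the 
attribute key with a case-insensitive pattern. The sketch below is an 
assumption for illustration only; the pattern, class, and method names are 
made up and are not taken from the committed patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CoprocessorKeySketch {
    // Matches "coprocessor$<n>" in any letter case; group 1 is the index.
    private static final Pattern KEY =
        Pattern.compile("^coprocessor\\$([0-9]{1,9})$", Pattern.CASE_INSENSITIVE);

    // Returns the numeric suffix, or -1 if the key is not a coprocessor key.
    static int coprocessorIndex(String key) {
        Matcher m = KEY.matcher(key.trim());
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        System.out.println(coprocessorIndex("Coprocessor$1")); // 1
        System.out.println(coprocessorIndex("COPROCESSOR$1")); // 1
        System.out.println(coprocessorIndex("MAX_FILESIZE"));  // -1
    }
}
```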


[jira] [Commented] (HBASE-3741) Make HRegionServer aware of the regions it's opening/closing


[ 
https://issues.apache.org/jira/browse/HBASE-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073782#comment-13073782
 ] 

Jean-Daniel Cryans commented on HBASE-3741:
---

You're right Stack, opened HBASE-4153.

> Make HRegionServer aware of the regions it's opening/closing
> 
>
> Key: HBASE-3741
> URL: https://issues.apache.org/jira/browse/HBASE-3741
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.1
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Blocker
> Fix For: 0.90.3
>
> Attachments: HBASE-3741-rsfix-v2.patch, HBASE-3741-rsfix-v3.patch, 
> HBASE-3741-rsfix.patch, HBASE-3741-trunk.patch
>
>
> This is a serious issue about a race between regions being opened and closed 
> in region servers. We had this situation where the master tried to unassign a 
> region for balancing, failed, force unassigned it, force assigned it 
> somewhere else, failed to open it on another region server (took too long), 
> and then reassigned it back to the original region server. A few seconds 
> later, the region server processed the first close and the region was left 
> unassigned.
> This is from the master log:
> {quote}
> 2011-04-05 15:11:17,758 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
> Sent CLOSE to serverName=sv4borg42,60020,1300920459477, load=(requests=187, 
> regions=574, usedHeap=3918, maxHeap=6973) for region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:12:10,021 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=PENDING_CLOSE, ts=1302041477758
> 2011-04-05 15:12:10,021 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> ...
> 2011-04-05 15:14:45,783 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
> was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=CLOSED, ts=1302041685733
> 2011-04-05 15:14:45,783 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x42ec2cece810b68 Creating (or updating) unassigned node for 
> 1470298961 with OFFLINE state
> ...
> 2011-04-05 15:14:45,885 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
> region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961;
>  
> plan=hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
>  src=sv4borg42,60020,1300920459477, dest=sv4borg40,60020,1302041218196
> 2011-04-05 15:14:45,885 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  to sv4borg40,60020,1302041218196
> 2011-04-05 15:15:39,410 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=PENDING_OPEN, ts=1302041700944
> 2011-04-05 15:15:39,410 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_OPEN for too long, reassigning 
> region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
> 2011-04-05 15:15:39,410 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
> was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  state=PENDING_OPEN, ts=1302041700944
> ...
> 2011-04-05 15:15:39,410 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
> was found (or we are ignoring an existing plan) for 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  so generated a random one; 
> hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
>  src=, dest=sv4borg42,60020,1300920459477; 19 (online=19, exclude=null) 
> available servers
> 2011-04-05 15:15:39,410 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
>  to sv4borg42,60020,1300920459477
> 2011-04-05 15:15:40,951 DEBUG 
> org.apache.hadoop

[jira] [Created] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

Handle RegionAlreadyInTransitionException in AssignmentManager
--

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
 Fix For: 0.92.0


Comment from Stack over in HBASE-3741:

{quote}
Question: Looking at this patch again, if we throw a 
RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
though RegionAlreadyInTransitionException in at least one case here is saying 
that the region is already open on this regionserver?
{quote}

Indeed, looking at the code, it's going to be handled the same way as other 
exceptions. We need to add special cases for assign and unassign.
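
A rough sketch of what such a special case could look like on the assign 
side. Everything here is hypothetical: the exception class is a local 
stand-in with the same name as the one in the issue, and RegionOpener, 
Outcome, and assign() are invented for illustration rather than taken from 
AssignmentManager.

```java
class AssignSketch {
    // Local stand-in for the exception named in the issue.
    static class RegionAlreadyInTransitionException extends Exception {
        RegionAlreadyInTransitionException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the call that asks a server to open a region.
    interface RegionOpener {
        void open(String server, String region) throws Exception;
    }

    enum Outcome { SENT, ALREADY_IN_TRANSITION, RETRY_ELSEWHERE }

    static Outcome assign(RegionOpener opener, String server, String region) {
        try {
            opener.open(server, region);
            return Outcome.SENT;
        } catch (RegionAlreadyInTransitionException e) {
            // Special case: the server is already handling this region,
            // so do not pick another server.
            return Outcome.ALREADY_IN_TRANSITION;
        } catch (Exception e) {
            // Any other failure is handled as before: try another server.
            return Outcome.RETRY_ELSEWHERE;
        }
    }

    public static void main(String[] args) {
        Outcome outcome = assign(
            (server, region) -> {
                throw new RegionAlreadyInTransitionException("already opening");
            },
            "server-a", "region-1");
        System.out.println(outcome); // ALREADY_IN_TRANSITION
    }
}
```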


[jira] [Updated] (HBASE-4132) Extend the WALActionsListener API to accomodate log archival

2011-08-01 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4132:
--

Summary: Extend the WALActionsListener API to accomodate log archival  
(was: Extend the WALObserver API to accomodate log archival)

> Extend the WALActionsListener API to accomodate log archival
> 
>
> Key: HBASE-4132
> URL: https://issues.apache.org/jira/browse/HBASE-4132
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.92.0
>
> Attachments: walArchive.txt
>
>
> The WALObserver interface exposes the log roll events. It would be nice to 
> extend it to accomodate log archival events as well.


[jira] [Created] (HBASE-4152) Rename o.a.h.h.regionserver.wal.WALObserver to o.a.h.h.regionserver.wal.WALActionsListener

2011-08-01 Thread Andrew Purtell (JIRA)

Rename o.a.h.h.regionserver.wal.WALObserver to 
o.a.h.h.regionserver.wal.WALActionsListener 
---

 Key: HBASE-4152
 URL: https://issues.apache.org/jira/browse/HBASE-4152
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell


Rename o.a.h.h.regionserver.wal.WALObserver to 
o.a.h.h.regionserver.wal.WALActionsListener 


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073687#comment-13073687
 ] 

jirapos...@reviews.apache.org commented on HBASE-4148:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/#review1253
---

Ship it!

- Li

On 2011-08-01 17:54:26, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1229/
bq.  ---
bq.  
bq.  (Updated 2011-08-01 17:54:26)
bq.  
bq.  
bq.  Review request for hbase and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  When HFiles are flushed through the normal path, they include an attribute TIMERANGE_KEY which can be used to cull HFiles when performing a time-restricted scan. Files produced by HFileOutputFormat are currently missing this metadata.
bq.  
bq.  
bq.  This addresses bug HBASE-4148.
bq.  https://issues.apache.org/jira/browse/HBASE-4148
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 3c48d08 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 2f3f5df 
bq.  
bq.  Diff: https://reviews.apache.org/r/1229/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added unit test.  
bq.  
bq.  I don't quite understand why the KeyValue with the larger timestamp (2000) value must be written before the one with the smaller timestamp (1000). I can see the code that enforces this (HFile.checkKey) but not why keys are ordered larger to smaller.  Is this an HFile data precondition?
bq.  
bq.  I cannot get the full test suite to pass, with or without this patch.  The suite seems to time out on tests unrelated to this.  I would appreciate some hints or pointers for info on which tests are flaky or take a long time to run.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jmhsieh
bq.  
bq.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.


[jira] [Created] (HBASE-4151) completebulkload checks zoo.cfg even though ZK ensemble is specified in hbase-site.xml

2011-08-01 Thread Mubarak Seyed (JIRA)

completebulkload checks zoo.cfg even though ZK ensemble is specified in 
hbase-site.xml
--

 Key: HBASE-4151
 URL: https://issues.apache.org/jira/browse/HBASE-4151
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.1
 Environment: HBase-0.90.1
Reporter: Mubarak Seyed


I generated HFiles using importtsv and tried to bulk load them using 
completebulkload. Even though I specified the ZK quorum ensemble and client 
port in hbase-site.xml, completebulkload looks for the ZK ensemble and client 
port in zoo.cfg. Even after I specified the parameters in zoo.cfg, I was 
getting a NullPointerException at line 167 in ZKConfig.java:

{code}
if (conf.get(HConstants.CLUSTER_DISTRIBUTED).equals(HConstants.CLUSTER_IS_DISTRIBUTED)
    && value.startsWith("localhost")) {
{code}
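
A minimal sketch of how a comparison like the one quoted above could be written so that an unset property does not throw. It assumes the NullPointerException comes from conf.get(...) returning null, which the report does not confirm. A plain Map stands in for the HBase Configuration object, and the class name and constants below are illustrative rather than the real ZKConfig or HConstants code.

```java
import java.util.Map;

// Stand-in for the check in ZKConfig; a plain Map replaces the real Configuration (assumption).
class NullSafeClusterCheck {
    // Illustrative constants; the real ones are defined in HConstants.
    static final String CLUSTER_DISTRIBUTED = "hbase.cluster.distributed";
    static final String CLUSTER_IS_DISTRIBUTED = "true";

    static boolean isDistributedWithLocalhost(Map<String, String> conf, String value) {
        // Calling equals() on the constant never dereferences a null lookup result.
        boolean distributed = CLUSTER_IS_DISTRIBUTED.equals(conf.get(CLUSTER_DISTRIBUTED));
        // Guard the quorum value as well, since it may also be missing.
        return distributed && value != null && value.startsWith("localhost");
    }
}
```

With this shape, a configuration that never sets the property simply evaluates to false instead of failing.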


[jira] [Assigned] (HBASE-4149) Javadoc for Result.getRow is confusing to new users.

2011-08-01 Thread Elliott Clark (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark reassigned HBASE-4149:


Assignee: Doug Meil  (was: Elliott Clark)

Adding Doug Meil, as Ted Yu said he owns javadoc.

> Javadoc for Result.getRow is confusing to new users.
> 
>
> Key: HBASE-4149
> URL: https://issues.apache.org/jira/browse/HBASE-4149
> Project: HBase
>  Issue Type: Bug
>Reporter: Elliott Clark
>Assignee: Doug Meil
>Priority: Trivial
>  Labels: docuentation
> Attachments: HBASE-4149-0.patch
>
>
> org.apache.hadoop.hbase.client.Result getRow is confusing to new users.  The 
> documentation could be read to mean the raw data of the row.  In addition it 
> is written with improper grammar.


[jira] [Commented] (HBASE-4089) blockCache contents report


[ 
https://issues.apache.org/jira/browse/HBASE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073638#comment-13073638
 ] 

Doug Meil commented on HBASE-4089:
--

Thanks JD.  I included JMX/Ganglia/et al. because they came up as suggestions 
on the dist-list, but I really didn't see how they would fit with this type of 
usage reporting.  I'm glad you came to the same conclusion!

> blockCache contents report
> --
>
> Key: HBASE-4089
> URL: https://issues.apache.org/jira/browse/HBASE-4089
> Project: HBase
>  Issue Type: New Feature
>Reporter: Doug Meil
> Attachments: hbase_4089_blockcachereport.pdf
>
>
> Summarized block-cache report for a RegionServer would be helpful.  For 
> example ...
> table1
>   cf1   100 blocks, totalBytes=y, averageTimeInCache= hours
>   cf2   200 blocks, totalBytes=z, averageTimeInCache= hours
> table2
>   cf1  75 blocks, totalBytes=y, averageTimeInCache= hours
>   cf2 150 blocks, totalBytes=z, averageTimeInCache= hours
> ... Etc.
> The current metrics list blockCacheSize and blockCacheFree, but there is no 
> way to know what's in there.  Any single block isn't really important, but 
> the patterns of what CF/Table they came from, how big are they, and how long 
> (on average) they've been in the cache, are important.
> No such interface exists in HRegionInterface.  But I think it would be 
> helpful from an operational perspective.
> Updated (7-29):  Removing suggestion for UI.  I would be happy just to get 
> this report on a configured interval dumped to a log file.


[jira] [Updated] (HBASE-3857) Change the HFile Format

2011-08-01 Thread Mikhail Bautin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-3857:
--

Attachment: 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch

Here is a new version of the HFile v2 patch, addressing Todd's comments. This 
is intended to be applied using "git apply" because of the binary file needed 
for TestHFileReaderV1. Please use this patch instead of the one you may 
download from ReviewBoard, because ReviewBoard does not include the binary file 
in the downloaded patch for some reason.

> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> 0001-review_hfile-v2-r1152122-2011_08_01_03_18_00.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, 
> hfile_format_v2_design_draft_0.4.odt
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 


[jira] [Commented] (HBASE-4089) blockCache contents report


[ 
https://issues.apache.org/jira/browse/HBASE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073635#comment-13073635
 ] 

Jean-Daniel Cryans commented on HBASE-4089:
---

Nice document Doug, it puts everyone else to shame :)

I don't think we can expose those metrics through JMX/Ganglia/OpenTSDB as they 
will be changing a lot. It would be "doable" only if the regions and families 
never changed IMO. I'd prefer we concentrate on presenting this information 
from inside HBase.

In the nice to haves I'd like to see:

 - Number of accesses/misses per block or family (could see what's hot, well 
cached, etc)
 - Total size of the family on disk (then you can tell what portion of the 
dataset you cached)

Regarding the Writable question, you have to do that because it's required by 
Hadoop RPC. Since you are adding new info, you'll have to implement it. Don't 
forget the default constructor! :)

For the web UI, what about making the region name clickable?



> blockCache contents report
> --
>
> Key: HBASE-4089
> URL: https://issues.apache.org/jira/browse/HBASE-4089
> Project: HBase
>  Issue Type: New Feature
>Reporter: Doug Meil
> Attachments: hbase_4089_blockcachereport.pdf
>
>
> Summarized block-cache report for a RegionServer would be helpful.  For 
> example ...
> table1
>   cf1   100 blocks, totalBytes=y, averageTimeInCache= hours
>   cf2   200 blocks, totalBytes=z, averageTimeInCache= hours
> table2
>   cf1  75 blocks, totalBytes=y, averageTimeInCache= hours
>   cf2 150 blocks, totalBytes=z, averageTimeInCache= hours
> ... Etc.
> The current metrics list blockCacheSize and blockCacheFree, but there is no 
> way to know what's in there.  Any single block isn't really important, but 
> the patterns of what CF/Table they came from, how big are they, and how long 
> (on average) they've been in the cache, are important.
> No such interface exists in HRegionInterface.  But I think it would be 
> helpful from an operational perspective.
> Updated (7-29):  Removing suggestion for UI.  I would be happy just to get 
> this report on a configured interval dumped to a log file.


[jira] [Updated] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-08-01 Thread Jonathan Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4148:
--

Attachment: 
0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch

xxx.trunk2.patch is a version that can be applied to trunk.  It cleans up a 
conflict marker in a comment that I had missed.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073632#comment-13073632
 ] 

jirapos...@reviews.apache.org commented on HBASE-4148:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/
---

(Updated 2011-08-01 17:54:26.858153)


Review request for hbase and Todd Lipcon.


Changes
---

Cleaned up nit.


Summary
---

When HFiles are flushed through the normal path, they include an attribute 
TIMERANGE_KEY which can be used to cull HFiles when performing a 
time-restricted scan. Files produced by HFileOutputFormat are currently missing 
this metadata.
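
A minimal sketch of the culling idea described in the summary: if a file records the minimum and maximum cell timestamps it holds, a time-restricted scan can skip any file whose range does not overlap the scan's range. The class and method names are hypothetical and are not the HBase API; the scan range is assumed to be half-open, [scanMin, scanMax). A file with no recorded range (the case this issue describes) can never be skipped.

```java
// Hypothetical helper illustrating time-range culling; not HBase's actual classes.
class TimeRangeCull {
    /**
     * @param fileRange {minTimestamp, maxTimestamp} recorded in the file's metadata,
     *                  or null when the file carries no time-range metadata
     * @param scanMin   inclusive lower bound of the scan's time range
     * @param scanMax   exclusive upper bound of the scan's time range
     * @return true if the file has to be read for this scan
     */
    static boolean mustRead(long[] fileRange, long scanMin, long scanMax) {
        if (fileRange == null) {
            return true; // no metadata: the file cannot be ruled out
        }
        // Overlap test between closed [min, max] and half-open [scanMin, scanMax).
        return fileRange[0] < scanMax && fileRange[1] >= scanMin;
    }
}
```

Under these assumptions, a file covering timestamps 1000 to 2000 is skipped by a scan over [3000, 4000) but still read by a scan over [1500, 1600).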


This addresses bug HBASE-4148.
https://issues.apache.org/jira/browse/HBASE-4148


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 3c48d08 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 2f3f5df 

Diff: https://reviews.apache.org/r/1229/diff


Testing
---

Added unit test.  

I don't quite understand why the KeyValue with the larger timestamp (2000) 
must be written before the one with the smaller timestamp (1000). I can 
see the code that enforces this (HFile.checkKey), but not why keys are ordered 
from larger to smaller timestamp.  Is this an HFile data precondition?

I cannot get the full test suite to pass, with or without this patch.  The 
suite seems to time out on tests unrelated to this.  I would appreciate some 
hints or pointers on which tests are flaky or take a long time to run.


Thanks,

jmhsieh



> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk2.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.


[jira] [Commented] (HBASE-4147) StoreFile query usage report


[ 
https://issues.apache.org/jira/browse/HBASE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073627#comment-13073627
 ] 

Doug Meil commented on HBASE-4147:
--

Thanks Todd.  I don't think that is out of scope.  If something like this 
could serve as the general purpose framework to gather log details, then that 
sounds like a good idea.  I'd like to see some examples that would satisfy the 
use cases in my writeup and any other common performance issues that anybody 
else has.

One expectation I would have is that this would be on for 
development/production clusters - the next question is what level of 
summarization is done out of the box in HBase, and (here we go again) where the 
output goes.  



> StoreFile query usage report
> 
>
> Key: HBASE-4147
> URL: https://issues.apache.org/jira/browse/HBASE-4147
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Priority: Minor
> Attachments: hbase_4147_storefilereport.pdf
>
>
> Detailed information on what HBase is doing in terms of reads is hard to come 
> by.
> What would be useful is to have a periodic StoreFile query report.  
> Specifically, this could run on a configured interval (e.g., every 30 
> seconds, 60 seconds) and dump the output to the log files.
> This would have all StoreFiles accessed during the reporting period (and with 
> the Path we would also know region, CF, and table), # of times the StoreFile 
> was accessed, the size of the StoreFile, and the total time (ms) spent 
> processing that StoreFile.
> Even this level of summary would be useful to detect which tables & CFs are 
> being accessed the most, and including the StoreFile would provide insight 
> into relative "uncompaction" (i.e., lots of StoreFiles).
> I think the log-output, as opposed to UI, is an important facet with this.  
> I'm assuming that users will slice and dice this data on their own so I think 
> we should skip any kind of admin view for now (i.e., new JSPs, new APIs to 
> expose this data).  Just getting this to log-file would be a big improvement.
> Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes 
> it will.  However, flying a plane without any instrumentation isn't fun.  :-) 
>  
>  


[jira] [Commented] (HBASE-4148) HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata

2011-08-01 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073619#comment-13073619
 ] 

jirapos...@reviews.apache.org commented on HBASE-4148:
--

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1229/#review1250
---

small nit (conflict marker in the patch)

src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java

conflict marker

- Todd

On 2011-08-01 04:31:42, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1229/
bq.  ---
bq.  
bq.  (Updated 2011-08-01 04:31:42)
bq.  
bq.  
bq.  Review request for hbase and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  When HFiles are flushed through the normal path, they include an attribute
bq.  TIMERANGE_KEY which can be used to cull HFiles when performing a
bq.  time-restricted scan. Files produced by HFileOutputFormat are currently missing
bq.  this metadata.
bq.  
bq.  
bq.  This addresses bug HBASE-4148.
bq.  https://issues.apache.org/jira/browse/HBASE-4148
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java 3c48d08
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java b600020
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java 2f3f5df
bq.  
bq.  Diff: https://reviews.apache.org/r/1229/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Added unit test.  
bq.  
bq.  I don't quite understand why the KeyValue with the larger timestamp (2000)
bq.  must be written before the one with the smaller timestamp (1000). I can
bq.  see the code that enforces this (HFile.checkKey), but not why keys are ordered
bq.  from larger to smaller timestamp. Is this an HFile data precondition?
bq.  
bq.  I cannot get the full test suite to pass, with or without this patch. The
bq.  suite seems to time out on tests unrelated to this. I would appreciate some
bq.  hints or pointers on which tests are flaky or take a long time to run.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  jmhsieh
bq.  
bq.

> HFileOutputFormat doesn't fill in TIMERANGE_KEY metadata
> 
>
> Key: HBASE-4148
> URL: https://issues.apache.org/jira/browse/HBASE-4148
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.patch, 
> 0001-HBASE-4148-HFileOutputFormat-doesn-t-fill-in-TIMERAN.trunk.patch, 
> 0001-HBASE-4148.-HFileOutputFormat-doesn-t-fill-in-TIMERA.patch
>
>
> When HFiles are flushed through the normal path, they include an attribute 
> TIMERANGE_KEY which can be used to cull HFiles when performing a 
> time-restricted scan. Files produced by HFileOutputFormat are currently 
> missing this metadata.


[jira] [Commented] (HBASE-4147) StoreFile query usage report

2011-08-01 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073606#comment-13073606
 ] 

Todd Lipcon commented on HBASE-4147:


Maybe I'm wandering out of scope, but I've been thinking a bit about statistics 
and monitoring as well recently, and I think it might integrate well with this 
work.

The idea is to define various probe points (to use the dtrace terminology) 
throughout the code. Each probe point would have a name and some predefined set 
of arguments. For example, in the HFile code you might have:
{code}
HFile() {
  // Look up the probe point once, at construction time
  this.readTrace = Tracer.get("hfile.read.complete");
}

read() {
  ...
  if (readTrace != null && readTrace.isEnabled()) {
    readTrace.trace(millisSpent, thisHFilePath, blockIdx, ...);
  }
}
{code}

Different consumers interested in this tracing data can then subscribe to the 
trace point, in this case to collect aggregate statistics for each 30-second 
period, though other applications would be useful as well (e.g., dynamically 
attaching a listener to sample some percentage of requests).

The advantage of the above design is that it's flexible and, if off by default, 
should have no performance impact, since the disabled check will basically be 
JITted away.
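
A minimal sketch of what such a probe-point registry might look like, written only to make the idea concrete. Tracer, TraceListener, and every method below are hypothetical; nothing like this is confirmed to exist in HBase. A probe counts as enabled only while at least one listener is subscribed, so the guarded call site does nothing beyond a cheap emptiness check when nobody is listening.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener interface: receives the probe name and its arguments.
interface TraceListener {
    void onTrace(String probe, Object... args);
}

// Hypothetical probe point; one shared instance per probe name.
class Tracer {
    private static final Map<String, Tracer> PROBES = new ConcurrentHashMap<>();

    private final String name;
    // Copy-on-write keeps trace() iteration safe while listeners attach or detach.
    private final List<TraceListener> listeners = new CopyOnWriteArrayList<>();

    private Tracer(String name) {
        this.name = name;
    }

    // Returns the probe registered under this name, creating it on first use.
    static Tracer get(String name) {
        return PROBES.computeIfAbsent(name, Tracer::new);
    }

    // Enabled only while someone is subscribed, so idle probes cost one check.
    boolean isEnabled() {
        return !listeners.isEmpty();
    }

    void subscribe(TraceListener listener) {
        listeners.add(listener);
    }

    void unsubscribe(TraceListener listener) {
        listeners.remove(listener);
    }

    // Delivers the arguments to every current subscriber.
    void trace(Object... args) {
        for (TraceListener listener : listeners) {
            listener.onTrace(name, args);
        }
    }
}
```

A periodic reporter could subscribe a listener that sums the first argument (milliseconds spent) per path and logs the totals every 30 seconds, while a sampler could attach and detach at runtime through the same calls.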

> StoreFile query usage report
> 
>
> Key: HBASE-4147
> URL: https://issues.apache.org/jira/browse/HBASE-4147
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Priority: Minor
> Attachments: hbase_4147_storefilereport.pdf
>
>
> Detailed information on what HBase is doing in terms of reads is hard to come 
> by.
> What would be useful is to have a periodic StoreFile query report.  
> Specifically, this could run on a configured interval (e.g., every 30 
> seconds, 60 seconds) and dump the output to the log files.
> This would have all StoreFiles accessed during the reporting period (and with 
> the Path we would also know region, CF, and table), # of times the StoreFile 
> was accessed, the size of the StoreFile, and the total time (ms) spent 
> processing that StoreFile.
> Even this level of summary would be useful to detect which tables & CFs are 
> being accessed the most, and including the StoreFile would provide insight 
> into relative "uncompaction" (i.e., lots of StoreFiles).
> I think the log-output, as opposed to UI, is an important facet with this.  
> I'm assuming that users will slice and dice this data on their own so I think 
> we should skip any kind of admin view for now (i.e., new JSPs, new APIs to 
> expose this data).  Just getting this to log-file would be a big improvement.
> Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes 
> it will.  However, flying a plane without any instrumentation isn't fun.  :-) 
>  
>  


[jira] [Updated] (HBASE-4147) StoreFile query usage report


 [ 
https://issues.apache.org/jira/browse/HBASE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4147:
-

Attachment: hbase_4147_storefilereport.pdf

Adding document for requirements and 1st pass general design.

> StoreFile query usage report
> 
>
> Key: HBASE-4147
> URL: https://issues.apache.org/jira/browse/HBASE-4147
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Priority: Minor
> Attachments: hbase_4147_storefilereport.pdf
>
>
> Detailed information on what HBase is doing in terms of reads is hard to come 
> by.
> What would be useful is to have a periodic StoreFile query report.  
> Specifically, this could run on a configured interval (e.g., every 30 
> seconds, 60 seconds) and dump the output to the log files.
> This would have all StoreFiles accessed during the reporting period (and with 
> the Path we would also know region, CF, and table), # of times the StoreFile 
> was accessed, the size of the StoreFile, and the total time (ms) spent 
> processing that StoreFile.
> Even this level of summary would be useful to detect which tables & CFs are 
> being accessed the most, and including the StoreFile would provide insight 
> into relative "uncompaction" (i.e., lots of StoreFiles).
> I think the log-output, as opposed to UI, is an important facet with this.  
> I'm assuming that users will slice and dice this data on their own so I think 
> we should skip any kind of admin view for now (i.e., new JSPs, new APIs to 
> expose this data).  Just getting this to log-file would be a big improvement.
> Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes 
> it will.  However, flying a plane without any instrumentation isn't fun.  :-) 
>  
>  


[jira] [Updated] (HBASE-4089) blockCache contents report