[jira] Created: (HBASE-2895) Use some annotations for documentation purposes

2010-08-02 Thread Lars Francke (JIRA)
Use some annotations for documentation purposes
---

 Key: HBASE-2895
 URL: https://issues.apache.org/jira/browse/HBASE-2895
 Project: HBase
  Issue Type: Wish
Reporter: Lars Francke
Priority: Minor


I'd love to use some annotations to document some common things. In particular 
I'd love to see @ThreadSafe and @NotThreadSafe. If not for all classes it would 
at least be nice to have this for all the client classes that a user might see.

http://jcip.net/annotations/doc/index.html
Those can be easily pulled in via Maven.

Should we do it?

Hadoop common also started using InterfaceAudience and InterfaceStability
annotations which might be nice.
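
As a rough illustration of the idea, here is a minimal sketch of how such
annotations might read on client-facing classes. The two annotations are
declared locally as stand-ins for the JCIP ones so the snippet compiles on its
own; their retention and target settings are assumptions made for the demo.
SharedCounter and LocalCounter are hypothetical classes, not HBase code.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.atomic.AtomicLong;

// Local stand-ins for the JCIP annotations (assumption: declared here only
// so the sketch is self-contained; a real project would pull in the jar).
@Documented
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface ThreadSafe {}

@Documented
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface NotThreadSafe {}

// Hypothetical class: safe to share because all state lives in an AtomicLong.
@ThreadSafe
final class SharedCounter {
  private final AtomicLong count = new AtomicLong();

  long increment() {
    return count.incrementAndGet();
  }
}

// Hypothetical class: plain mutable field, so callers must confine it to one thread.
@NotThreadSafe
final class LocalCounter {
  private long count;

  long increment() {
    return ++count;
  }
}

class AnnotationDemo {
  public static void main(String[] args) {
    // The annotations are documentation first; reading them reflectively is optional.
    System.out.println(SharedCounter.class.isAnnotationPresent(ThreadSafe.class));
    System.out.println(LocalCounter.class.isAnnotationPresent(NotThreadSafe.class));
  }
}
```

Because the annotations are marked @Documented, they would also show up in the
generated Javadocs, which is the main point of the proposal.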

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-2608) Switch to SLF4J

2010-08-02 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894647#action_12894647
 ] 

Lars Francke commented on HBASE-2608:
-

I'll finish up my other small issues and then take a stab at this by converting
the necessary config files and checking out the web frontend.

Once the config files are in place, everyone can transparently decide whether
to use the commons-logging API (which should ideally be phased out) or SLF4J.
It doesn't make a difference because the CL jar gets replaced by a wrapper that
forwards all calls to SLF4J anyway. This should make it very easy for everyone
to adapt if you decide to go this way.
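
The forwarding idea can be sketched in plain Java. This is only an
illustration of the wrapper pattern, not the actual bridge jar: LegacyLog,
BackendLogger, ForwardingLog and RecordingBackend are all hypothetical names
invented for this sketch, standing in for the old API, the new backend, and
the wrapper between them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the backend that actually emits log lines.
interface BackendLogger {
  void log(String level, String message);
}

// Hypothetical stand-in for the legacy API that existing code is written against.
interface LegacyLog {
  void info(Object message);

  void warn(Object message);
}

// Wrapper that exposes the legacy API while forwarding every call to the backend.
final class ForwardingLog implements LegacyLog {
  private final BackendLogger backend;

  ForwardingLog(BackendLogger backend) {
    this.backend = backend;
  }

  @Override
  public void info(Object message) {
    backend.log("INFO", String.valueOf(message));
  }

  @Override
  public void warn(Object message) {
    backend.log("WARN", String.valueOf(message));
  }
}

// Backend that records lines in memory so the forwarding can be observed.
final class RecordingBackend implements BackendLogger {
  final List<String> lines = new ArrayList<>();

  @Override
  public void log(String level, String message) {
    lines.add(level + " " + message);
  }
}

class ForwardingDemo {
  public static void main(String[] args) {
    RecordingBackend backend = new RecordingBackend();
    LegacyLog log = new ForwardingLog(backend); // old call sites keep this type
    log.info("region opened");
    log.warn("region slow to open");
    System.out.println(backend.lines);
  }
}
```

Code written against the legacy interface does not change; only the
implementation behind it is swapped, which is why both APIs can coexist during
a migration.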

 Switch to SLF4J
 ---

 Key: HBASE-2608
 URL: https://issues.apache.org/jira/browse/HBASE-2608
 Project: HBase
  Issue Type: Improvement
Reporter: Benoit Sigoure

 There are 2 compelling reasons to switch from log4j to slf4j:
 * HBase provides a client library that is going to be embedded in another 
 application.  Using SLF4J lets the application choose whatever logging library 
 it wants instead of imposing log4j.
 * When using SLF4J, we should use logback by default as it is basically a 
 better, faster, stronger log4j.  Same author, new design / new code.  See 
 http://logback.qos.ch/reasonsToSwitch.html




[jira] Commented: (HBASE-2895) Use some annotations for documentation purposes

2010-08-02 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894703#action_12894703
 ] 

Jonathan Gray commented on HBASE-2895:
--

Is thread-safety relevant for anything besides HTable and HBaseAdmin?  (where 
the answer is sometimes yes sometimes no - therefore we say no)





[jira] Commented: (HBASE-50) Snapshot of table

2010-08-02 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894706#action_12894706
 ] 

HBase Review Board commented on HBASE-50:
-

Message from: Todd Lipcon t...@cloudera.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/467/#review618
---


looks pretty good! I didn't get a chance to look through the test cases in 
detail; I'll try to look them over some more later this week.


src/main/java/org/apache/hadoop/hbase/HConstants.java
http://review.cloudera.org/r/467/#comment2293

since we also have a log archive dir somewhere, should specify this a 
little more - this is archived HFiles that are still referenced by snapshots?



src/main/java/org/apache/hadoop/hbase/HSnapshotDescriptor.java
http://review.cloudera.org/r/467/#comment2294

license



src/main/java/org/apache/hadoop/hbase/HSnapshotDescriptor.java
http://review.cloudera.org/r/467/#comment2295

no need for @param javadoc if there is no actual description attached. same 
thing below in a few places



src/main/java/org/apache/hadoop/hbase/HSnapshotDescriptor.java
http://review.cloudera.org/r/467/#comment2296

why not System.currentTimeMillis?



src/main/java/org/apache/hadoop/hbase/HSnapshotDescriptor.java
http://review.cloudera.org/r/467/#comment2297

empty @return



src/main/java/org/apache/hadoop/hbase/HSnapshotDescriptor.java
http://review.cloudera.org/r/467/#comment2298

since we're using the snapshot name as a directory name in HDFS, it has to 
be a UTF8 string, so why not just keep it as a String above too?



src/main/java/org/apache/hadoop/hbase/TablePartialOpenException.java
http://review.cloudera.org/r/467/#comment2299

no need for this javadoc (it's obvious)



src/main/java/org/apache/hadoop/hbase/TablePartialOpenException.java
http://review.cloudera.org/r/467/#comment2300

same with this one



src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
http://review.cloudera.org/r/467/#comment2301

add TODO to this comment



src/main/java/org/apache/hadoop/hbase/io/Reference.java
http://review.cloudera.org/r/467/#comment2302

to keep compatibility with current storefiles, entire should be value 2, 
and bottom should be 0

while we're at it, maybe rename these to be all caps - Range.TOP, 
Range.BOTTOM, etc
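
A minimal sketch of an enum with explicit, stable codes along the lines
suggested above. BOTTOM = 0 and ENTIRE = 2 come from the comment; TOP = 1 is
an assumption made here to fill the gap, and the code()/fromCode() helpers are
hypothetical, not the existing Reference API.

```java
// Hypothetical sketch: explicit codes keep the on-disk value independent of
// the declaration order, so adding ENTIRE cannot shift existing values.
enum Range {
  BOTTOM(0), // from the review comment
  TOP(1),    // assumption: not stated in the comment
  ENTIRE(2); // from the review comment

  private final int code;

  Range(int code) {
    this.code = code;
  }

  // Value that would be written to a store file.
  int code() {
    return code;
  }

  // Maps a stored value back to its constant; rejects unknown values.
  static Range fromCode(int code) {
    for (Range r : values()) {
      if (r.code == code) {
        return r;
      }
    }
    throw new IllegalArgumentException("Unknown range code: " + code);
  }
}

class RangeDemo {
  public static void main(String[] args) {
    System.out.println(Range.fromCode(Range.ENTIRE.code()));
  }
}
```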



src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java
http://review.cloudera.org/r/467/#comment2303

no need to check size() - iterating the empty array should be fine



src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java
http://review.cloudera.org/r/467/#comment2304

if we crash between step 1 and 2, we orphan the archived file. Instead, we 
can do the delete first (ignoring failure if it doesn't exist) and then update 
META.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/467/#comment2305

you can just call mkdirs, I think, and it won't fail if it already exists
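
For comparison, the same "create if missing, succeed if present" behaviour on
a local filesystem can be sketched with java.nio. This is only an analogue: it
uses Files.createDirectories rather than the Hadoop FileSystem API, and the
class and helper names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class EnsureDir {
  // Creates dir and any missing parents; returns normally if it already exists.
  static Path ensure(Path dir) {
    try {
      return Files.createDirectories(dir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Helper for the demo: a fresh scratch directory on the local filesystem.
  static Path scratch() {
    try {
      return Files.createTempDirectory("snapshot-demo");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path dir = scratch().resolve("archive").resolve("snapshots");
    ensure(dir);
    ensure(dir); // second call finds the directory and does nothing
    System.out.println(Files.isDirectory(dir));
  }
}
```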



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/467/#comment2306

should this be an exception, rather than a result code? ie is it normal to 
fail?



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/467/#comment2309

do we have a race here? what if the table gets enabled while the snapshot 
is being processed? it seems we need some locking here around table status and 
snapshot modification



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/467/#comment2311

shouldn't we rethrow in this error case? and in the above error case? ie 
these should be clauses like:

boolean success = false;
try {
  ... make snapshot ...
  success = true;
} finally {
  if (!success) {
    deleteSnapshot();
  }
}
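
A runnable version of the pattern above, as a sketch. makeSnapshot and
deleteSnapshot here are hypothetical stand-ins that only record what happened;
the point is that the original exception still propagates to the caller while
the cleanup runs in the finally block.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotCleanupDemo {
  // Records the steps taken so the control flow can be inspected.
  static final List<String> events = new ArrayList<>();

  // Hypothetical stand-in for the real snapshot work; fails on request.
  static void makeSnapshot(boolean fail) {
    events.add("make");
    if (fail) {
      throw new IllegalStateException("snapshot failed");
    }
  }

  // Hypothetical stand-in for removing a half-written snapshot.
  static void deleteSnapshot() {
    events.add("delete");
  }

  static void snapshot(boolean fail) {
    boolean success = false;
    try {
      makeSnapshot(fail);
      success = true;
    } finally {
      // Runs on every exit path; cleans up only if the body did not complete.
      if (!success) {
        deleteSnapshot();
      }
    }
  }

  public static void main(String[] args) {
    snapshot(false);
    try {
      snapshot(true);
    } catch (IllegalStateException e) {
      events.add("caller saw: " + e.getMessage());
    }
    System.out.println(events);
  }
}
```

Nothing is caught inside snapshot(), so there is no need for an explicit
rethrow: the failure reaches the caller unchanged.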




src/main/java/org/apache/hadoop/hbase/master/HMaster.java
http://review.cloudera.org/r/467/#comment2313

would it be problematic to create a partially written snapshotinfo file? or 
would it get cleaned up at a higher layer?

(perhaps worth creating snapshotinfo.tmp, then atomically rename it to 
snapshotinfo if it writes correctly)
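
The write-then-rename idea could look roughly like this on a local
filesystem. This sketch uses java.nio rather than the HDFS API, whose rename
semantics may differ; AtomicInfoWriter and its helpers are hypothetical names,
and only the snapshotinfo / snapshotinfo.tmp file names come from the comment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class AtomicInfoWriter {
  // Writes to "<name>.tmp" next to the target, then renames it into place.
  // Readers see either no file or a complete one, never a partial write.
  static void write(Path target, String content) {
    Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
    try {
      Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Reads the whole file back as UTF-8 text.
  static String read(Path file) {
    try {
      return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Helper for the demo: a fresh scratch directory on the local filesystem.
  static Path scratch() {
    try {
      return Files.createTempDirectory("snapshotinfo-demo");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path target = scratch().resolve("snapshotinfo");
    write(target, "table=t1");
    System.out.println(read(target));
  }
}
```

If the write fails midway, only the .tmp file is left behind, which a cleaner
can recognise and delete by its suffix.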



src/main/java/org/apache/hadoop/hbase/master/SnapshotLogCleaner.java
http://review.cloudera.org/r/467/#comment2314

license



src/main/java/org/apache/hadoop/hbase/master/SnapshotLogCleaner.java
http://review.cloudera.org/r/467/#comment2315

worth noting that this class is not thread-safe? I don't know if these 
classes need to be thread safe, but you're using an unsynchronized hashset. 
Also, since refreshHLogsAndSearch clears hlogs before re-adding stuff, it needs 
to be synchronized more than just using a synchronized collection.
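
To illustrate the last point: a synchronized collection makes each individual
call atomic, but not the clear-then-refill sequence as a whole. A minimal
sketch of locking the compound operation follows; HLogTracker and its methods
are hypothetical names, not the SnapshotLogCleaner code under review.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class HLogTracker {
  // Guarded by the intrinsic lock of this object.
  private final Set<String> hlogs = new HashSet<>();

  // Clear and refill under one lock so readers never observe the
  // empty intermediate state between the two steps.
  synchronized void refresh(Collection<String> current) {
    hlogs.clear();
    hlogs.addAll(current);
  }

  // Readers take the same lock, so they see the set before or after a refresh.
  synchronized boolean contains(String name) {
    return hlogs.contains(name);
  }

  synchronized int size() {
    return hlogs.size();
  }

  public static void main(String[] args) {
    HLogTracker tracker = new HLogTracker();
    tracker.refresh(Arrays.asList("hlog.1", "hlog.2"));
    tracker.refresh(Arrays.asList("hlog.3"));
    System.out.println(tracker.contains("hlog.1") + " " + tracker.size());
  }
}
```

With Collections.synchronizedSet alone, another thread could call contains()
between clear() and addAll() and wrongly conclude that a log is unreferenced.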




[jira] Commented: (HBASE-2895) Use some annotations for documentation purposes

2010-08-02 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894709#action_12894709
 ] 

Lars Francke commented on HBASE-2895:
-

To be honest I'm not sure if there are others where it's relevant for now. 
HTablePool would be one more.

I'd be satisfied with _any_ kind of documentation of these things at least for 
those few classes. I guess it can always be added to more classes if needed. 
The annotations are nice to have but a sentence in the Javadocs would suffice 
as well.





[jira] Updated: (HBASE-2871) Make start|stop commands symmetric for Master & Cluster

2010-08-02 Thread Nicolas Spiegelberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg updated HBASE-2871:
---

Attachment: HBASE-2871.patch

The patch changes "stop master" to actually stop just the master; stop-hbase.sh
should be the only quick command to stop the cluster.  Note that you can run
bin/stop-hbase.sh on a node that is not the master.  If it is not run on the
HMaster server, the script will not wait for the HMaster to stop before
proceeding, but it should still shut down the system properly.

 Make start|stop commands symmetric for Master & Cluster
 -

 Key: HBASE-2871
 URL: https://issues.apache.org/jira/browse/HBASE-2871
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Trivial
 Fix For: 0.90.0

 Attachments: HBASE-2871.patch


 Currently, our master & server start/stop commands aren't symmetric.  
 Calling hbase-daemon.sh start master will create a single Master process, 
 but calling hbase-daemon.sh stop master will stop the master + 
 regionservers.  Additionally, running bin/stop-hbase.sh will not work 
 properly if a backup master is currently the primary.  We should modify these 
 commands so they are intuitive/symmetric and let the HBase contributors 
 (instead of users) live with the fact that it might be a little ugly 
 underneath.




[jira] Commented: (HBASE-57) [hbase] Master should allocate regions to regionservers based upon data locality and rack awareness

2010-08-02 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894784#action_12894784
 ] 

Jonathan Gray commented on HBASE-57:


@Jacques, yeah that approach is being considered.  once the master rewrite gets 
merged into trunk (HBASE-2692) it should be possible to retain assignment 
information across a shutdown.  you wouldn't even need zk because you could 
just leave it in META.  it would actually be a fairly simple change once the 
other stuff is in.

And yeah, this solves 80% of the problem.  But I've also written some partial 
code to do stuff by locality; it's not trivial but not too bad.  Perhaps this 
simple solution first and then we can start looking at block locations when we 
extend our notion of load for the next version of load balancing.

 [hbase] Master should allocate regions to regionservers based upon data 
 locality and rack awareness
 ---

 Key: HBASE-57
 URL: https://issues.apache.org/jira/browse/HBASE-57
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.2.0
Reporter: stack
Assignee: Li Chongxin

 Currently, regions are assigned to regionservers based off a basic loading 
 attribute.  A factor to include in the assignment calculation is the location 
 of the region in hdfs; i.e. servers hosting region replicas.  If the cluster 
 is such that regionservers are being run on the same nodes as those running 
 hdfs, then ideally the regionserver for a particular region should be running 
 on the same server that hosts a region replica.
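
 As a very rough sketch of the locality factor described above: given, for one
 region, how many of its HDFS blocks each regionserver's node holds, prefer the
 server holding the most. The input map, the LocalityChooser class and the
 server names are all hypothetical; a real balancer would also weigh load and
 rack awareness, which this sketch ignores.

```java
import java.util.HashMap;
import java.util.Map;

class LocalityChooser {
  // Returns the server holding the most blocks of the region,
  // or null if no candidate is given. Ties are broken arbitrarily.
  static String pick(Map<String, Integer> blocksByServer) {
    String best = null;
    int bestCount = -1;
    for (Map.Entry<String, Integer> e : blocksByServer.entrySet()) {
      if (e.getValue() > bestCount) {
        best = e.getKey();
        bestCount = e.getValue();
      }
    }
    return best;
  }

  public static void main(String[] args) {
    Map<String, Integer> blocks = new HashMap<>();
    blocks.put("rs1.example.org", 2); // hypothetical server names and counts
    blocks.put("rs2.example.org", 7);
    blocks.put("rs3.example.org", 0);
    System.out.println(pick(blocks));
  }
}
```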




[jira] Commented: (HBASE-57) [hbase] Master should allocate regions to regionservers based upon data locality and rack awareness

2010-08-02 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12894787#action_12894787
 ] 

Jonathan Gray commented on HBASE-57:


Filed HBASE-2896 for the simpler solution





[jira] Updated: (HBASE-2896) Retain assignment information between cluster shutdown/startup

2010-08-02 Thread Jonathan Gray (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray updated HBASE-2896:
-

Description: 
Over in HBASE-57 we want to consider block locations for region assignment.  
This is most important during cluster startup where you currently lose all 
locality because regions are assigned randomly.

This jira is about a short-term solution to the cluster startup problem by 
retaining assignment information after a cluster shutdown and using it on the 
next cluster startup.

  was:
Over in HBASE-57 we want to consider block locations for region assignment.  
This is most important during cluster startup where you currently lose all 
locality because regions are assigned normally.

This jira is about a short-term solution to the cluster startup problem by 
retaining assignment information after a cluster shutdown and using it on the 
next cluster startup.


(changed normally to randomly)

 Retain assignment information between cluster shutdown/startup
 --

 Key: HBASE-2896
 URL: https://issues.apache.org/jira/browse/HBASE-2896
 Project: HBase
  Issue Type: New Feature
  Components: master, regionserver
Reporter: Jonathan Gray
 Fix For: 0.90.0


 Over in HBASE-57 we want to consider block locations for region assignment.  
 This is most important during cluster startup where you currently lose all 
 locality because regions are assigned randomly.
 This jira is about a short-term solution to the cluster startup problem by 
 retaining assignment information after a cluster shutdown and using it on the 
 next cluster startup.
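
 A minimal sketch of the retention idea, under the assumption that the previous
 region-to-server mapping is available at startup as a plain map (where it is
 actually stored is not decided here). RetainedAssignment and its parameters
 are hypothetical names. Regions go back to their old server if it is alive,
 and fall back to a random live server otherwise.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class RetainedAssignment {
  // Builds a startup plan: reuse the previous server when it is still alive,
  // otherwise pick a random live server (the current startup behaviour).
  static Map<String, String> assign(List<String> regions,
                                    Map<String, String> previous,
                                    List<String> liveServers,
                                    Random random) {
    if (liveServers.isEmpty()) {
      throw new IllegalArgumentException("no live servers to assign to");
    }
    Map<String, String> plan = new LinkedHashMap<>();
    for (String region : regions) {
      String old = previous.get(region);
      if (old != null && liveServers.contains(old)) {
        plan.put(region, old); // locality is kept
      } else {
        plan.put(region, liveServers.get(random.nextInt(liveServers.size())));
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    Map<String, String> previous = new HashMap<>();
    previous.put("region-a", "rs1"); // hypothetical region and server names
    previous.put("region-b", "rs9"); // rs9 did not come back
    System.out.println(assign(Arrays.asList("region-a", "region-b"),
        previous, Arrays.asList("rs1", "rs2"), new Random()));
  }
}
```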
