[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-03-26 Thread Laxman (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239222#comment-13239222
 ] 

Laxman commented on HBASE-5564:
---

The Findbugs warnings reported by the QA bot concern use of the default 
encoding. This behavior is in line with the existing code.


bug #1
{noformat}
TEST Unknown bug pattern DM_DEFAULT_ENCODING in 
org.apache.hadoop.hbase.mapreduce.ImportTsv$TsvParser$ParsedLine.getTimestamp()
{noformat}

bug #2
{noformat}
TEST Unknown bug pattern DM_DEFAULT_ENCODING in 
org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(Configuration, 
String[])
{noformat}

Bug #2 already exists in the code; it is only included in the patch file with 
no changes.

The test case failures are not caused by this patch. They are to be addressed 
as part of HBASE-5608.
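
For context, Findbugs raises DM_DEFAULT_ENCODING when code converts between 
bytes and strings without naming a charset, so the result depends on the 
platform default. A minimal sketch of the difference follows; the class and 
method names are hypothetical and are not taken from ImportTsv or the patch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not code from ImportTsv or the attached patch.
class EncodingExample {
    // Flagged pattern: decodes with whatever the JVM default charset is.
    static long parseTimestampDefault(byte[] buf, int offset, int length) {
        return Long.parseLong(new String(buf, offset, length));
    }

    // Explicit charset: decodes the same way on every platform.
    static long parseTimestampUtf8(byte[] buf, int offset, int length) {
        return Long.parseLong(new String(buf, offset, length, StandardCharsets.UTF_8));
    }
}
```

For ASCII digits both variants usually agree, which is one reason such code 
tends to go unnoticed until it runs under an unusual default charset.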

> Bulkload is discarding duplicate records
> 
>
> Key: HBASE-5564
> URL: https://issues.apache.org/jira/browse/HBASE-5564
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
> Environment: HBase 0.92
>Reporter: Laxman
>Assignee: Laxman
>  Labels: bulkloader
> Fix For: 0.96.0
>
> Attachments: 5564.lint, HBASE-5564_trunk.1.patch, 
> HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.patch
>
>
> Duplicate records are discarded when they exist in the same input file, and 
> more specifically when they exist in the same split.
> Duplicate records are still considered if the records are from different 
> splits.
> Version under test: HBase 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239216#comment-13239216
 ] 

Hudson commented on HBASE-5641:
---

Integrated in HBase-TRUNK-security #151 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/151/])
HBASE-5641 decayingSampleTick1 prevents HBase from shutting down. (Revision 
1305722)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java


> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}
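
As a rough illustration of the point above, the sketch below builds a 
scheduled executor whose worker thread is a daemon and therefore does not 
keep the JVM alive at shutdown. The class and the `daemonFactory` helper are 
hypothetical; the actual fix is in the attached 5641.txt and may differ.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the change committed for HBASE-5641.
class DaemonTickExample {
    // Builds a factory whose threads are daemons, so they do not block JVM exit.
    static ThreadFactory daemonFactory(final String prefix) {
        final AtomicInteger counter = new AtomicInteger(1);
        return new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, prefix + counter.getAndIncrement());
                t.setDaemon(true);
                return t;
            }
        };
    }

    // Single-threaded scheduler backed by the daemon factory above.
    static ScheduledExecutorService newTickService() {
        return Executors.newScheduledThreadPool(1, daemonFactory("decayingSampleTick"));
    }
}
```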





[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239215#comment-13239215
 ] 

Hudson commented on HBASE-5209:
---

Integrated in HBase-TRUNK-security #151 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/151/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305661)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerName.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> HConnection/HMasterInterface should allow for way to get hostname of 
> currently active master in multi-master HBase setup
> 
>
> Key: HBASE-5209
> URL: https://issues.apache.org/jira/browse/HBASE-5209
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.90.5, 0.92.0, 0.94.0
>Reporter: Aditya Acharya
>Assignee: David S. Wang
> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5209.addendum, HBASE_5209_v5.diff
>
>
> I have a multi-master HBase set up, and I'm trying to programmatically 
> determine which of the masters is currently active. But the API does not 
> allow me to do this. There is a getMaster() method in the HConnection class, 
> but it returns an HMasterInterface, whose methods do not allow me to find out 
> which master won the last race. The API should have a 
> getActiveMasterHostname() or something to that effect.





[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239213#comment-13239213
 ] 

Hudson commented on HBASE-5623:
---

Integrated in HBase-TRUNK-security #151 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/151/])
HBASE-5623 Race condition when rolling the HLog and hlogFlush (Enis 
Soztutar and LarsH) (Revision 1305556)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollingNoCluster.java


> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() ho

[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239214#comment-13239214
 ] 

Hudson commented on HBASE-5596:
---

Integrated in HBase-TRUNK-security #151 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/151/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305661)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerName.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5596.patch, hbase-5596-0.94.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Commented] (HBASE-5533) Add more metrics to HBase

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239212#comment-13239212
 ] 

Hudson commented on HBASE-5533:
---

Integrated in HBase-TRUNK-security #151 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/151/])
HBASE-5533 Add more metrics to HBase (Revision 1305499)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/ExactCounterMetric.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/histogram
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/histogram/MetricsHistogram.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Sample.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Snapshot.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/histogram/UniformSample.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/metrics/TestExactCounterMetric.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/metrics/TestExponentiallyDecayingSample.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/metrics/TestMetricsHistogram.java


> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.2, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
> hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
> hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files
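
To make the histogram request concrete, here is a small sketch that derives 
percentiles such as "90% of reads completed in under N ms" from a set of 
latency samples using the nearest-rank method. The `LatencyPercentiles` class 
is hypothetical and is not the MetricsHistogram implementation from the patch, 
which works on a decaying sample rather than a plain array.

```java
import java.util.Arrays;

// Hypothetical helper; not the MetricsHistogram class from the patch.
class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```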





[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239211#comment-13239211
 ] 

Hudson commented on HBASE-5190:
---

Integrated in HBase-TRUNK-security #151 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/151/])
HBASE-5190 Limit the IPC queue size based on calls' payload size
   (Ted's addendum) (Revision 1305468)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java


> Limit the IPC queue size based on calls' payload size
> -
>
> Key: HBASE-5190
> URL: https://issues.apache.org/jira/browse/HBASE-5190
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.5
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5190.addendum, HBASE-5190-v2.patch, HBASE-5190-v3.patch, 
> HBASE-5190.patch
>
>
> Currently we limit the number of calls in the IPC queue only on their count. 
> It used to be really high and was dropped down recently to num_handlers * 10 
> (so 100 by default) because it was easy to OOME yourself when huge calls were 
> being queued. It's still possible to hit this problem if you use really big 
> values and/or a lot of handlers, so the idea is that we should take into 
> account the payload size. I can see 3 solutions:
>  - Do the accounting outside of the queue itself for all calls coming in and 
> out and when a call doesn't fit, throw a retryable exception.
>  - Same accounting but instead block the call when it comes in until space is 
> made available.
>  - Add a new parameter for the maximum size (in bytes) of a Call and then set 
> the size of the IPC queue (in terms of the number of items) so that it could 
> only contain as many items as some predefined maximum size (in bytes) for the 
> whole queue.
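
The first option listed above could look roughly like the following sketch: 
a byte budget kept outside the queue, where a call that does not fit is 
rejected so the server can answer with a retryable exception. All names here 
are hypothetical and this is not the code from the attached patches.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the first option; names are illustrative, not HBase's.
class CallQueueBudget {
    private final long maxBytes;
    private final AtomicLong queuedBytes = new AtomicLong();

    CallQueueBudget(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Reserves space for an incoming call. Returns false if the call does not
    // fit, in which case the caller could throw a retryable exception.
    boolean tryReserve(long callBytes) {
        while (true) {
            long current = queuedBytes.get();
            long next = current + callBytes;
            if (next > maxBytes) {
                return false;
            }
            if (queuedBytes.compareAndSet(current, next)) {
                return true;
            }
        }
    }

    // Releases the space once a handler has taken the call off the queue.
    void release(long callBytes) {
        queuedBytes.addAndGet(-callBytes);
    }

    long queuedBytes() {
        return queuedBytes.get();
    }
}
```

The second option would replace the `return false` branch with blocking until 
`release` frees enough space.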





[jira] [Updated] (HBASE-4465) Lazy-seek optimization for StoreFile scanners

2012-03-26 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4465:
-

Release Note: Check the most recent file first before seeking all other 
files in a Store.
Hadoop Flags: Reviewed

Thanks Mikhail.

> Lazy-seek optimization for StoreFile scanners
> -
>
> Key: HBASE-4465
> URL: https://issues.apache.org/jira/browse/HBASE-4465
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>  Labels: optimization, seek
> Fix For: 0.89.20100924, 0.94.0
>
> Attachments: 
> HBASE-4465_Lazy-seek_optimization_for_St-20111005121052-b2ea8753.patch
>
>
> Previously, if we had several StoreFiles for a column family in a region, we 
> would seek in each of them and only then merge the results, even though the 
> row/column we are looking for might only be in the most recent (and the 
> smallest) file. Now we prioritize our reads from those files so that we check 
> the most recent file first. This is done by doing a "lazy seek" which 
> pretends that the next value in the StoreFile is (seekRow, seekColumn, 
> lastTimestampInStoreFile), which is earlier in the KV order than anything 
> that might actually occur in the file. So if we don't find the result in 
> earlier files, that fake KV will bubble up to the top of the KV heap and a 
> real seek will be done. This is expected to significantly reduce the amount 
> of disk IO (as of 09/22/2011 we are doing dark launch testing and 
> measurement).
> This is joint work with Liyin Tang -- huge thanks to him for many helpful 
> discussions on this and the idea of putting fake KVs with the highest 
> timestamp of the StoreFile in the scanner priority queue.
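
The lazy-seek idea described above can be sketched with a toy model: each 
file first enters the heap as a fake entry carrying the file's highest 
timestamp, and a real seek happens only when that fake entry reaches the top. 
Everything below (`ToyFile`, `Entry`, `get`) is hypothetical and greatly 
simplified, with one cell per key and timestamps as the only ordering; it is 
not the StoreFile scanner code from the patch.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Toy model only; none of these names come from the HBASE-4465 patch.
class LazySeekExample {
    // Stand-in for a StoreFile: at most one {timestamp, value} per key.
    static final class ToyFile {
        final long maxTimestamp;
        final Map<String, long[]> cells = new HashMap<>();
        int realSeeks = 0;

        ToyFile(long maxTimestamp) {
            this.maxTimestamp = maxTimestamp;
        }

        ToyFile put(String key, long timestamp, long value) {
            cells.put(key, new long[] {timestamp, value});
            return this;
        }
    }

    // Heap entry: either a fake placeholder or a cell found by a real seek.
    static final class Entry {
        final long timestamp;
        final ToyFile file;
        final boolean fake;
        final long value;

        Entry(long timestamp, ToyFile file, boolean fake, long value) {
            this.timestamp = timestamp;
            this.file = file;
            this.fake = fake;
            this.value = value;
        }
    }

    // Returns the newest value stored for key, or null if no file has it.
    static Long get(List<ToyFile> files, String key) {
        // Highest timestamp first, mirroring KV order within one row/column.
        PriorityQueue<Entry> heap = new PriorityQueue<>(
            Comparator.comparingLong((Entry e) -> e.timestamp).reversed());
        for (ToyFile f : files) {
            heap.add(new Entry(f.maxTimestamp, f, true, 0L));
        }
        while (!heap.isEmpty()) {
            Entry top = heap.poll();
            if (!top.fake) {
                return top.value; // a real cell outranks every remaining file
            }
            top.file.realSeeks++; // the fake entry surfaced: do the real seek
            long[] cell = top.file.cells.get(key);
            if (cell != null) {
                heap.add(new Entry(cell[0], top.file, false, cell[1]));
            }
        }
        return null;
    }
}
```

In this model, a key found in the most recent file is returned after a single 
real seek, and the older files are never touched.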





[jira] [Commented] (HBASE-5619) Create PB protocols for HRegionInterface

2012-03-26 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239205#comment-13239205
 ] 

jirapos...@reviews.apache.org commented on HBASE-5619:
--



bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 21
bq.  > 
bq.  >
bq.  > What did we figure on the package name?  Shouldn't it agree w/ the 
dir that holds the .proto files at src/main?  Currently one is protobuf and the 
other is proto.
bq.  
bq.  Jimmy Xiang wrote:
bq.  There is already an existing folder called protobuf (rest).  Let me 
change the dir that holds the .proto files under src/main to protobuf.

good


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 35
bq.  > 
bq.  >
bq.  > Why are these two booleans broken out?  Aren't they in the 
attributes of HRI already?  Why repeat them?
bq.  
bq.  Jimmy Xiang wrote:
bq.  I used to put these transient parameters in the protobuf RegionInfo 
as well.  However Todd thought it's better to put them outside.
bq.  
bq.  What do you think?  To me, it is fine either way.  However, if we 
are going to replace HRI with the protobuf later on, it may be better to put 
them together.

Hmm.  Moving out these flags changes the current 'model' but in a direction we 
should be headed in.  The split/offline stuff was stuffed into HRI in the 
first place just because this was an easy way to pass these transient states 
around in hbase; they also are less important now in HRI though still depended 
on when we scan meta IIRC.  It's probably better to evolve toward an HRI that is 
immutable once made.  So I'd be down w/ moving them out but it's up to you.  It 
might be easier on you achieving parity w/ first commit of pb work if the pb 
classes are like the internals they will be feeding into.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 70
bq.  > 
bq.  >
bq.  > Should we be repeating the API documentation that is up in the 
HRegionInterface that this .proto replaces here?  If it's not here, where will 
it be?  Not all of the javadoc should make it over -- the stuff that says 
nothing shouldn't but some is of worth.  What do you think?
bq.  
bq.  Jimmy Xiang wrote:
bq.  I thought about this.  I added documentation to RegionClient.proto.  
For methods in RegionAdmin, the method names seem to be very clear.  I will 
take a look again and document where confusion/misunderstanding could arise.

I'd be fine w/ the doc being sparse but there are a few cases where doc is 
necessary; e.g. the one I cite above.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 87
bq.  > 
bq.  >
bq.  > Isn't the response currently a void?   And isn't flush async (IIRC). 
 If so, under what circumstance would we be able to fill out this response?
bq.  
bq.  Jimmy Xiang wrote:
bq.  I was thinking to set it if the HRegion.flushCache call returned true. 
 It is just informational.

Does that imply a synchronous call?  I thought flushCache just queued the flush 
then returned to the client w/o actually waiting on flush to complete.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionClient.proto, line 52
bq.  > 
bq.  >
bq.  > This is new?  Being able to do this?  How will it be used?
bq.  
bq.  Jimmy Xiang wrote:
bq.  Some calls expect a region name, some calls expect an encoded region 
name.  With a specifier, we can handle both.
bq.  Encoded region name works only if the region is on-line.  If we can't 
find the region based on the specifier, a region not found exception will be 
thrown.
bq.  
bq.  So we can simplify the request a little bit since we don't have to ask 
for region name, and encoded region name, and check only if one is specified.
bq.  
bq.  I will add a comment for it.

ok


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionClient.proto, line 66
bq.  > 
bq.  >
bq.  > This is new feature on get?  Or just special handling of an 
attribute?
bq.  
bq.  Jimmy Xiang wrote:
bq.  This is for the exist() call.  In this case, the caller doesn't care 
about the result. They just want to know if the row is there.  It is not 
special handling of an attribute.

The current implementation actually does the fetch and in the client checks 
whether it is null or not IIRC?  Or is it all serverside?  So you add

[jira] [Commented] (HBASE-4465) Lazy-seek optimization for StoreFile scanners

2012-03-26 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239203#comment-13239203
 ] 

Mikhail Bautin commented on HBASE-4465:
---

@Stack: yes, this feature is on by default, because it has the same or better 
performance as before in all cases.


> Lazy-seek optimization for StoreFile scanners
> -
>
> Key: HBASE-4465
> URL: https://issues.apache.org/jira/browse/HBASE-4465
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>  Labels: optimization, seek
> Fix For: 0.89.20100924, 0.94.0
>
> Attachments: 
> HBASE-4465_Lazy-seek_optimization_for_St-20111005121052-b2ea8753.patch
>
>
> Previously, if we had several StoreFiles for a column family in a region, we 
> would seek in each of them and only then merge the results, even though the 
> row/column we are looking for might only be in the most recent (and the 
> smallest) file. Now we prioritize our reads from those files so that we check 
> the most recent file first. This is done by doing a "lazy seek" which 
> pretends that the next value in the StoreFile is (seekRow, seekColumn, 
> lastTimestampInStoreFile), which is earlier in the KV order than anything 
> that might actually occur in the file. So if we don't find the result in 
> earlier files, that fake KV will bubble up to the top of the KV heap and a 
> real seek will be done. This is expected to significantly reduce the amount 
> of disk IO (as of 09/22/2011 we are doing dark launch testing and 
> measurement).
> This is joint work with Liyin Tang -- huge thanks to him for many helpful 
> discussions on this and the idea of putting fake KVs with the highest 
> timestamp of the StoreFile in the scanner priority queue.





[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239199#comment-13239199
 ] 

Lars Hofhansl commented on HBASE-3134:
--

Hmm... TestAtomicOperation still failing. Unrelated to this change, though.

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Assignee: Teruyoshi Zenmyo
>Priority: Minor
>  Labels: replication
> Fix For: 0.94.1
>
> Attachments: 3134-v2.txt, 3134-v3.txt, 3134-v4.txt, 3134.txt, 
> HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.






[jira] [Updated] (HBASE-3433) Remove the KV copy of every KV in Scan; introduced by HBASE-3232

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-3433:
-

Summary: Remove the KV copy of every KV in Scan; introduced by HBASE-3232  
(was: Remove the KV copy of every KV in Scan; introduced by HBASE-3232 (why 
doesn't keyonlyfilter make copies rather than mutate -- HBASE-3211)?)

> Remove the KV copy of every KV in Scan; introduced by HBASE-3232
> 
>
> Key: HBASE-3433
> URL: https://issues.apache.org/jira/browse/HBASE-3433
> Project: HBase
>  Issue Type: Improvement
>  Components: performance, regionserver
>Reporter: stack
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.92.0, 0.94.0
>
> Attachments: 3433-v2.txt, 3433-v3.txt, 3433.txt, 
> HBASE-3433-sidenote.patch
>
>
> Here is the offending code from inside StoreScanner#next:
> {code}
>   // kv is no longer immutable due to KeyOnlyFilter! use copy for safety
>   KeyValue copyKv = new KeyValue(kv.getBuffer(), kv.getOffset(), 
> kv.getLength());
> {code}
> This looks wrong given that the philosophy up to now has been to avoid 
> garbage-making copies.
> Maybe this has been looked into before and this is the only thing to be done 
> but why is KeyOnlyFilter not making copies rather than mutating originals?
> Making this critical against 0.92.





[jira] [Updated] (HBASE-4256) Intra-row scanning (part deux)

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4256:
-

Issue Type: Task  (was: New Feature)

> Intra-row scanning (part deux)
> --
>
> Key: HBASE-4256
> URL: https://issues.apache.org/jira/browse/HBASE-4256
> Project: HBase
>  Issue Type: Task
>Affects Versions: 0.90.4
>Reporter: Jean-Daniel Cryans
>Assignee: Dave Revell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 4256.txt
>
>
> Dave Revell was asking on IRC today if there's a way to scan ranges of 
> *qualifiers* within a row. That is, to be able to specify a *start qualifier* 
> and an *end qualifier* so that the Get or Scan seeks directly to the first 
> qualifier and stops at some point which can be predetermined by a qualifier 
> or simply a batch configuration (already exists).
> This is particularly useful for large rows with time-based qualifiers.
> Dave also mentioned that another popular database has such a feature that 
> they call "column slices".
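
To illustrate what such a "column slice" would return, the sketch below models 
a single row as a sorted map of qualifier to value and reads a qualifier range 
with an optional batch limit. The `QualifierRangeExample` class is 
hypothetical and is not the HBase client API; it only shows the intended 
semantics (inclusive start, exclusive end, stop after `batch` results).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical model of one row; this is not the HBase client API.
class QualifierRangeExample {
    // Returns up to `batch` qualifiers in [startQualifier, endQualifier).
    static List<String> slice(NavigableMap<String, String> row,
                              String startQualifier, String endQualifier, int batch) {
        List<String> out = new ArrayList<>();
        // subMap positions directly at the first qualifier in range.
        for (Map.Entry<String, String> e
                : row.subMap(startQualifier, true, endQualifier, false).entrySet()) {
            if (out.size() >= batch) {
                break;
            }
            out.add(e.getKey());
        }
        return out;
    }
}
```

With time-based qualifiers such as `20120310`, a slice from `20120305` to 
`20120401` would cover only the March entries after the 5th.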





[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239190#comment-13239190
 ] 

Lars Hofhansl commented on HBASE-5639:
--

+1 on patch. Although my mind twisted like a pretzel thinking about the correct 
condition here.

> The logic used in waiting for region servers during startup is broken
> -
>
> Key: HBASE-5639
> URL: https://issues.apache.org/jira/browse/HBASE-5639
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: nkeywal
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5639.patch
>
>
> See the tail of HBASE-4993, which I'll report here:
> Me:
> {quote}
> I think a bug was introduced here. Here's the new waiting logic in 
> waitForRegionServers:
> the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>no new region server has checked in for
>   'hbase.master.wait.on.regionservers.interval' time
> And the code that verifies that:
> !(lastCountChange+interval > now && count >= minToStart)
> {quote}
> Nic:
> {quote}
> It seems that changing the code to
> (count < minToStart ||
> lastCountChange+interval > now)
> would make the code work as documented.
> If you have 0 region servers that checked in and you are under the interval, 
> you wait: (true or true) = true.
> If you have 0 region servers but you are above the interval, you wait: (true 
> or false) = true.
> If you have 1 or more region servers that checked in and you are under the 
> interval, you wait: (false or true) = true.
> {quote}
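
The two expressions quoted above can be compared side by side. The sketch below assumes that both are used as the "keep waiting" loop condition, and that the interval is 100 time units with minToStart set to 1; the class and method names are hypothetical and are not taken from the HBase source.

```java
// Compares the two conditions from the issue over the four possible cases.
final class WaitConditionSketch {

    // Condition quoted in the issue as the one currently in the code.
    static boolean originalKeepWaiting(long lastCountChange, long interval,
                                       long now, int count, int minToStart) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    // Condition proposed in the issue.
    static boolean proposedKeepWaiting(long lastCountChange, long interval,
                                       long now, int count, int minToStart) {
        return count < minToStart || lastCountChange + interval > now;
    }

    public static void main(String[] args) {
        long lastCountChange = 0;
        long interval = 100;      // assumed value, for illustration only
        int minToStart = 1;       // assumed value, for illustration only
        long[] nows = {50, 150};  // under the interval, above the interval
        int[] counts = {0, 1};    // below minToStart, at minToStart

        for (int count : counts) {
            for (long now : nows) {
                System.out.println("count=" + count + " now=" + now
                    + " original=" + originalKeepWaiting(
                        lastCountChange, interval, now, count, minToStart)
                    + " proposed=" + proposedKeepWaiting(
                        lastCountChange, interval, now, count, minToStart));
            }
        }
    }
}
```

Under these assumed values the two expressions agree while fewer than minToStart servers have checked in. They differ once minToStart is reached: the original stops waiting while still inside the interval and keeps waiting after the interval has elapsed, whereas the proposed form waits inside the interval and stops after it.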





[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-03-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239185#comment-13239185
 ] 

Hadoop QA commented on HBASE-3134:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520058/3134-v4.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestAtomicOperation

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1318//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1318//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1318//console

This message is automatically generated.

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Assignee: Teruyoshi Zenmyo
>Priority: Minor
>  Labels: replication
> Fix For: 0.94.1
>
> Attachments: 3134-v2.txt, 3134-v3.txt, 3134-v4.txt, 3134.txt, 
> HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.





[jira] [Commented] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239175#comment-13239175
 ] 

Hudson commented on HBASE-5641:
---

Integrated in HBase-0.94 #60 (See 
[https://builds.apache.org/job/HBase-0.94/60/])
HBASE-5641 decayingSampleTick1 prevents HBase from shutting down. (Revision 
1305721)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/CHANGES.txt
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java


> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}
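
A JVM does not exit while any non-daemon thread is still alive, which is why a scheduled executor backed by a non-daemon thread can block shutdown. One possible remedy, sketched below, is to give the executor a thread factory that marks its threads as daemons. This is an illustration using only JDK classes; the factory and class names are hypothetical, and it is not necessarily what the attached patch does.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class DaemonTickSketch {

    // Hypothetical factory: names threads "<prefix><n>" and marks them daemon
    // so that they cannot keep the JVM alive on their own.
    static ThreadFactory daemonFactory(final String prefix) {
        final AtomicInteger counter = new AtomicInteger(1);
        return new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, prefix + counter.getAndIncrement());
                t.setDaemon(true);
                return t;
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService tick =
            Executors.newScheduledThreadPool(1, daemonFactory("decayingSampleTick"));
        // Ask the worker thread itself whether it is a daemon.
        Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
        boolean daemon = tick.schedule(probe, 0, TimeUnit.MILLISECONDS).get();
        System.out.println("tick thread is daemon: " + daemon);
        tick.shutdownNow();
    }
}
```

An alternative with the same effect on shutdown is to keep the threads non-daemon and call shutdown on the executor explicitly during teardown; the daemon approach is shown only because it needs no change at the call sites.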





[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239168#comment-13239168
 ] 

ramkrishna.s.vasudevan commented on HBASE-5598:
---

@Jon
Can you add here the subtasks that you have identified for the findbugs 
categories?
We have some tasks started over here.  We can take up a few of them.

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Attachments: HBASE-5598.patch
>
>
> There are many findbugs errors reported by HbaseQA. HBASE-5597 is going to 
> raise the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs while removing invalid ones, QA cannot report the new ones.
> So I propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Commented] (HBASE-4532) Avoid top row seek by dedicated bloom filter for delete family bloom filter

2012-03-26 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239167#comment-13239167
 ] 

stack commented on HBASE-4532:
--

This feature looks like it's always on (which would make sense).  Can you 
confirm, Liyin?  Thanks.

> Avoid top row seek by dedicated bloom filter for delete family bloom filter
> ---
>
> Key: HBASE-4532
> URL: https://issues.apache.org/jira/browse/HBASE-4532
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Fix For: 0.94.0
>
> Attachments: D27.1.patch, D27.1.patch, HBASE-4532-apache-trunk.patch, 
> hbase-4532-89-fb.patch, hbase-4532-remove-system.out.println.patch
>
>
> The previous jira, HBASE-4469, is to avoid the top row seek operation if 
> row-col bloom filter is enabled. 
> This jira tries to avoid top row seek for all the cases by creating a 
> dedicated bloom filter only for delete family
> The only subtle use case is when we are interested in the top row with empty 
> column.
> For example, 
> we are interested in row1/cf1:/1/put.
> So we seek to the top row: row1/cf1:/MAX_TS/MAXIMUM. And the delete family 
> bloom filter will say there is NO delete family.
> Then it will avoid the top row seek and return a fake kv, which is the last 
> kv for this row (createLastOnRowCol).
> In this way, we have already missed the real kv we are interested in.
> The solution for the above problem is to disable this optimization if we are 
> trying to GET/SCAN a row with empty column.
> Evaluation from TestSeekOptimization:
> Previously:
> For bloom=NONE, compr=NONE total seeks without optimization: 2506, with 
> optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROW, compr=NONE total seeks without optimization: 2506, with 
> optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROWCOL, compr=NONE total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> For bloom=NONE, compr=GZ total seeks without optimization: 2506, with 
> optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROW, compr=GZ total seeks without optimization: 2506, with 
> optimization: 1714 (68.40%), savings: 31.60%
> For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> So we can get about 10% more seek savings ONLY if the ROWCOL bloom filter is 
> enabled.[HBASE-4469]
> 
> After this change:
> For bloom=NONE, compr=NONE total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROW, compr=NONE total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROWCOL, compr=NONE total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> For bloom=NONE, compr=GZ total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROW, compr=GZ total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> For bloom=ROWCOL, compr=GZ total seeks without optimization: 2506, with 
> optimization: 1458 (58.18%), savings: 41.82%
> So we can get about 10% more seek savings for ALL kinds of bloom filter.
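
The percentages in the evaluation above follow directly from the seek counts. The sketch below recomputes them from the two counts reported by TestSeekOptimization (1714 and 1458 seeks out of 2506); the class and method names are hypothetical.

```java
import java.util.Locale;

final class SeekSavingsSketch {

    // Share of seeks still performed with the optimization on, in percent.
    static double remainingPercent(int withoutOptimization, int withOptimization) {
        return 100.0 * withOptimization / withoutOptimization;
    }

    // Share of seeks avoided by the optimization, in percent.
    static double savingsPercent(int withoutOptimization, int withOptimization) {
        return 100.0 - remainingPercent(withoutOptimization, withOptimization);
    }

    static String twoDecimals(double value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }

    public static void main(String[] args) {
        int total = 2506;
        for (int optimized : new int[] {1714, 1458}) {
            System.out.println(optimized + " of " + total + " seeks: "
                + twoDecimals(remainingPercent(total, optimized)) + "% remaining, "
                + twoDecimals(savingsPercent(total, optimized)) + "% saved");
        }
    }
}
```

The gap between the two savings figures, 41.82% versus 31.60%, is the "about 10%" quoted in the description; it is a difference of roughly 10 percentage points.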





[jira] [Updated] (HBASE-5639) The logic used in waiting for region servers during startup is broken

2012-03-26 Thread Jean-Daniel Cryans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-5639:
--

Attachment: HBASE-5639.patch

Patch that just changes the line as shown in the jira's description. I'm 
currently running the small and medium tests.

> The logic used in waiting for region servers during startup is broken
> -
>
> Key: HBASE-5639
> URL: https://issues.apache.org/jira/browse/HBASE-5639
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: nkeywal
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5639.patch
>
>
> See the tail of HBASE-4993, which I'll report here:
> Me:
> {quote}
> I think a bug was introduced here. Here's the new waiting logic in 
> waitForRegionServers:
> the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>no new region server has checked in for
>   'hbase.master.wait.on.regionservers.interval' time
> And the code that verifies that:
> !(lastCountChange+interval > now && count >= minToStart)
> {quote}
> Nic:
> {quote}
> It seems that changing the code to
> (count < minToStart ||
> lastCountChange+interval > now)
> would make the code work as documented.
> If you have 0 region servers that checked in and you are under the interval, 
> you wait: (true or true) = true.
> If you have 0 region servers but you are above the interval, you wait: (true 
> or false) = true.
> If you have 1 or more region servers that checked in and you are under the 
> interval, you wait: (false or true) = true.
> {quote}





[jira] [Commented] (HBASE-4465) Lazy-seek optimization for StoreFile scanners

2012-03-26 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239163#comment-13239163
 ] 

stack commented on HBASE-4465:
--

Is this feature on by default?  It seems to be.  I'm not sure.

> Lazy-seek optimization for StoreFile scanners
> -
>
> Key: HBASE-4465
> URL: https://issues.apache.org/jira/browse/HBASE-4465
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>  Labels: optimization, seek
> Fix For: 0.89.20100924, 0.94.0
>
> Attachments: 
> HBASE-4465_Lazy-seek_optimization_for_St-20111005121052-b2ea8753.patch
>
>
> Previously, if we had several StoreFiles for a column family in a region, we 
> would seek in each of them and only then merge the results, even though the 
> row/column we are looking for might only be in the most recent (and the 
> smallest) file. Now we prioritize our reads from those files so that we check 
> the most recent file first. This is done by doing a "lazy seek" which 
> pretends that the next value in the StoreFile is (seekRow, seekColumn, 
> lastTimestampInStoreFile), which is earlier in the KV order than anything 
> that might actually occur in the file. So if we don't find the result in 
> earlier files, that fake KV will bubble up to the top of the KV heap and a 
> real seek will be done. This is expected to significantly reduce the amount 
> of disk IO (as of 09/22/2011 we are doing dark launch testing and 
> measurement).
> This is joint work with Liyin Tang -- huge thanks to him for many helpful 
> discussions on this and the idea of putting fake KVs with the highest 
> timestamp of the StoreFile in the scanner priority queue.
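
The lazy-seek idea can be sketched with a much simplified model. Here each file is reduced to the versions of a single (row, column) keyed by timestamp, plus the file's highest timestamp, and the heap orders scanners by the timestamp they currently advertise, newest first. All class and method names are hypothetical; this is not the HBase implementation, only an illustration of how the fake key defers real seeks.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.PriorityQueue;
import java.util.TreeMap;

final class LazySeekSketch {

    // Hypothetical stand-in for one StoreFile scanner on a single (row, column).
    static final class FileScanner {
        final long maxTimestamp;                  // highest timestamp in the file
        final NavigableMap<Long, String> versions; // timestamp -> value
        boolean realSeekDone = false;

        FileScanner(long maxTimestamp, NavigableMap<Long, String> versions) {
            this.maxTimestamp = maxTimestamp;
            this.versions = versions;
        }

        // Before the real seek, advertise the fake key built from the file's
        // highest timestamp; afterwards, advertise the newest real version.
        long advertisedTimestamp() {
            if (!realSeekDone) {
                return maxTimestamp;
            }
            return versions.isEmpty() ? Long.MIN_VALUE : versions.lastKey();
        }
    }

    static final class Result {
        final String value;
        final int realSeeks;

        Result(String value, int realSeeks) {
            this.value = value;
            this.realSeeks = realSeeks;
        }
    }

    static NavigableMap<Long, String> versions(long timestamp, String value) {
        NavigableMap<Long, String> map = new TreeMap<>();
        map.put(timestamp, value);
        return map;
    }

    // Returns the newest value and the number of real seeks that were needed.
    static Result getLatest(List<FileScanner> files) {
        Comparator<FileScanner> newestFirst =
            (a, b) -> Long.compare(b.advertisedTimestamp(), a.advertisedTimestamp());
        PriorityQueue<FileScanner> heap = new PriorityQueue<>(newestFirst);
        heap.addAll(files);
        int seeks = 0;
        while (!heap.isEmpty()) {
            FileScanner top = heap.poll();
            if (!top.realSeekDone) {
                // A fake key reached the top: only now pay for the real seek.
                top.realSeekDone = true;
                seeks++;
                if (!top.versions.isEmpty()) {
                    heap.add(top); // re-enter the heap under the real key
                }
                continue;
            }
            // A real key on top is newer than anything the other files can hold.
            return new Result(top.versions.lastEntry().getValue(), seeks);
        }
        return new Result(null, seeks);
    }

    public static void main(String[] args) {
        List<FileScanner> files = Arrays.asList(
            new FileScanner(300, versions(300, "v3")),
            new FileScanner(200, versions(150, "v2")),
            new FileScanner(100, versions(90, "v1")));
        Result r = getLatest(files);
        System.out.println(r.value + " found after " + r.realSeeks + " real seek(s)");
    }
}
```

When the newest file holds the requested cell, the older files are never really sought. When it does not, its scanner drops out and the next file's fake key rises to the top, which is the "bubble up" behavior described above. Ties between a fake and a real key with the same timestamp are not treated specially in this sketch.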





[jira] [Updated] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-03-26 Thread Teruyoshi Zenmyo (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teruyoshi Zenmyo updated HBASE-3134:


Attachment: 3134-v4.txt

@J-D
Thanks for the review.
I attached the patch with a trivial modification. (The constants in the new 
test are replaced with the static variables of TestReplication.)

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Assignee: Teruyoshi Zenmyo
>Priority: Minor
>  Labels: replication
> Fix For: 0.94.1
>
> Attachments: 3134-v2.txt, 3134-v3.txt, 3134-v4.txt, 3134.txt, 
> HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.





[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239155#comment-13239155
 ] 

Lars Hofhansl commented on HBASE-5639:
--

Are you planning to work on this, J-D?
Agree with the blocker status.

> The logic used in waiting for region servers during startup is broken
> -
>
> Key: HBASE-5639
> URL: https://issues.apache.org/jira/browse/HBASE-5639
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: nkeywal
>Priority: Blocker
> Fix For: 0.94.0
>
>
> See the tail of HBASE-4993, which I'll report here:
> Me:
> {quote}
> I think a bug was introduced here. Here's the new waiting logic in 
> waitForRegionServers:
> the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>no new region server has checked in for
>   'hbase.master.wait.on.regionservers.interval' time
> And the code that verifies that:
> !(lastCountChange+interval > now && count >= minToStart)
> {quote}
> Nic:
> {quote}
> It seems that changing the code to
> (count < minToStart ||
> lastCountChange+interval > now)
> would make the code work as documented.
> If you have 0 region servers that checked in and you are under the interval, 
> you wait: (true or true) = true.
> If you have 0 region servers but you are above the interval, you wait: (true 
> or false) = true.
> If you have 1 or more region servers that checked in and you are under the 
> interval, you wait: (false or true) = true.
> {quote}





[jira] [Commented] (HBASE-5636) TestTableMapReduce doesn't work properly.

2012-03-26 Thread Takuya Ueshin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239154#comment-13239154
 ] 

Takuya Ueshin commented on HBASE-5636:
--

Yes, I'm going to work on it this week or this weekend.

> TestTableMapReduce doesn't work properly.
> -
>
> Key: HBASE-5636
> URL: https://issues.apache.org/jira/browse/HBASE-5636
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.1
>Reporter: Takuya Ueshin
>
> No map function is called because no test data is put before the test 
> starts.
> The following three tests are in the same situation:
> - org.apache.hadoop.hbase.mapred.TestTableMapReduce
> - org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
> - org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper





[jira] [Commented] (HBASE-5335) Dynamic Schema Configurations

2012-03-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239149#comment-13239149
 ] 

Hadoop QA commented on HBASE-5335:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12520049/HBASE-5335-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 20 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1317//console

This message is automatically generated.

> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Fix For: 0.96.0
>
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch, D2247.5.patch, D2247.6.patch, D2247.7.patch, 
> HBASE-5335-trunk.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.
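
One way to picture arbitrary key/value settings in the schema is a layered lookup, sketched below with plain maps. The precedence order (column family, then table, then site-wide configuration, then a default) is an assumption made for illustration, as are the class name, the method name, and the setting key; the issue itself does not define any of them.

```java
import java.util.HashMap;
import java.util.Map;

final class SchemaConfigSketch {

    // Resolves a setting by checking the most specific scope first.
    static String resolve(String key,
                          Map<String, String> familySettings,
                          Map<String, String> tableSettings,
                          Map<String, String> siteConfig,
                          String defaultValue) {
        if (familySettings.containsKey(key)) {
            return familySettings.get(key);
        }
        if (tableSettings.containsKey(key)) {
            return tableSettings.get(key);
        }
        if (siteConfig.containsKey(key)) {
            return siteConfig.get(key);
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        // "example.compaction.threshold" is a made-up key, not a real HBase one.
        String key = "example.compaction.threshold";

        Map<String, String> site = new HashMap<>();
        site.put(key, "3");
        Map<String, String> table = new HashMap<>();
        table.put(key, "5");
        Map<String, String> family = new HashMap<>();

        // No per-family override yet, so the table-level value applies.
        System.out.println(resolve(key, family, table, site, "1"));

        // An override added to the schema takes effect for this family only.
        family.put(key, "8");
        System.out.println(resolve(key, family, table, site, "1"));
    }
}
```

With a lookup of this shape, a new tunable needs no reserved keyword anywhere in the stack: it can start as a site-wide value and later be overridden per table or per column family.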





[jira] [Commented] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239145#comment-13239145
 ] 

Lars Hofhansl commented on HBASE-5641:
--

Committed to 0.94 and 0.96

> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}





[jira] [Updated] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5641:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}





[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239144#comment-13239144
 ] 

Hudson commented on HBASE-5209:
---

Integrated in HBase-0.92 #340 (See 
[https://builds.apache.org/job/HBase-0.92/340/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305663)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ServerName.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> HConnection/HMasterInterface should allow for way to get hostname of 
> currently active master in multi-master HBase setup
> 
>
> Key: HBASE-5209
> URL: https://issues.apache.org/jira/browse/HBASE-5209
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.90.5, 0.92.0, 0.94.0
>Reporter: Aditya Acharya
>Assignee: David S. Wang
> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5209.addendum, HBASE_5209_v5.diff
>
>
> I have a multi-master HBase set up, and I'm trying to programmatically 
> determine which of the masters is currently active. But the API does not 
> allow me to do this. There is a getMaster() method in the HConnection class, 
> but it returns an HMasterInterface, whose methods do not allow me to find out 
> which master won the last race. The API should have a 
> getActiveMasterHostname() or something to that effect.





[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239143#comment-13239143
 ] 

Hudson commented on HBASE-5596:
---

Integrated in HBase-0.92 #340 (See 
[https://builds.apache.org/job/HBase-0.92/340/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305663)

 Result = FAILURE
jmhsieh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ServerName.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5596.patch, hbase-5596-0.94.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Commented] (HBASE-5619) Create PB protocols for HRegionInterface

2012-03-26 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239140#comment-13239140
 ] 

jirapos...@reviews.apache.org commented on HBASE-5619:
--



bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > Excellent.
bq.  > 
bq.  > It looks like we can commit this w/o breaking what's currently there?

That's right.  It won't break anything.  It is just proto files in this patch. 
Thanks for reviewing.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 21
bq.  > 
bq.  >
bq.  > What did we figure on the package name?  Shouldn't it agree w/ the 
dir that holds the .proto files at src/main?  Currently one is protobuf and the 
other is proto.

There is already an existing folder called protobuf (rest).  Let me change the 
dir that holds the .proto files under src/main to protobuf.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 35
bq.  > 
bq.  >
bq.  > Why are these two booleans broken out?  Aren't they in the 
attributes of HRI already?  Why repeat them?

I used to put these transient parameters in the protobuf RegionInfo as well.  
However, Todd thought it's better to put them outside.

What do you think?  To me, it is fine either way.  However, if we are going 
to replace HRI with the protobuf later on, it may be better to put them 
together.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 70
bq.  > 
bq.  >
bq.  > Should we be repeating the API documentation that is up in the 
HRegionInterface that this .proto replaces here?  If its not here, where will 
it be?  Not all of the javadoc should make it over -- the stuff that says 
nothing shouldn't but some is of worth.  What you think?

I thought about this.  I added documentation to RegionClient.proto.  For methods 
in RegionAdmin, the method names seem to be very clear.  I will take another 
look and add documentation where confusion or misunderstanding could arise.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 87
bq.  > 
bq.  >
bq.  > Isn't the response currently a void?   And isn't flush async (IIRC)? 
 If so, under what circumstance would we be able to fill out this response?

I was thinking to set it if the HRegion.flushCache call returned true.  It is 
just informational.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 112
bq.  > 
bq.  >
bq.  > WALKey maps to HLogKey?  Maybe add a comment to this effect?

Sure.  Will add a comment.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 141
bq.  > 
bq.  >
bq.  > Good.  I like this method name better.  Should be a comment which 
points back to the old name?  Or what you think?

Ok.  Will do.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionAdmin.proto, line 151
bq.  > 
bq.  >
bq.  > Yeah, this proto is missing commentary.   I mean, the return here 
should be explained?

Ok, will add a comment too.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionClient.proto, line 52
bq.  > 
bq.  >
bq.  > This is new?  Being able to do this?  How will it be used?

Some calls expect a region name; others expect an encoded region name.  
With a specifier, we can handle both.
An encoded region name works only if the region is online.  If we can't find the 
region based on the specifier, a region-not-found exception will be thrown.

This simplifies the request a little, since we don't have to ask for both a 
region name and an encoded region name and then check that only one is specified.

I will add a comment for it.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionClient.proto, line 66
bq.  > 
bq.  >
bq.  > This is new feature on get?  Or just special handling of an 
attribute?

This is for the exist() call.  In this case, the caller doesn't care about the 
result. They just want to know if the row is there.  It is not special handling 
of an attribute.


bq.  On 2012-03-26 23:21:58, Michael Stack wrote:
bq.  > src/main/proto/RegionClient.proto, line 71
bq.  > 

[jira] [Commented] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239139#comment-13239139
 ] 

Ted Yu commented on HBASE-5641:
---

+1 on patch.

> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}





[jira] [Commented] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239138#comment-13239138
 ] 

Lars Hofhansl commented on HBASE-5641:
--

This fixes the problem for me. Please have a look, so that I can trigger 0.94 
build tonight.

> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}





[jira] [Updated] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5641:
-

Assignee: Lars Hofhansl
  Status: Patch Available  (was: Open)

> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}





[jira] [Updated] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5641:
-

Attachment: 5641.txt

Create a daemon thread factory instead.
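
For readers without access to 5641.txt, the general shape of a daemon thread factory is sketched below. This is an assumption-laden illustration, not the attached patch: the helper name `daemonThreadFactory` and the wrapper class are hypothetical, and HBase's own `Threads` utility may expose something different.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not the code from 5641.txt.
final class DaemonThreadFactorySketch {

  /** Returns a factory whose threads are named prefix + counter and marked as daemons. */
  static ThreadFactory daemonThreadFactory(final String prefix) {
    final AtomicInteger counter = new AtomicInteger(1);
    return runnable -> {
      Thread t = new Thread(runnable, prefix + counter.getAndIncrement());
      // Daemon threads do not keep the JVM alive once all non-daemon threads exit.
      t.setDaemon(true);
      return t;
    };
  }

  public static void main(String[] args) {
    ScheduledExecutorService tickService =
        Executors.newScheduledThreadPool(1, daemonThreadFactory("decayingSampleTick"));
    // Placeholder periodic task; the real tick logic is not shown here.
    tickService.scheduleAtFixedRate(() -> { }, 0, 1, TimeUnit.SECONDS);
    // main returns here; because the worker is a daemon thread, the JVM can exit
    // even though tickService was never shut down.
  }
}
```

The trade-off is that daemon threads are stopped abruptly at JVM exit, which is usually acceptable for a periodic metrics tick but not for work that must complete.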

> decayingSampleTick1 prevents HBase from shutting down.
> --
>
> Key: HBASE-5641
> URL: https://issues.apache.org/jira/browse/HBASE-5641
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5641.txt
>
>
> I think this is the problem. It creates a non-daemon thread.
> {code}
>   private static final ScheduledExecutorService TICK_SERVICE = 
>   Executors.newScheduledThreadPool(1, 
>   Threads.getNamedThreadFactory("decayingSampleTick"));
> {code}





[jira] [Created] (HBASE-5641) decayingSampleTick1 prevents HBase from shutting down.

2012-03-26 Thread Lars Hofhansl (Created) (JIRA)
decayingSampleTick1 prevents HBase from shutting down.
--

 Key: HBASE-5641
 URL: https://issues.apache.org/jira/browse/HBASE-5641
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.0, 0.96.0
 Attachments: 5641.txt

I think this is the problem. It creates a non-daemon thread.
{code}
  private static final ScheduledExecutorService TICK_SERVICE = 
  Executors.newScheduledThreadPool(1, 
  Threads.getNamedThreadFactory("decayingSampleTick"));
{code}






[jira] [Commented] (HBASE-5533) Add more metrics to HBase

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239129#comment-13239129
 ] 

Lars Hofhansl commented on HBASE-5533:
--

Now, do we fix it, or do I take it out of 0.94.0?

> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.2, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
> hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
> hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files
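
The kind of percentile summary requested in the list above ("90% of reads completed in under 100ms") can be illustrated with a plain nearest-rank calculation over a fixed set of samples. This is a minimal sketch under that assumption; it is not how the HBASE-5533 patch computes its histograms, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative nearest-rank percentile over a fixed sample set; names are hypothetical.
final class LatencyPercentiles {

  /**
   * Nearest-rank percentile: the smallest sample such that at least
   * p percent of all samples are less than or equal to it.
   */
  static long percentile(long[] latenciesMs, double p) {
    if (latenciesMs.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    if (p <= 0 || p > 100) {
      throw new IllegalArgumentException("p must be in (0, 100]");
    }
    long[] sorted = latenciesMs.clone(); // leave the caller's array untouched
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
    return sorted[rank - 1];
  }

  public static void main(String[] args) {
    long[] readLatenciesMs = {100, 10, 90, 20, 80, 30, 70, 40, 60, 50};
    System.out.println("p90 = " + percentile(readLatenciesMs, 90) + " ms");
    System.out.println("p99 = " + percentile(readLatenciesMs, 99) + " ms");
  }
}
```

A production metric would more likely keep a bounded or decaying sample rather than sorting every recorded latency.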





[jira] [Commented] (HBASE-5533) Add more metrics to HBase

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239127#comment-13239127
 ] 

Lars Hofhansl commented on HBASE-5533:
--

While testing a 0.94.0RC I found that this is preventing HBase from shutting 
down.
{code}
"decayingSampleTick1" prio=10 tid=0x7faf5c047000 nid=0x216c waiting on condition [0x7faf345c1000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0xb853e308> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
    at java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1103)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
{code}
This is the only non-daemon thread I found.
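
A check like this can also be done from inside the JVM instead of reading a thread dump. The sketch below lists live non-daemon threads using only standard `java.lang.Thread` calls; the class and method names are hypothetical, and this is not how the thread above was actually found.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; names are hypothetical.
final class NonDaemonThreadLister {

  /** Names of all live threads in this JVM that are not daemon threads. */
  static List<String> nonDaemonThreadNames() {
    List<String> names = new ArrayList<>();
    // getAllStackTraces() returns a snapshot keyed by every live thread.
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      if (t.isAlive() && !t.isDaemon()) {
        names.add(t.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    for (String name : nonDaemonThreadNames()) {
      System.out.println(name);
    }
  }
}
```

Any thread printed here other than the one running the shutdown itself is a candidate for keeping the process alive.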


> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.2, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
> hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
> hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files





[jira] [Commented] (HBASE-5636) TestTableMapReduce doesn't work properly.

2012-03-26 Thread Enis Soztutar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239125#comment-13239125
 ] 

Enis Soztutar commented on HBASE-5636:
--

Very good find. It seems that HBASE-4503 might have caused it. Do you plan to 
work on this? 

> TestTableMapReduce doesn't work properly.
> -
>
> Key: HBASE-5636
> URL: https://issues.apache.org/jira/browse/HBASE-5636
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.1
>Reporter: Takuya Ueshin
>
> No map function is called because no test data is put before the test 
> starts.
> The following three tests are in the same situation:
> - org.apache.hadoop.hbase.mapred.TestTableMapReduce
> - org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
> - org.apache.hadoop.hbase.mapreduce.TestMulitthreadedTableMapper





[jira] [Updated] (HBASE-5335) Dynamic Schema Configurations

2012-03-26 Thread Nicolas Spiegelberg (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg updated HBASE-5335:
---

Fix Version/s: 0.96.0
   Status: Patch Available  (was: Open)

> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Fix For: 0.96.0
>
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch, D2247.5.patch, D2247.6.patch, D2247.7.patch, 
> HBASE-5335-trunk.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.
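
The idea of arbitrary key/value settings in the schema can be sketched as a layered lookup. Everything below is an assumption made for illustration: the class name, the setter names, and in particular the precedence order (column family over table over cluster-wide configuration) are not taken from the HBASE-5335 patch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative layered lookup; names and precedence order are assumptions.
final class LayeredSettingsSketch {

  private final Map<String, String> clusterConf = new HashMap<>();
  private final Map<String, String> tableSettings = new HashMap<>();
  private final Map<String, String> familySettings = new HashMap<>();

  void setCluster(String key, String value) { clusterConf.put(key, value); }
  void setTable(String key, String value) { tableSettings.put(key, value); }
  void setFamily(String key, String value) { familySettings.put(key, value); }

  /** Most specific scope wins: column family, then table, then cluster, then the default. */
  String get(String key, String defaultValue) {
    if (familySettings.containsKey(key)) {
      return familySettings.get(key);
    }
    if (tableSettings.containsKey(key)) {
      return tableSettings.get(key);
    }
    if (clusterConf.containsKey(key)) {
      return clusterConf.get(key);
    }
    return defaultValue;
  }

  public static void main(String[] args) {
    LayeredSettingsSketch settings = new LayeredSettingsSketch();
    settings.setCluster("example.setting", "cluster-value");
    settings.setTable("example.setting", "table-value");
    // The table-level value shadows the cluster-wide one.
    System.out.println(settings.get("example.setting", "default"));
  }
}
```

Because no reserved keyword is needed per setting, a new tunable can be tried on one table and later removed without a long-term API commitment.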





[jira] [Updated] (HBASE-5335) Dynamic Schema Configurations

2012-03-26 Thread Nicolas Spiegelberg (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg updated HBASE-5335:
---

Attachment: HBASE-5335-trunk.patch

Initial patch for trunk.  I think I should change 'ADVANCED' to config and a 
couple other minor things, so please don't commit.  This is for hadoop QA plus 
for you to fiddle with.  Note that TestFromClientSide3 should fail until 
HBASE-5359 is committed.  I made a preliminary patch for HBASE-5359 and 
verified that it fixed the test.

> Dynamic Schema Configurations
> -
>
> Key: HBASE-5335
> URL: https://issues.apache.org/jira/browse/HBASE-5335
> Project: HBase
>  Issue Type: New Feature
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>  Labels: configuration, schema
> Attachments: D2247.1.patch, D2247.2.patch, D2247.3.patch, 
> D2247.4.patch, D2247.5.patch, D2247.6.patch, D2247.7.patch, 
> HBASE-5335-trunk.patch
>
>
> Currently, the ability for a core developer to add per-table & per-CF 
> configuration settings is very heavyweight.  You need to add a reserved 
> keyword all the way up the stack & you have to support this variable 
> long-term if you're going to expose it explicitly to the user.  This has 
> ended up with using Configuration.get() a lot because it is lightweight and 
> you can tweak settings while you're trying to understand system behavior 
> [since there are many config params that may never need to be tuned].  We 
> need to add the ability to put & read arbitrary KV settings in the HBase 
> schema.  Combined with online schema change, this will allow us to safely 
> iterate on configuration settings.





[jira] [Created] (HBASE-5640) bulk load runs slowly than before

2012-03-26 Thread dhruba borthakur (Created) (JIRA)
bulk load runs slowly than before
-

 Key: HBASE-5640
 URL: https://issues.apache.org/jira/browse/HBASE-5640
 Project: HBase
  Issue Type: Bug
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor


I am loading data from an external system into HBase. There are many log lines of 
the form:

on different filesystem than destination store - moving to this filesystem

This is possibly a regression caused by a recent patch.





[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239043#comment-13239043
 ] 

Hadoop QA commented on HBASE-5606:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12520026/0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1314//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1314//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1314//console

This message is automatically generated.

> SplitLogManger async delete node hangs log splitting when ZK connection is 
> lost 
> 
>
> Key: HBASE-5606
> URL: https://issues.apache.org/jira/browse/HBASE-5606
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.92.0
>Reporter: Gopinathan A
>Priority: Critical
> Fix For: 0.92.2
>
> Attachments: 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch, 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch
>
>
> 1. One RS died; the ServerShutdownHandler detected this and started 
> distributed log splitting.
> 2. All tasks failed because the ZK connection was lost, so all the tasks were 
> deleted asynchronously.
> 3. ServerShutdownHandler retried the log splitting.
> 4. The asynchronous deletion from step 2 finally happened, but hit the new task.
> 5. This left the SplitLogManager in a hung state.
> This leads to the .META. region not being assigned for a long time.
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(55413,79):2012-03-14 
> 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89303,79):2012-03-14 
> 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(80417,99):2012-03-14 
> 19:34:31,196 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89456,99):2012-03-14 
> 19:34:32,497 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}





[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239040#comment-13239040
 ] 

Hadoop QA commented on HBASE-5598:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520025/HBASE-5598.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 13 javac compiler warnings (more 
than the trunk's current 5 warnings).

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1313//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1313//console

This message is automatically generated.

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Attachments: HBASE-5598.patch
>
>
> Many findbugs errors are reported by HBase QA. HBASE-5597 is going to 
> raise the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs while removing invalid ones, QA will not be able to report them.
> So I propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239037#comment-13239037
 ] 

stack commented on HBASE-5598:
--

I'm good w/ this patch.  We should add your instruction on how to gen the 
findbugs report to the release notes for this issue and to the reference guide 
(I can add it there if you add it to the release notes).  I thought we were 
building the findbugs report when we generated the site but we are not doing 
this it seems.  Findbugs is being generated by the hadoopqa builds.

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Attachments: HBASE-5598.patch
>
>
> Many findbugs errors are reported by HBase QA. HBASE-5597 is going to 
> raise the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs while removing invalid ones, QA will not be able to report them.
> So I propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Updated] (HBASE-5633) NPE reading ZK config in HBase

2012-03-26 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5633:
--

Assignee: Matteo Bertozzi

> NPE reading ZK config in HBase
> --
>
> Key: HBASE-5633
> URL: https://issues.apache.org/jira/browse/HBASE-5633
> Project: HBase
>  Issue Type: Bug
>  Components: zookeeper
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: HBASE-5633-0.90.patch, HBASE-5633-0.92.patch, 
> HBASE-5633-v1.patch, HBASE-5633-v2.patch
>
>
> If zoo.cfg contains server.* ("server.0=server0:2888:3888\n") and 
> cluster.distributed property (in hbase-site.xml) is empty we get an NPE in 
> parseZooCfg().
> The easy way to reproduce the bug is running 
> org.apache.hbase.zookeeper.TestHQuorumPeer with hbase-site.xml containing:
> {code}
> <property>
>   <name>hbase.cluster.distributed</name>
>   <value></value>
> </property>
> {code}





[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"

2012-03-26 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239009#comment-13239009
 ] 

Jesse Yates commented on HBASE-5547:


HBASE-50 would build on this functionality (I am just getting to where I need it 
anyway), so I thought I could knock this out the 'right' way and then later wrap 
it into HBASE-50. The reference stuff is a bit involved and I would be fine if 
we just moved the files out to another directory. Backup can then just check 
file times and remove the files that were backed up since that time, or copy the 
ones that are currently being used.

Timestamps can just be looked up per backed-up file or possibly grouped by 
directory (though the latter seems prone to fragmentation).

bq. HBase is in "backup mode", maybe by the existence of a specific zNode, 
although other models are possible as well.

Like this idea - only backing up while the switch is on, but otherwise should 
help keep cruft down (not be part of normal operation).

> Don't delete HFiles when in "backup mode"
> -
>
> Key: HBASE-5547
> URL: https://issues.apache.org/jira/browse/HBASE-5547
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>
> This came up in a discussion I had with Stack.
> It would be nice if HBase could be notified that a backup is in progress (via 
> a znode for example) and in that case either:
> 1. rename HFiles to be deleted to .bck
> 2. rename the HFiles into a special directory
> 3. rename them to a general trash directory (which would not need to be tied 
> to backup mode).
> That way it should be possible to get a consistent backup based on HFiles (HDFS 
> snapshots or hard links would be better options here, but we do not have 
> those).
> #1 makes cleanup a bit harder.





[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239004#comment-13239004
 ] 

Hudson commented on HBASE-5209:
---

Integrated in HBase-0.94-security #3 (See 
[https://builds.apache.org/job/HBase-0.94-security/3/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305662)

 Result = ABORTED
jmhsieh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ServerName.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> HConnection/HMasterInterface should allow for way to get hostname of 
> currently active master in multi-master HBase setup
> 
>
> Key: HBASE-5209
> URL: https://issues.apache.org/jira/browse/HBASE-5209
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.90.5, 0.92.0, 0.94.0
>Reporter: Aditya Acharya
>Assignee: David S. Wang
> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5209.addendum, HBASE_5209_v5.diff
>
>
> I have a multi-master HBase set up, and I'm trying to programmatically 
> determine which of the masters is currently active. But the API does not 
> allow me to do this. There is a getMaster() method in the HConnection class, 
> but it returns an HMasterInterface, whose methods do not allow me to find out 
> which master won the last race. The API should have a 
> getActiveMasterHostname() or something to that effect.





[jira] [Commented] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239008#comment-13239008
 ] 

Hudson commented on HBASE-5209:
---

Integrated in HBase-0.94 #58 (See 
[https://builds.apache.org/job/HBase-0.94/58/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305662)

 Result = ABORTED
jmhsieh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ServerName.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> HConnection/HMasterInterface should allow for way to get hostname of 
> currently active master in multi-master HBase setup
> 
>
> Key: HBASE-5209
> URL: https://issues.apache.org/jira/browse/HBASE-5209
> Project: HBase
>  Issue Type: Improvement
>  Components: master
> Affects Versions: 0.90.5, 0.92.0, 0.94.0
> Reporter: Aditya Acharya
> Assignee: David S. Wang
> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5209.addendum, HBASE_5209_v5.diff
>
>
> I have a multi-master HBase set up, and I'm trying to programmatically 
> determine which of the masters is currently active. But the API does not 
> allow me to do this. There is a getMaster() method in the HConnection class, 
> but it returns an HMasterInterface, whose methods do not allow me to find out 
> which master won the last race. The API should have a 
> getActiveMasterHostname() or something to that effect.





[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239007#comment-13239007
 ] 

Hudson commented on HBASE-5596:
---

Integrated in HBase-0.94 #58 (See 
[https://builds.apache.org/job/HBase-0.94/58/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305662)

 Result = ABORTED
jmhsieh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ServerName.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
> Affects Versions: 0.92.1, 0.94.0
> Reporter: David S. Wang
> Assignee: David S. Wang
> Priority: Minor
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5596.patch, hbase-5596-0.94.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Commented] (HBASE-5533) Add more metrics to HBase

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239006#comment-13239006
 ] 

Hudson commented on HBASE-5533:
---

Integrated in HBase-0.94 #58 (See 
[https://builds.apache.org/job/HBase-0.94/58/])
HBASE-5533 Add more metrics to HBase (Shaneal Manek) (Revision 1305582)

 Result = ABORTED
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/ExactCounterMetric.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/MetricsHistogram.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Sample.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Snapshot.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/UniformSample.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/metrics/TestExactCounterMetric.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/metrics/TestExponentiallyDecayingSample.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/metrics/TestMetricsHistogram.java


> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
> Affects Versions: 0.92.2, 0.94.0
> Reporter: Shaneal Manek
> Assignee: Shaneal Manek
> Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
> hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
> hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files
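The "90% of reads completed in under 100ms" style of figure requested above can be illustrated with a nearest-rank percentile over a window of recent latencies. This is only a minimal sketch: the class and method names are hypothetical, and it is not the sampling code that the HBASE-5533 patch itself adds.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of HBase): nearest-rank percentiles over recent latencies. */
final class LatencyPercentiles {

    /**
     * Returns the smallest sample value v such that at least a fraction q
     * of the samples are less than or equal to v.
     */
    static long percentile(long[] latenciesMs, double q) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("empty sample");
        }
        if (q <= 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in (0, 1]");
        }
        long[] sorted = latenciesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.length); // 1-based nearest rank
        return sorted[rank - 1];
    }
}
```

Calling `LatencyPercentiles.percentile(sample, 0.99)` on a window of read latencies gives the value under which 99% of the sampled reads completed. Sorting a full copy is fine for a sketch; a production metric would more likely keep a bounded sample.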





[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238999#comment-13238999
 ] 

Hudson commented on HBASE-5190:
---

Integrated in HBase-0.94-security #3 (See 
[https://builds.apache.org/job/HBase-0.94-security/3/])
HBASE-5190 Limit the IPC queue size based on calls' payload size
   (Ted's addendum) (Revision 1305469)

 Result = ABORTED
jdcryans : 
Files : 
* /hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java


> Limit the IPC queue size based on calls' payload size
> -
>
> Key: HBASE-5190
> URL: https://issues.apache.org/jira/browse/HBASE-5190
> Project: HBase
>  Issue Type: Improvement
> Affects Versions: 0.90.5
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5190.addendum, HBASE-5190-v2.patch, HBASE-5190-v3.patch, 
> HBASE-5190.patch
>
>
> Currently we limit the number of calls in the IPC queue only on their count. 
> It used to be really high and was dropped down recently to num_handlers * 10 
> (so 100 by default) because it was easy to OOME yourself when huge calls were 
> being queued. It's still possible to hit this problem if you use really big 
> values and/or a lot of handlers, so the idea is that we should take into 
> account the payload size. I can see 3 solutions:
>  - Do the accounting outside of the queue itself for all calls coming in and 
> out and when a call doesn't fit, throw a retryable exception.
>  - Same accounting but instead block the call when it comes in until space is 
> made available.
>  - Add a new parameter for the maximum size (in bytes) of a Call and then set 
> the size the IPC queue (in terms of the number of items) so that it could 
> only contain as many items as some predefined maximum size (in bytes) for the 
> whole queue.
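The first option listed above (accounting kept outside the queue, with a retryable rejection when a call does not fit) could be sketched roughly as follows. The class and method names are hypothetical and the byte limit is an assumed configuration value; this is not the code from the attached patches.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch (not HBase code): payload-size accounting kept outside the call queue. */
final class CallQueueSizeLimiter {
    private final long maxQueueBytes;                 // assumed configurable limit
    private final AtomicLong queuedBytes = new AtomicLong();

    CallQueueSizeLimiter(long maxQueueBytes) {
        this.maxQueueBytes = maxQueueBytes;
    }

    /**
     * Tries to reserve room for an incoming call. A false return means the
     * caller should reject the call with a retryable error instead of queueing it.
     */
    boolean tryReserve(long callBytes) {
        while (true) {
            long current = queuedBytes.get();
            long next = current + callBytes;
            if (next > maxQueueBytes) {
                return false;
            }
            if (queuedBytes.compareAndSet(current, next)) {
                return true;
            }
            // Another thread changed the total in between; retry with the fresh value.
        }
    }

    /** Gives the bytes back once a handler has taken the call off the queue. */
    void release(long callBytes) {
        queuedBytes.addAndGet(-callBytes);
    }

    long currentBytes() {
        return queuedBytes.get();
    }
}
```

The second option in the list would differ only in blocking inside the reserve step until enough bytes have been released, rather than returning false.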





[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239003#comment-13239003
 ] 

Hudson commented on HBASE-5615:
---

Integrated in HBase-0.94-security #3 (See 
[https://builds.apache.org/job/HBase-0.94-security/3/])
HBASE-5615 the master never does balance because of balancing the parent 
region (Xufeng) (Revision 1305172)

 Result = ABORTED
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


> the master never does balance because of balancing the parent region
> 
>
> Key: HBASE-5615
> URL: https://issues.apache.org/jira/browse/HBASE-5615
> Project: HBase
>  Issue Type: Bug
> Affects Versions: 0.90.7
> Reporter: xufeng
> Assignee: xufeng
> Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
> NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html
>
>
> The master never balances because, when it runs rebuildUserRegions(), it 
> adds the parent region to AssignmentManager#servers.
> If the balancer then tries to move the parent region, the parent stays in RIT 
> forever, so balancing is never executed.





[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239002#comment-13239002
 ] 

Hudson commented on HBASE-5596:
---

Integrated in HBase-0.94-security #3 (See 
[https://builds.apache.org/job/HBase-0.94-security/3/])
HBASE-5596 Few minor bugs from HBASE-5209 (David S. Wang) (Revision 1305662)

 Result = ABORTED
jmhsieh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ServerName.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
> Affects Versions: 0.92.1, 0.94.0
> Reporter: David S. Wang
> Assignee: David S. Wang
> Priority: Minor
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5596.patch, hbase-5596-0.94.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Commented] (HBASE-5533) Add more metrics to HBase

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239000#comment-13239000
 ] 

Hudson commented on HBASE-5533:
---

Integrated in HBase-0.94-security #3 (See 
[https://builds.apache.org/job/HBase-0.94-security/3/])
HBASE-5533 Add more metrics to HBase (Shaneal Manek) (Revision 1305582)

 Result = ABORTED
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/ExactCounterMetric.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/MetricsHistogram.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Sample.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/Snapshot.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/metrics/histogram/UniformSample.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/metrics/TestExactCounterMetric.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/metrics/TestExponentiallyDecayingSample.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/metrics/TestMetricsHistogram.java


> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
> Affects Versions: 0.92.2, 0.94.0
> Reporter: Shaneal Manek
> Assignee: Shaneal Manek
> Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
> hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
> hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files





[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239001#comment-13239001
 ] 

Hudson commented on HBASE-5623:
---

Integrated in HBase-0.94-security #3 (See 
[https://builds.apache.org/job/HBase-0.94-security/3/])
HBASE-5623 Race condition when rolling the HLog and hlogFlush (Enis 
Soztutar and LarsH) (Revision 1305549)

 Result = ABORTED
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollingNoCluster.java


> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
> Affects Versions: 0.94.0
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at H

[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238990#comment-13238990
 ] 

Hadoop QA commented on HBASE-2600:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12520024/hbase-2600-root.dir.tgz
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1315//console

This message is automatically generated.

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
> Reporter: stack
> Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, HBASE-2600+5217-Sun-Mar-25-2012-v3.patch, 
> HBASE-2600+5217-Sun-Mar-25-2012-v4.patch, hbase-2600-root.dir.tgz, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.
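The lookup difference described above can be illustrated with an in-memory sorted map standing in for the meta table. This is a simplified sketch under stated assumptions: keys are bare row strings rather than full tablename+row+id region names, end rows are treated as exclusive, and the `LAST` sentinel for the final region is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a TreeMap stands in for the meta table; keys are simplified. */
final class MetaLookupSketch {
    /** Hypothetical stand-in for "no upper bound" on the last region's end row. */
    static final String LAST = "\uffff";

    /**
     * Keyed by (exclusive) end row: the containing region is simply the first
     * entry that sorts after the wanted row, i.e. a plain forward scan.
     */
    static String regionByEndRow(TreeMap<String, String> metaByEndRow, String row) {
        Map.Entry<String, String> entry = metaByEndRow.higherEntry(row);
        return entry == null ? null : entry.getValue();
    }

    /**
     * Keyed by start row: the containing region is the closest entry at or
     * before the wanted row, which is the backward search the issue calls
     * getClosestRowBefore.
     */
    static String regionByStartRow(TreeMap<String, String> metaByStartRow, String row) {
        Map.Entry<String, String> entry = metaByStartRow.floorEntry(row);
        return entry == null ? null : entry.getValue();
    }
}
```

Both methods return the same region for a given row; the point of the proposal is that the first one maps onto an ordinary forward scan, while the second needs a search that walks backward.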





[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"

2012-03-26 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238977#comment-13238977
 ] 

stack commented on HBASE-5547:
--

bq. Thoughts on how long we should keep around files? Indefinitely? 

Yes. I'd think some other process other than hbase would be responsible for 
their cleanup.

> Don't delete HFiles when in "backup mode"
> -
>
> Key: HBASE-5547
> URL: https://issues.apache.org/jira/browse/HBASE-5547
> Project: HBase
>  Issue Type: New Feature
> Reporter: Lars Hofhansl
>
> This came up in a discussion I had with Stack.
> It would be nice if HBase could be notified that a backup is in progress (via 
> a znode for example) and in that case either:
> 1. rename HFiles that are to be deleted to .bck
> 2. rename the HFiles into a special directory
> 3. rename them to a general trash directory (which would not need to be tied 
> to backup mode).
> That way it should be able to get a consistent backup based on HFiles (HDFS 
> snapshots or hard links would be better options here, but we do not have 
> those).
> #1 makes cleanup a bit harder.





[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken

2012-03-26 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238973#comment-13238973
 ] 

Jean-Daniel Cryans commented on HBASE-5639:
---

Oh I forgot to mention that I'm marking this as a blocker for 0.94.0 because 
right now if you start a sizable cluster you may end up with region servers 
that check in too late and miss the re-assignment of regions.

> The logic used in waiting for region servers during startup is broken
> -
>
> Key: HBASE-5639
> URL: https://issues.apache.org/jira/browse/HBASE-5639
> Project: HBase
>  Issue Type: Bug
> Reporter: Jean-Daniel Cryans
> Assignee: nkeywal
> Priority: Blocker
> Fix For: 0.94.0
>
>
> See the tail of HBASE-4993, which I'll report here:
> Me:
> {quote}
> I think a bug was introduced here. Here's the new waiting logic in 
> waitForRegionServers:
> the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>there have been no new region server in for
>   'hbase.master.wait.on.regionservers.interval' time
> And the code that verifies that:
> !(lastCountChange+interval > now && count >= minToStart)
> {quote}
> Nic:
> {quote}
> It seems that changing the code to
> (count < minToStart ||
> lastCountChange+interval > now)
> would make the code work as documented.
> If you have 0 region servers that checked in and you are under the interval, 
> you wait: (true or true) = true.
> If you have 0 region servers but you are above the interval, you wait: (true 
> or false) = true.
> If you have 1 or more region servers that checked in and you are under the 
> interval, you wait: (false or true) = true.
> {quote}
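The two conditions quoted above can be compared side by side in a small sketch. It assumes that both expressions are the "keep waiting" condition of the loop in waitForRegionServers; the class and method names are hypothetical and the parameters simply mirror the variables named in the quotes.

```java
/** Hypothetical sketch comparing the two wait conditions quoted in the issue. */
final class RegionServerWaitSketch {

    /** Condition as quoted from the current code; true means "keep waiting". */
    static boolean keepWaitingCurrent(int count, int minToStart,
                                      long lastCountChange, long interval, long now) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    /** Condition as suggested in the issue; true means "keep waiting". */
    static boolean keepWaitingProposed(int count, int minToStart,
                                       long lastCountChange, long interval, long now) {
        return count < minToStart || lastCountChange + interval > now;
    }
}
```

With enough region servers checked in and the interval still running, the quoted current form stops waiting at once while the proposed form keeps waiting. Once the interval has elapsed with enough servers, the current form keeps waiting and the proposed form stops, which is the fourth case not spelled out in the quote.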





[jira] [Commented] (HBASE-5619) Create PB protocols for HRegionInterface

2012-03-26 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238972#comment-13238972
 ] 

jirapos...@reviews.apache.org commented on HBASE-5619:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4054/#review6365
---


Excellent.

It looks like we can commit this w/o breaking what's currently there?


src/main/proto/RegionAdmin.proto


What did we figure on the package name?  Shouldn't it agree w/ the dir that 
holds the .proto files at src/main?  Currently one is protobuf and the other is 
proto.



src/main/proto/RegionAdmin.proto


Why are these two booleans broken out?  Aren't they in the attributes of 
HRI already?  Why repeat them?



src/main/proto/RegionAdmin.proto


Good.  It's kinda dumb the way our HRegionInterface is now where it has an 
override, one that takes single family and another that takes an array.  Thanks 
for collapsing.



src/main/proto/RegionAdmin.proto


Should we be repeating the API documentation that is up in the 
HRegionInterface that this .proto replaces here?  If it's not here, where will 
it be?  Not all of the javadoc should make it over -- the stuff that says 
nothing shouldn't but some is of worth.  What you think?



src/main/proto/RegionAdmin.proto


Again, thanks for doing the work collapsing the overrides that are up in 
HRegionInterface.



src/main/proto/RegionAdmin.proto


Isn't the response currently a void?   And isn't flush async (IIRC)?  If 
so, under what circumstance would we be able to fill out this response?



src/main/proto/RegionAdmin.proto


WALKey maps to HLogKey?  Maybe add a comment to this effect?



src/main/proto/RegionAdmin.proto


Good.  I like this method name better.  Should be a comment which points 
back to the old name?  Or what you think?



src/main/proto/RegionAdmin.proto


Yeah, this proto is missing commentary.   I mean, the return here should be 
explained?



src/main/proto/RegionAdmin.proto


This will for sure grow w/ time.



src/main/proto/RegionAdmin.proto


Nice.  So you are splitting HRegionInterface into admin and client?  Good.



src/main/proto/RegionClient.proto


This is new?  Being able to do this?  How will it be used?



src/main/proto/RegionClient.proto


This is new feature on get?  Or just special handling of an attribute?



src/main/proto/RegionClient.proto


We don't have this in the java Result.  I don't understand why this is 
making its way into the object.



src/main/proto/RegionClient.proto


ditto



src/main/proto/RegionClient.proto


what is processed?



src/main/proto/RegionClient.proto


Why do we need to add a region to the Get, even optionally?



src/main/proto/RegionClient.proto


Why is the Get polluted by multiGet stuff?



src/main/proto/RegionClient.proto


I thought we could set this into the Get above?  Why have it here as 
separate param?



src/main/proto/RegionClient.proto


Good stuff



src/main/proto/RegionClient.proto


This is a new feature?



src/main/proto/RegionClient.proto


A type rather than have the mutation type specified as a subclass?

A mutate is a delete or put?



src/main/proto/RegionClient.proto


Again, does it make sense having this in here?  I mean regions come and go 
-- e.g. split -- so you could have reference to non-existent region.  This 
stuff of tying a mutation to a particular region should be done external to the 
mutations themselves?

Is this trying to do multiaction?  Maybe that should be a message apart 
from mutate?  The new message would have region n

[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238968#comment-13238968
 ] 

Lars Hofhansl commented on HBASE-5547:
--

Was thinking that these files are up for grabs unless HBase is in "backup mode", 
maybe signaled by the existence of a specific zNode, although other models are 
possible as well.
The details are a bit tricky, of course. Do we need a full cleanup between 
backup runs so that we do not confuse the backup files? If not, do we tag with 
a backup number or with a timestamp (like the HLogs)?

If we do HBASE-50 we won't need this one methinks. This might get us to 
a workable solution more quickly.


> Don't delete HFiles when in "backup mode"
> -
>
> Key: HBASE-5547
> URL: https://issues.apache.org/jira/browse/HBASE-5547
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>
> This came up in a discussion I had with Stack.
> It would be nice if HBase could be notified that a backup is in progress (via 
> a znode for example) and in that case either:
> 1. rename HFiles to be deleted to .bck
> 2. rename the HFiles into a special directory
> 3. rename them to a general trash directory (which would not need to be tied 
> to backup mode).
> That way it should be able to get a consistent backup based on HFiles (HDFS 
> snapshots or hard links would be better options here, but we do not have 
> those).
> #1 makes cleanup a bit harder.





[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238963#comment-13238963
 ] 

Lars Hofhansl commented on HBASE-5623:
--

Good to know :)

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets its out field to null, we get the NPE. 
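
To illustrate the race described above, here is a minimal sketch of one way to avoid it: read the writer reference under the same lock that rollWriter() holds, so a flush can never append to a writer that has already been closed. This is not the committed patch. The class and method names (WalRollSketch, Writer, flush) are hypothetical stand-ins, and only the updateLock and rollWriter names come from the description.

```java
import java.util.concurrent.locks.ReentrantLock;

final class WalRollSketch {
    /** Hypothetical stand-in for a log writer whose close() nulls its output. */
    static final class Writer {
        private StringBuilder out = new StringBuilder();

        void append(String edit) {
            out.append(edit); // throws NullPointerException once closed
        }

        void close() {
            out = null;
        }

        boolean isClosed() {
            return out == null;
        }
    }

    private final ReentrantLock updateLock = new ReentrantLock();
    private Writer writer = new Writer();

    /** Swaps in a new writer and closes the old one while holding updateLock. */
    Writer rollWriter() {
        updateLock.lock();
        try {
            Writer old = writer;
            writer = new Writer();
            old.close();
            return old;
        } finally {
            updateLock.unlock();
        }
    }

    /** Appends under the same lock, so it always sees the current open writer. */
    void flush(String edit) {
        updateLock.lock();
        try {
            writer.append(edit);
        } finally {
            updateLock.unlock();
        }
    }
}
```

Holding the lock for the whole flush is the simplest way to show the idea, but it serializes flushes against rolls and may cost throughput, so a real fix could reasonably choose a narrower critical section.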


[jira] [Created] (HBASE-5639) The logic used in waiting for region servers during startup is broken

2012-03-26 Thread Jean-Daniel Cryans (Created) (JIRA)
The logic used in waiting for region servers during startup is broken
-

 Key: HBASE-5639
 URL: https://issues.apache.org/jira/browse/HBASE-5639
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Daniel Cryans
Assignee: nkeywal
Priority: Blocker
 Fix For: 0.94.0


See the tail of HBASE-4993, which I'll report here:

Me:
{quote}
I think a bug was introduced here. Here's the new waiting logic in 
waitForRegionServers:

the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
   there has been no new region server checking in for
  'hbase.master.wait.on.regionservers.interval' time

And the code that verifies that:

!(lastCountChange+interval > now && count >= minToStart)
{quote}

Nic:
{quote}
It seems that changing the code to

(count < minToStart ||
lastCountChange+interval > now)

would make the code work as documented.
If you have 0 region servers that checked in and you are under the interval, 
you wait: (true or true) = true.
If you have 0 region servers but you are above the interval, you wait: (true or 
false) = true.
If you have 1 or more region servers that checked in and you are under the 
interval, you wait: (false or true) = true.
{quote}
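
The two conditions quoted above can be compared side by side with a small sketch. It assumes both expressions are used as the "keep waiting" condition of the wait loop, which is how the reply reads them. The class and method names are hypothetical; only the variable names come from the quoted code.

```java
final class WaitForRegionServersSketch {
    /** Condition as originally quoted; assumed to mean "keep waiting". */
    static boolean keepWaitingOriginal(int count, int minToStart,
                                       long lastCountChange, long interval, long now) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    /** Condition proposed in the reply. */
    static boolean keepWaitingProposed(int count, int minToStart,
                                       long lastCountChange, long interval, long now) {
        return count < minToStart || lastCountChange + interval > now;
    }

    public static void main(String[] args) {
        int minToStart = 1;
        long interval = 500;
        long now = 1000;
        long[] lastChanges = {900, 0}; // 900: still inside the interval, 0: interval elapsed
        int[] counts = {0, 1};
        for (int count : counts) {
            for (long last : lastChanges) {
                System.out.printf("count=%d lastCountChange=%d original=%b proposed=%b%n",
                        count, last,
                        keepWaitingOriginal(count, minToStart, last, interval, now),
                        keepWaitingProposed(count, minToStart, last, interval, now));
            }
        }
    }
}
```

The case not listed in the reply is the one where enough region servers have checked in and the interval has elapsed: the proposed condition evaluates to (false or false) = false, so the wait ends, whereas the original expression evaluates to true there.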





[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"

2012-03-26 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238965#comment-13238965
 ] 

Jesse Yates commented on HBASE-5547:


Thoughts on how long we should keep around files? Indefinitely? The latter 
seems a bit excessive, especially if a 'backup mode' ensures you run every X 
minutes (and exports to another cluster, moves the files to another backup 
directory). 'Cleanup' implies you want to remove the files when no one cares 
about the hfiles anymore - thinking maybe a periodic chore on the rs?

With snapshots, I was expecting to add a file reference feature - essentially 
implementing hardlinks for files we care about keeping around. Was thinking we 
could add a CP hook and impl that would let you add a check (config based?) 
for whether you want to keep a reference around for the file being cleaned up. 
In the backup situation, you would have a timer (or maybe check for a backup 
completed file/meta row) and see if that time had elapsed or not; if not, 
you would add a reference, if so, do nothing and let the file get cleaned up.

> Don't delete HFiles when in "backup mode"
> -
>
> Key: HBASE-5547
> URL: https://issues.apache.org/jira/browse/HBASE-5547
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>
> This came up in a discussion I had with Stack.
> It would be nice if HBase could be notified that a backup is in progress (via 
> a znode for example) and in that case either:
> 1. rename HFiles to be deleted to .bck
> 2. rename the HFiles into a special directory
> 3. rename them to a general trash directory (which would not need to be tied 
> to backup mode).
> That way it should be able to get a consistent backup based on HFiles (HDFS 
> snapshots or hard links would be better options here, but we do not have 
> those).
> #1 makes cleanup a bit harder.





[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation

2012-03-26 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238962#comment-13238962
 ] 

Jean-Daniel Cryans commented on HBASE-4993:
---

It seems this would work better. Also, reviewing the 0.90/0.92 code, I think we 
should keep the new logic you introduced in this jira (with the fixed code). I 
opened HBASE-5639 and assigned it to you.

> Performance regression in minicluster creation
> --
>
> Key: HBASE-4993
> URL: https://issues.apache.org/jira/browse/HBASE-4993
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.94.0
>
> Attachments: 4993.patch, 4993.v3.patch
>
>
> Side effect of 4610: the mini cluster needs 4,5 seconds to start





[jira] [Updated] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-26 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5596:
--

   Resolution: Fixed
Fix Version/s: 0.96.0, 0.94.0, 0.92.2
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5596.patch, hbase-5596-0.94.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Commented] (HBASE-5596) Few minor bugs from HBASE-5209

2012-03-26 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238955#comment-13238955
 ] 

Jonathan Hsieh commented on HBASE-5596:
---

Test came back clean. Committed.  Thanks David and thanks for reviews Stack!

> Few minor bugs from HBASE-5209
> --
>
> Key: HBASE-5596
> URL: https://issues.apache.org/jira/browse/HBASE-5596
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.1, 0.94.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5596.patch, hbase-5596-0.94.patch
>
>
> A few leftover bugs from HBASE-5209.  Comments are documented here:
> https://reviews.apache.org/r/3892/





[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding

2012-03-26 Thread Matt Corgan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238948#comment-13238948
 ] 

Matt Corgan commented on HBASE-4676:


{quote}Do we not have this now Matt?{quote}
I don't believe so.  Right now, HFileReaderV2.blockSeek(..) does a brute force 
loop through each KV in the block and does a full KV comparison at each KV.  
Say there are 1000 KVs in the block and 20 per row: on average you will have to 
do 500 full KV comparisons just to get to your key, and even if you are 
selecting all 20 KVs in the row you still do 500 comparisons on average to get 
to the start of the row.  It's also very important to remember that these are 
long keys and they are most likely to share a common prefix since they got 
sorted to the same block, so your comparators are churning the same prefix 
bytes over and over.
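
The cost described above can be counted with a rough sketch. This is not HFileReaderV2 code: the class, the key layout, and the shared prefix string are all made up for illustration. It scans 1000 sorted keys front to back with a full byte-wise comparison at each key, and tallies both the number of key comparisons and the number of bytes touched.

```java
import java.nio.charset.StandardCharsets;

final class BlockSeekSketch {
    /** Running total of bytes examined by compare(); illustration only. */
    static long bytesCompared = 0;

    /** Full byte-wise comparison, counting every byte it looks at. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            bytesCompared++;
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Scans front to back and returns the number of key comparisons made. */
    static int linearSeek(byte[][] keys, byte[] target) {
        int comparisons = 0;
        for (byte[] key : keys) {
            comparisons++;
            if (compare(key, target) >= 0) {
                break;
            }
        }
        return comparisons;
    }

    /** Builds n sorted keys that all share one long, made-up prefix. */
    static byte[][] makeKeys(int n) {
        byte[][] keys = new byte[n][];
        for (int i = 0; i < n; i++) {
            keys[i] = String.format("com.example.some.long.shared.prefix/%04d", i)
                    .getBytes(StandardCharsets.UTF_8);
        }
        return keys;
    }

    /** Average comparisons when every key in the block is sought once. */
    static double averageComparisons(byte[][] keys) {
        long total = 0;
        for (byte[] key : keys) {
            total += linearSeek(keys, key);
        }
        return (double) total / keys.length;
    }

    public static void main(String[] args) {
        byte[][] keys = makeKeys(1000);
        System.out.println("average comparisons: " + averageComparisons(keys));
        System.out.println("bytes compared:      " + bytesCompared);
    }
}
```

Because every key starts with the same prefix, nearly all of the bytes counted are prefix bytes that get re-read on each comparison, which is the repeated work a structure with random access inside the block would avoid.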

{quote}The slow write and scan speeds would tend to rule it out for anything 
but a random read workload but when random reading only{quote}
Yeah, it's not really targeted at long scans on cold data, but there are some 
cases in between long cold scans and hot point queries.  Some factors: If it's 
warm data your block cache is now 6x bigger.  Then there are many scans that 
are really sub-block prefix scans to get all the keys in a row (the 20 of 1000 
cells i mentioned above).  If your scan will have a low hit ratio then the 
ability to jump between rows inside a block will lessen CPU usage.

It performs best when doing lots of random Gets on hot data (in block cache).  
Possibly a persistent memcached alternative, especially with SSDs.  I believe 
the current system is actually limited by CPU when doing high throughput Gets 
on data with long keys and short values.  The trie's individual request latency 
may not be much different, but a single server could serve many more parallel 
requests before maxing out cpu.

The bigger the value/key ratio of your data the smaller the difference between 
trie and anything else.  Seems like many people now have big values so I doubt 
they would see a difference.  I'm more trying to enable smarter schemas with 
compound primary keys.

{quote}Regards your working set read testing, did it all fit in memory?{quote}
yes, i'm trying to do the purest comparison of cpu usage possible.  leaving it 
up to others to extrapolate the results to what happens with the effectively 
bigger block cache, more rows/block fetched from disk for equivalent size 
block, etc.  i'm currently just standing up a StoreFile and using it directly.  
see 
https://github.com/hotpads/hbase-prefix-trie/blob/hcell-scanners/test/org/apache/hadoop/hbase/cell/pt/test/performance/seek/SeekBenchmarkMain.java.
  i'll try to factor in network and disk latencies in later tests (did some 
preliminary tests friday but am on vacation this week).

{quote}How did you measure cycles per seek?{quote}
simple assumption of 2 billion/second.  was just trying to emphasize how many 
cycles a seek currently takes

{quote}What is numBlocks? The total number of blocks the dataset fit in?{quote}
number of data blocks in the HFile i fed in

> Prefix Compression - Trie data block encoding
> -
>
> Key: HBASE-4676
> URL: https://issues.apache.org/jira/browse/HBASE-4676
> Project: HBase
>  Issue Type: New Feature
>  Components: io, performance, regionserver
>Affects Versions: 0.90.6
>Reporter: Matt Corgan
> Attachments: PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, 
> SeeksPerSec by blockSize.png
>
>
> The HBase data block format has room for 2 significant improvements for 
> applications that have high block cache hit ratios.  
> First, there is no prefix compression, and the current KeyValue format is 
> somewhat metadata heavy, so there can be tremendous memory bloat for many 
> common data layouts, specifically those with long keys and short values.
> Second, there is no random access to KeyValues inside data blocks.  This 
> means that every time you double the datablock size, average seek time (or 
> average cpu consumption) goes up by a factor of 2.  The standard 64KB block 
> size is ~10x slower for random seeks than a 4KB block size, but block sizes 
> as small as 4KB cause problems elsewhere.  Using block sizes of 256KB or 1MB 
> or more may be more efficient from a disk access and block-cache perspective 
> in many big-data applications, but doing so is infeasible from a random seek 
> perspective.
> The PrefixTrie block encoding format attempts to solve both of these 
> problems.  Some features:
> * trie format for row key encoding completely eliminates duplicate row keys 
> and encodes similar row keys into a standard trie structure which also saves 
> a lot of space
> * the column family is currently stored once at the beginning of each block.  

[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Enis Soztutar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238938#comment-13238938
 ] 

Enis Soztutar commented on HBASE-5623:
--

Thanks Lars for pushing this. Just as a note, I just tested the final version 
of the patch on a 4 node cluster with ycsb -threads 100. No problems.  

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter() holding the updateLock, but the other 
> threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can be 
> closed already by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets its

[jira] [Updated] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HBASE-5598:
---

Status: Patch Available  (was: Open)

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Attachments: HBASE-5598.patch
>
>
> There are many findbugs errors reported by HbaseQA. HBASE-5597 is going to 
> up the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs and remove invalid ones, the new bugs cannot be reported by QA.
> So, I propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Commented] (HBASE-4348) Add metrics for regions in transition

2012-03-26 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238923#comment-13238923
 ] 

jirapos...@reviews.apache.org commented on HBASE-4348:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4402/#review6204
---


Himanshu, please address the potential NPE issue.  I've added some suggestions 
to keep names consistent with HBase's conventions.

It would be really nice if you could do a test that would exercise some of the 
new code (test updates don't really seem to do it).  See TestRpcMetrics, or 
TestMetricsMBeanBase as possible models.  I won't block committing if this 
doesn't happen, but it would be nice. :)


src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon


let's rename this to be consistent with other Configuration fields.  Check 
out HConstants.java to see the names of quite a few configuration variables to 
get a general idea of the pattern.

My suggestion is something like:
'hbase.metrics.rit.threshold.time'



src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


Either:

* javadoc to say this must not be null and add 
'Preconditions.checkNotNull(masterMetrics, "master metrics should never be 
null")' on initialization

* add guards where masterMetrics is deref'ed to check for null.

Seems like with your tests, adding the 'if != null' guard to the 
masterMetrics derefs in the next comment is easier.
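
The two options above could look roughly like the following sketch. It uses java.util.Objects.requireNonNull from the JDK as a stand-in for a Guava-style precondition so that it stays self-contained. Metrics, StrictManager, and LenientManager are hypothetical names, not the real MasterMetrics or AssignmentManager classes.

```java
import java.util.Objects;

final class MetricsGuardSketch {
    /** Hypothetical stand-in for the master metrics object. */
    static final class Metrics {
        int ritCount;

        void updateRitCount(int n) {
            ritCount = n;
        }
    }

    /** Option 1: document non-null and fail fast on initialization. */
    static final class StrictManager {
        private final Metrics metrics;

        StrictManager(Metrics metrics) {
            this.metrics = Objects.requireNonNull(metrics,
                    "master metrics should never be null");
        }

        void report(int n) {
            metrics.updateRitCount(n);
        }
    }

    /** Option 2: allow null (for example in tests) and guard each dereference. */
    static final class LenientManager {
        private final Metrics metrics;

        LenientManager(Metrics metrics) {
            this.metrics = metrics;
        }

        void report(int n) {
            if (metrics != null) {
                metrics.updateRitCount(n);
            }
        }
    }
}
```

Option 1 surfaces a misconfiguration at construction time, while option 2 keeps existing tests that pass a null metrics object working at the cost of a check at every use.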



src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


ditto.  Actually, since it is used in a few places, you should probably 
add this to the HConstants file.



src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


master metrics could npe here.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java


nit: spitting? (kind of gross) maybe "emitting" (you use that word below) or 
"publishing"?



src/main/java/org/apache/hadoop/hbase/master/HMaster.java


nit: funny spacing



src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java


maybe rename to put rit in front so that it is consistent and will sort 
nicely?

maybe 'ritOldestAge'?


- jmhsieh


On 2012-03-20 21:44:10, Himanshu Vashishtha wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4402/
bq.  ---
bq.  
bq.  (Updated 2012-03-20 21:44:10)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This patch is for adding Region in transition metrics to the HMaster 
metrics system. It also adds these metrics in the master ui, in the Region in 
transition section. I have attached the proposed new format in the jira 4348.
bq.  
bq.  
bq.  This addresses bug HBase-4348.
bq.  https://issues.apache.org/jira/browse/HBase-4348
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon 0dc0691 
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ae468ca 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java c4b4d30 
bq.    src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 83abc52 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java d68ce33 
bq.  
bq.  Diff: https://reviews.apache.org/r/4402/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Ran on a 5 node cluster and killed region servers randomly to observe the changes in the RIT metrics as emitted by the Master's mxbean;
bq.  
bq.  mvn test passes without any failure.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Himanshu
bq.  
bq.



> Add metrics for regions in transition
> -
>
> Key: HBASE-4348
> URL: https://issues.apache.org/jira/browse/HBASE-4348
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Himanshu Vashishtha
>Priority: Minor
>  Labels: noob
> Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, 
> RITs.png, RegionInTransitions2.png, metrics-v2.patch
>

[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238924#comment-13238924
 ] 

Uma Maheswara Rao G commented on HBASE-5598:


As an initial start, I updated the patch with a filter file.

1) added filter file
2) excluded generated packages:
   org.apache.hadoop.hbase.thrift2.generated, 
   org.apache.hadoop.hbase.thrift.generated, 
   org.apache.hadoop.hbase.rest.protobuf.generated

This reduces the findbugs count from 772 to 601.

{quote}
[WARNING]
[INFO]
[INFO] 
[INFO] Building HBase 0.95-SNAPSHOT
[INFO] 
[INFO]
[INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase ---
[INFO] Fork Value is true
 [java] Warnings generated: 601
[INFO] Done FindBugs Analysis
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 3:19.270s
[INFO] Finished at: Tue Mar 27 03:56:20 IST 2012
[INFO] Final Memory: 13M/55M
[INFO] 
{quote}

note: currently we have already increased the count of findbugs in 
test-patch.properties. While verifying, I just reverted back to 0 for testing.


> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Attachments: HBASE-5598.patch
>
>
> There are many findbugs errors reported by HbaseQA. HBASE-5597 is going to 
> up the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs and remove invalid ones, the new bugs cannot be reported by QA.
> So, I propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Updated] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-26 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5606:
--

Attachment: 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch

Re-attaching Prakash's patch.

> SplitLogManger async delete node hangs log splitting when ZK connection is 
> lost 
> 
>
> Key: HBASE-5606
> URL: https://issues.apache.org/jira/browse/HBASE-5606
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.92.0
>Reporter: Gopinathan A
>Priority: Critical
> Fix For: 0.92.2
>
> Attachments: 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch, 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch
>
>
> 1. One RS died; the ServerShutdownHandler detected it and started 
> distributed log splitting;
> 2. All tasks failed because the ZK connection was lost, so all the tasks were 
> deleted asynchronously;
> 3. ServerShutdownHandler retried the log splitting;
> 4. The asynchronous deletion from step 2 finally happened, but for the new task;
> 5. This left the SplitLogManager in a hanging state.
> As a result, the .META. region is not assigned for a long time.
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(55413,79):2012-03-14 
> 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89303,79):2012-03-14 
> 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(80417,99):2012-03-14 
> 19:34:31,196 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89456,99):2012-03-14 
> 19:34:32,497 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
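
The sequence in steps 1-5 can be replayed with a small, deterministic model. This is only a sketch of the hazard, not SplitLogManager code: the dictionary stands in for ZooKeeper, the queue stands in for pending asynchronous deletes, and the path `/hbase/splitlog/wal-1` and all function names are hypothetical. The point it illustrates is that a queued delete which carries only the znode path cannot tell the retried task from the failed one.

```python
from collections import deque


def replay_race():
    """Replay steps 2-4 with an in-memory stand-in for ZooKeeper."""
    znodes = {}        # path -> label of the task attempt that owns it
    pending = deque()  # asynchronous deletes, queued by path only

    def put_task(path, attempt):
        znodes[path] = attempt

    def delete_async(path):
        pending.append(path)  # executed later, like an async ZK call

    def run_pending():
        while pending:
            znodes.pop(pending.popleft(), None)

    path = "/hbase/splitlog/wal-1"  # hypothetical znode name
    put_task(path, "attempt-1")
    delete_async(path)              # step 2: task failed, delete is queued
    put_task(path, "attempt-2")     # step 3: retry reuses the same path
    run_pending()                   # step 4: stale delete finally runs
    return znodes


# After the replay the retried task's znode is gone, so a manager waiting
# on it would never see it complete.
remaining = replay_race()
```

One possible way to make the stale delete harmless, stated here as an assumption rather than as what the attached patch does, is to tag each delete with the attempt it belongs to and skip it when the znode has since been recreated.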





[jira] [Updated] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HBASE-5598:
---

Attachment: HBASE-5598.patch

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
> Attachments: HBASE-5598.patch
>
>
> HbaseQA reports many findbugs errors. HBASE-5597 is going to raise the OK 
> count.
> This may cause other problems when we refactor the code: if we introduce new 
> valid bugs while removing invalid ones, QA may not report the new ones.
> So I propose adding an exclude filter file for findbugs (for the invalid 
> bugs). Any valid ones we find can be fixed under this JIRA.





[jira] [Updated] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-26 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-2600:
---

Attachment: hbase-2600-root.dir.tgz

generate-hbase-2600-root-in-tmp.sh was used to generate this tarball.

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, HBASE-2600+5217-Sun-Mar-25-2012-v3.patch, 
> HBASE-2600+5217-Sun-Mar-25-2012-v4.patch, hbase-2600-root.dir.tgz, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to now, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.
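
The difference between the two lookups can be illustrated with sorted lists standing in for the meta table. This is a sketch under assumptions, not HBase code: keys are plain strings without the tablename and randomid parts, the open end row of the last region is modeled with a hypothetical sentinel `LAST`, offlined parents are ignored, and `bisect` stands in for positioning a scanner.

```python
import bisect

LAST = "\uffff"  # hypothetical sentinel for the last region's open end row

# Three regions covering [start, end): ["", "g"), ["g", "p"), ["p", LAST)
start_keys = ["", "g", "p"]
end_keys = ["g", "p", LAST]


def find_by_end_row(row):
    """Forward lookup: the first entry whose end row is greater than `row`.

    With end-row keys this is where a scanner opened at `row` would land,
    so no backward walk is needed.
    """
    return bisect.bisect_right(end_keys, row)


def find_by_start_row(row):
    """Closest-row-before lookup: the last entry whose start row is <= `row`."""
    return bisect.bisect_right(start_keys, row) - 1
```

Both functions return the index of the same region for any row; the end-row variant simply reaches it by moving forward from the requested row, which is the access pattern the description calls natural to HBase.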





[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238910#comment-13238910
 ] 

Hudson commented on HBASE-5623:
---

Integrated in HBase-0.94 #57 (See 
[https://builds.apache.org/job/HBase-0.94/57/])
HBASE-5623 Race condition when rolling the HLog and hlogFlush (Enis 
Soztutar and LarsH) (Revision 1305549)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRollingNoCluster.java


> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one at HLog#rollWriter()
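
The kind of race described above, where one thread swaps and closes a writer while others still append to it, can be sketched generically. This is not the HLog implementation and not necessarily what the attached patches do; `Writer`, `Log`, and `roll_writer` are hypothetical names, and the approach shown (appends and the writer swap share one lock, and the old writer is closed only after the swap) is just one common way to avoid such a race.

```python
import threading


class Writer:
    """Toy writer whose buffer becomes None once it is closed."""

    def __init__(self):
        self.buf = []

    def append(self, entry):
        self.buf.append(entry)  # raises AttributeError after close()

    def close(self):
        self.buf = None


class Log:
    def __init__(self):
        self._lock = threading.Lock()
        self._writer = Writer()
        self.rolled = []  # entry counts of writers that were rolled out

    def append(self, entry):
        # Appends take the same lock as the swap, so an append can never
        # observe a writer that has already been closed.
        with self._lock:
            self._writer.append(entry)

    def roll_writer(self):
        new_writer = Writer()
        with self._lock:
            old, self._writer = self._writer, new_writer
        # No appender can reach `old` any more, so closing it is safe.
        self.rolled.append(len(old.buf))
        old.close()
```

The sketch holds the lock only for the pointer swap, so rolling does not block appenders while the old writer is being closed.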

[jira] [Commented] (HBASE-5626) Compactions simulator tool for proofing algorithms

2012-03-26 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238908#comment-13238908
 ] 

stack commented on HBASE-5626:
--

Nice. Let me take a looksee...

> Compactions simulator tool for proofing algorithms
> --
>
> Key: HBASE-5626
> URL: https://issues.apache.org/jira/browse/HBASE-5626
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Priority: Minor
>  Labels: noob
> Attachments: cf_compact.py
>
>
> A tool to run compaction simulations would be nice to have.  We could use 
> it to see how well an algo ran under different circumstances loaded w/ 
> different value types with different rates of flushes and splits, etc. 
> HBASE-2462 had one (see in patch).  Or we could try doing it using something 
> like this: http://en.wikipedia.org/wiki/Discrete_event_simulation





[jira] [Commented] (HBASE-5626) Compactions simulator tool for proofing algorithms

2012-03-26 Thread Nicolas Spiegelberg (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238907#comment-13238907
 ] 

Nicolas Spiegelberg commented on HBASE-5626:


A little more explanation.  

Basic Concept:
We wish to model the amount of compaction IO and file dispersion.  The unit of 
measurement for compactions is a flush.  This is because a flush is always 64MB 
(or whatever you configure) regardless of other properties about the CF/KV.  
Column families might trigger flushes at different intervals, but they usually 
flush a consistent amount of data.  You can understand the behavior of a 
compaction algorithm based upon how it behaves over X amount of flushes.  Does 
this test make a lot of assumptions and simplifications?  Yes!

Inputs:
1. ratio = compaction.ratio between files.  (same as the HBase config)
2. min.files = minimum count of files that must be selected for a compaction to 
occur (same as HBase config)
3. duplication = percentage of KVs within a file that are mutations and will be 
deduped on compaction (0 <= DUPLICATION <= 1)
4. iterations = number of flushes to simulate

Output:
1. The StoreFile dispersion after every flush (and, possibly, compaction 
triggered by that flush)
2. The average storefile count over the simulated flushes
3. The amount of IO consumed by compactions after those flushes.
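
The inputs and outputs listed above can be sketched as a small function. This is not the attached cf_compact.py; it is a minimal illustration under assumptions. File sizes are measured in flush units, `iterations` is assumed to be at least 1, and the selection rule (skip older files while a file is larger than `ratio` times the combined size of the newer files) is a simplified stand-in for the real policy. The names `select_files` and `simulate` are hypothetical.

```python
def select_files(files, ratio, min_files):
    """Return the run of newest files to compact, or [] if too few qualify.

    `files` holds sizes ordered oldest to newest, in flush units.
    """
    start = 0
    while start < len(files) and files[start] > ratio * sum(files[start + 1:]):
        start += 1
    candidates = files[start:]
    return candidates if len(candidates) >= min_files else []


def simulate(ratio, min_files, duplication, iterations):
    """Simulate `iterations` flushes; return (history, average_count, total_io)."""
    files = []      # current store file sizes, oldest first
    history = []    # store file dispersion after every flush
    total_io = 0.0  # flush units read by compactions
    for _ in range(iterations):
        files.append(1.0)  # each flush adds one unit of data
        chosen = select_files(files, ratio, min_files)
        if chosen:
            read = sum(chosen)
            total_io += read
            # A fraction `duplication` of the KVs is deduped away.
            merged = read * (1.0 - duplication)
            files = files[:len(files) - len(chosen)] + [merged]
        history.append(list(files))
    average_count = sum(len(h) for h in history) / iterations
    return history, average_count, total_io
```

A call such as `simulate(ratio=1.2, min_files=3, duplication=0.1, iterations=100)` returns the three outputs described above, which makes it easy to compare parameter settings side by side.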

> Compactions simulator tool for proofing algorithms
> --
>
> Key: HBASE-5626
> URL: https://issues.apache.org/jira/browse/HBASE-5626
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Priority: Minor
>  Labels: noob
> Attachments: cf_compact.py
>
>
> A tool to run compaction simulations would be nice to have.  We could use 
> it to see how well an algo ran under different circumstances loaded w/ 
> different value types with different rates of flushes and splits, etc. 
> HBASE-2462 had one (see in patch).  Or we could try doing it using something 
> like this: http://en.wikipedia.org/wiki/Discrete_event_simulation





[jira] [Updated] (HBASE-5626) Compactions simulator tool for proofing algorithms

2012-03-26 Thread Nicolas Spiegelberg (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg updated HBASE-5626:
---

Attachment: cf_compact.py

Attached the current python script that I use to emulate compactions given 
different params.

> Compactions simulator tool for proofing algorithms
> --
>
> Key: HBASE-5626
> URL: https://issues.apache.org/jira/browse/HBASE-5626
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Priority: Minor
>  Labels: noob
> Attachments: cf_compact.py
>
>
> A tool to run compaction simulations would be nice to have.  We could use 
> it to see how well an algo ran under different circumstances loaded w/ 
> different value types with different rates of flushes and splits, etc. 
> HBASE-2462 had one (see in patch).  Or we could try doing it using something 
> like this: http://en.wikipedia.org/wiki/Discrete_event_simulation





[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

2012-03-26 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238891#comment-13238891
 ] 

Ted Yu commented on HBASE-2600:
---

@Alex:
Can you attach hbase-2600-root.dir.tgz to this JIRA ?
Please briefly describe how you generated the tar ball.

Thanks

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> 
>
> Key: HBASE-2600
> URL: https://issues.apache.org/jira/browse/HBASE-2600
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Alex Newman
> Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, HBASE-2600+5217-Sun-Mar-25-2012-v3.patch, 
> HBASE-2600+5217-Sun-Mar-25-2012-v4.patch, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to now, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaay 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.





[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs

2012-03-26 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238893#comment-13238893
 ] 

jirapos...@reviews.apache.org commented on HBASE-5451:
--



bq.  On 2012-03-24 07:38:03, Benoit Sigoure wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java, line 102
bq.  >
bq.  > Argh, no, don't change this!  I got other HBase devs to promise to 
bq.  > not change this as it makes backwards compatible clients impossibly complicated.

I see. This was the basis of the "graceful" failure for current clients that 
are not aware of PB (clients would bail out if the versions of RPC don't match, 
right?). The response to your comment below, "I don't see how this is graceful.", 
is actually this change in the version.


bq.  On 2012-03-24 07:38:03, Benoit Sigoure wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto, line 34
bq.  >
bq.  > Why keep this oddity of Hadoop RPC?  Either rely on TCP keepalive, 
bq.  > or add a Ping method to the RPC interface.

Note that this is just documentation. Ping is already done in hbase RPC, and I 
thought I'd document it. I haven't done anything in the PB stuff for handling 
this. I agree with you this is odd/special-case and IMO a topic for a separate 
jira.


bq.  On 2012-03-24 07:38:03, Benoit Sigoure wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto, line 72
bq.  >
bq.  > Why is this optional?

General comment on the optional vs required PB fields... I have made most of 
the fields optional since it keeps the specification flexible and makes 
compatibility very easy. Once we are somewhat certain of the PB fields in the 
RPC, we can finalize the optional/required labeling of the fields. Does 
this make sense?


bq.  On 2012-03-24 07:38:03, Benoit Sigoure wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto, line 71
bq.  >
bq.  > What's the point of this message?  Why not just put the callId in 
bq.  > RpcRequestProto and be done with it?

The main reason is that I wanted to clearly separate what comes from the 
application and what's put in by the RPC layer. The client would frame a PB 
object (RpcRequestProto) and send it down to the RPC layer. Currently, the 
RpcRequestProto is mostly a placeholder with only one field called 'bytes'. 
Once I implement the ProtoBufRpcEngine (as in Hadoop core) in a follow-up jira, 
I'll have fields like 'methodname', 'protocolname', etc. and they would be 
encoded as RpcRequestProto objects.

Similarly, on the response side.


bq.  On 2012-03-24 07:38:03, Benoit Sigoure wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto, line 25
bq.  >
bq.  > I don't see how this is graceful.

I answered this above.


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4096/#review6302
---


On 2012-03-01 03:40:14, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4096/
bq.  ---
bq.  
bq.  (Updated 2012-03-01 03:40:14)
bq.  
bq.  
bq.  Review request for .
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Switch RPC call envelope/headers to PBs
bq.  
bq.  
bq.  This addresses bug HBASE-5451.
bq.  https://issues.apache.org/jira/browse/HBASE-5451
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/pom.xml 1294899 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1294899 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1294899 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 1294899 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java 1294899 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/proto/RPCMessageProto.proto PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/4096/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  
bq.  Thank

[jira] [Updated] (HBASE-5618) SplitLogManager - prevent unnecessary attempts to resubmits

2012-03-26 Thread Prakash Khemani (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Khemani updated HBASE-5618:
---

Attachment: 0001-HBASE-5618-SplitLogManager-prevent-unnecessary-attem.patch

Update the "heartbeat time" as soon as possible and as often as possible.

> SplitLogManager - prevent unnecessary attempts to resubmits
> ---
>
> Key: HBASE-5618
> URL: https://issues.apache.org/jira/browse/HBASE-5618
> Project: HBase
>  Issue Type: Improvement
>  Components: wal, zookeeper
>Reporter: Prakash Khemani
> Attachments: 
> 0001-HBASE-5618-SplitLogManager-prevent-unnecessary-attem.patch
>
>
> Currently, once a watch fires because the task node has been updated 
> (heartbeated) by the worker, the SplitLogManager still takes quite some time 
> before it updates the "last heard from" time. This is because the manager 
> currently schedules another getDataSetWatch() and only after that finishes 
> does it update the task's "last heard from" time.
> This leads to a large number of zk-BadVersion warnings when resubmission is 
> continuously attempted and fails.
> Two changes should be made:
> (1) On a resubmission failure because of BadVersion the task's lastUpdate 
> time should get upped.
> (2) The task's lastUpdate time should get upped as soon as the 
> nodeDataChanged() watch fires and without waiting for getDataSetWatch() to 
> complete.
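
The two proposed changes can be sketched with a toy manager. This is an illustration under assumptions, not the attached patch: `Manager`, `Task`, `maybe_resubmit`, and the `BadVersion` exception class are hypothetical stand-ins, the clock is injected so the behavior is easy to exercise, and `set_data` represents whatever versioned ZooKeeper write the real resubmission performs.

```python
import time


class BadVersion(Exception):
    """Stand-in for ZooKeeper's BadVersion error."""


class Task:
    def __init__(self, now):
        self.last_update = now  # the "last heard from" time


class Manager:
    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.tasks = {}

    def add(self, path):
        self.tasks[path] = Task(self.clock())

    def node_data_changed(self, path):
        # Change (2): record the heartbeat as soon as the watch fires,
        # without waiting for a follow-up read of the node to complete.
        self.tasks[path].last_update = self.clock()

    def maybe_resubmit(self, path, set_data):
        task = self.tasks[path]
        if self.clock() - task.last_update < self.timeout:
            return False  # heard from the worker recently enough
        try:
            set_data(path)
            return True
        except BadVersion:
            # Change (1): a BadVersion failure means the worker touched the
            # node, so treat it as a heartbeat and stop retrying for now.
            task.last_update = self.clock()
            return False
```

With both changes, a task that is actively heartbeating is less likely to be picked for resubmission again immediately, which is what would otherwise produce the repeated BadVersion warnings.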





[jira] [Updated] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-26 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5606:
--

Attachment: (was: 5606.txt)

> SplitLogManger async delete node hangs log splitting when ZK connection is 
> lost 
> 
>
> Key: HBASE-5606
> URL: https://issues.apache.org/jira/browse/HBASE-5606
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.92.0
>Reporter: Gopinathan A
>Priority: Critical
> Fix For: 0.92.2
>
> Attachments: 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch
>
>
> 1. One RS died; the ServerShutdownHandler detected it and started 
> distributed log splitting;
> 2. All tasks failed because the ZK connection was lost, so all the tasks were 
> deleted asynchronously;
> 3. ServerShutdownHandler retried the log splitting;
> 4. The asynchronous deletion from step 2 finally happened, but for the new task;
> 5. This left the SplitLogManager in a hanging state.
> As a result, the .META. region is not assigned for a long time.
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(55413,79):2012-03-14 
> 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89303,79):2012-03-14 
> 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(80417,99):2012-03-14 
> 19:34:31,196 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89456,99):2012-03-14 
> 19:34:32,497 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}





[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238861#comment-13238861
 ] 

Hadoop QA commented on HBASE-5606:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12520003/0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1310//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1310//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1310//console

This message is automatically generated.

> SplitLogManger async delete node hangs log splitting when ZK connection is 
> lost 
> 
>
> Key: HBASE-5606
> URL: https://issues.apache.org/jira/browse/HBASE-5606
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.92.0
>Reporter: Gopinathan A
>Priority: Critical
> Fix For: 0.92.2
>
> Attachments: 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch, 5606.txt
>
>
> 1. One RS died; the ServerShutdownHandler detected it and started 
> distributed log splitting.
> 2. All tasks failed because the ZK connection was lost, so all the tasks were 
> deleted asynchronously.
> 3. ServerShutdownHandler retried the log splitting.
> 4. The asynchronous deletion from step 2 finally ran, but against the new task.
> 5. This left the SplitLogManager in a hanging state.
> As a result, the .META. region is not assigned for a long time.
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(55413,79):2012-03-14 
> 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89303,79):2012-03-14 
> 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(80417,99):2012-03-14 
> 19:34:31,196 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89456,99):2012-03-14 
> 19:34:32,497 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5544) Add metrics to HRegion.processRow()

2012-03-26 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238856#comment-13238856
 ] 

Hadoop QA commented on HBASE-5544:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12519753/HBASE-5544.D2457.2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.util.TestHBaseFsck

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1311//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1311//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1311//console

This message is automatically generated.

> Add metrics to HRegion.processRow()
> ---
>
> Key: HBASE-5544
> URL: https://issues.apache.org/jira/browse/HBASE-5544
> Project: HBase
>  Issue Type: New Feature
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.96.0
>
> Attachments: HBASE-5544.D2457.1.patch, HBASE-5544.D2457.2.patch
>
>
> Add metrics of
> 1. time for waiting for the lock
> 2. processing time (scan time)
> 3. time spent while holding the lock
> 4. total call time
> 5. number of failures / calls
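
As an illustration of how the five metrics listed above relate to each other, here is a minimal Java sketch. It is not taken from the attached patches: the class RowCallTimerSketch, its field names, and the scan/commit split are all assumptions made for the example.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical timing wrapper; field names are illustrative, not HBase metric names. */
class RowCallTimerSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    long lockWaitNanos;   // 1. time spent waiting for the lock
    long scanNanos;       // 2. processing (scan) time
    long lockHeldNanos;   // 3. time spent while holding the lock
    long totalNanos;      // 4. total call time
    long calls;           // 5. number of calls ...
    long failures;        //    ... and number of failures

    /** Runs scan, then commit, under the row lock and updates all counters while the lock is held. */
    void timedCall(Runnable scan, Runnable commit) {
        long start = System.nanoTime();
        rowLock.lock();
        long acquired = System.nanoTime();
        try {
            calls++;
            lockWaitNanos += acquired - start;
            scan.run();
            scanNanos += System.nanoTime() - acquired;
            commit.run();
        } catch (RuntimeException e) {
            failures++;
            throw e;
        } finally {
            long end = System.nanoTime();
            lockHeldNanos += end - acquired;
            totalNanos += end - start;
            rowLock.unlock();
        }
    }
}
```

In this sketch the total call time is always at least the lock-held time, which in turn is at least the scan time, so a large gap between two adjacent metrics points at lock contention or at the commit phase respectively.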





[jira] [Commented] (HBASE-5626) Compactions simulator tool for proofing algorithms

2012-03-26 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238853#comment-13238853
 ] 

stack commented on HBASE-5626:
--

Where is the python simulation script?  Is it uploaded anywhere?  (Pardon me if 
I missed it)

Simulator needs to also factor in splitting.

> Compactions simulator tool for proofing algorithms
> --
>
> Key: HBASE-5626
> URL: https://issues.apache.org/jira/browse/HBASE-5626
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Priority: Minor
>  Labels: noob
>
> A tool to run compaction simulations would be a nice to have.   We could use 
> it to see how well an algo ran under different circumstances loaded w/ 
> different value types with different rates of flushes and splits, etc. 
> HBASE-2462 had one (see in patch).  Or we could try doing it using something 
> like this: http://en.wikipedia.org/wiki/Discrete_event_simulation





[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238849#comment-13238849
 ] 

Uma Maheswara Rao G commented on HBASE-5598:


Another case for the exclude filter: many findbugs warnings are reported 
against generated code (from thrift).

We can just add that package to the filter file.

For example:
{code}

 

  
 
{code}

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
>
> There are many findbugs errors reported by HbaseQA. HBASE-5597 is going to 
> up the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs while removing invalid ones, QA cannot report them.
> So, I would propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Commented] (HBASE-5626) Compactions simulator tool for proofing algorithms

2012-03-26 Thread Nicolas Spiegelberg (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238838#comment-13238838
 ] 

Nicolas Spiegelberg commented on HBASE-5626:


How is this different from the compaction simulation python script?  The unit 
of measurement should be a flush, since we flush after a certain memstore 
memory size, regardless of flow rate or KV length.

> Compactions simulator tool for proofing algorithms
> --
>
> Key: HBASE-5626
> URL: https://issues.apache.org/jira/browse/HBASE-5626
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Priority: Minor
>  Labels: noob
>
> A tool to run compaction simulations would be a nice to have.   We could use 
> it to see how well an algo ran under different circumstances loaded w/ 
> different value types with different rates of flushes and splits, etc. 
> HBASE-2462 had one (see in patch).  Or we could try doing it using something 
> like this: http://en.wikipedia.org/wiki/Discrete_event_simulation





[jira] [Updated] (HBASE-5619) Create PB protocols for HRegionInterface

2012-03-26 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5619:
---

Attachment: hbase-5619_v3.patch

> Create PB protocols for HRegionInterface
> 
>
> Key: HBASE-5619
> URL: https://issues.apache.org/jira/browse/HBASE-5619
> Project: HBase
>  Issue Type: Sub-task
>  Components: ipc, master, migration, regionserver
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: hbase-5619.patch, hbase-5619_v3.patch
>
>
> Subtask of HBase-5443, separate HRegionInterface into admin protocol and 
> client protocol, create the PB protocol buffer files





[jira] [Updated] (HBASE-5619) Create PB protocols for HRegionInterface

2012-03-26 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5619:
---

Status: Patch Available  (was: Open)

> Create PB protocols for HRegionInterface
> 
>
> Key: HBASE-5619
> URL: https://issues.apache.org/jira/browse/HBASE-5619
> Project: HBase
>  Issue Type: Sub-task
>  Components: ipc, master, migration, regionserver
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: hbase-5619.patch, hbase-5619_v3.patch
>
>
> Subtask of HBase-5443, separate HRegionInterface into admin protocol and 
> client protocol, create the PB protocol buffer files





[jira] [Updated] (HBASE-5619) Create PB protocols for HRegionInterface

2012-03-26 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5619:
---

Status: Open  (was: Patch Available)

> Create PB protocols for HRegionInterface
> 
>
> Key: HBASE-5619
> URL: https://issues.apache.org/jira/browse/HBASE-5619
> Project: HBase
>  Issue Type: Sub-task
>  Components: ipc, master, migration, regionserver
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: hbase-5619.patch, hbase-5619_v3.patch
>
>
> Subtask of HBase-5443, separate HRegionInterface into admin protocol and 
> client protocol, create the PB protocol buffer files





[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238820#comment-13238820
 ] 

Uma Maheswara Rao G commented on HBASE-5598:


Anyone who wants to generate the findbugs HTML report locally can follow the 
steps below.

1) Add the transformationSet below to pom.xml. There is already one 
transformationSet in pom.xml; this one can be placed right after it.


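<transformationSet>
  <dir>${basedir}/target/</dir>
  <includes>
    <include>findbugsXml.xml</include>
  </includes>
  <stylesheet>E:/SoftWares/findbugs-1.3.9/findbugs-1.3.9/src/xsl/default.xsl</stylesheet>
  <outputDir>${basedir}/target/</outputDir>
</transformationSet>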


2) Make sure to update the findbugs XSL path above so that it points to your 
local findbugs installation.

3) Run 'mvn findbugs:findbugs'.

4) Run 'mvn site'.

${basedir}/target/findbugsXml.xml will now be replaced with the HTML report. 
Rename it to ${basedir}/target/findbugsXml.html and open it.

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
>
> There are many findbugs errors reported by HbaseQA. HBASE-5597 is going to 
> up the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs while removing invalid ones, QA cannot report them.
> So, I would propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Commented] (HBASE-5533) Add more metrics to HBase

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238817#comment-13238817
 ] 

Lars Hofhansl commented on HBASE-5533:
--

Committed to 0.94 as well.

> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.2, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
> hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
> hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files
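
To make the 'histogram of recent latencies' idea above concrete, here is a minimal Java sketch. It is not the implementation from the attached patches: the class LatencyWindowSketch, the fixed-size ring buffer, and the nearest-rank percentile rule are assumptions chosen to keep the example short.

```java
import java.util.Arrays;

/** Hypothetical sliding window over the most recent latency samples. */
class LatencyWindowSketch {
    private final long[] samples;   // ring buffer of the latest latencies, in ms
    private int count;              // number of valid entries in the buffer
    private int next;               // index that the next sample overwrites

    LatencyWindowSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        samples = new long[capacity];
    }

    /** Records one latency, evicting the oldest sample once the window is full. */
    synchronized void record(long latencyMs) {
        samples[next] = latencyMs;
        next = (next + 1) % samples.length;
        if (count < samples.length) {
            count++;
        }
    }

    /** Nearest-rank percentile (0 < p <= 100) over the retained samples. */
    synchronized long percentile(double p) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * count / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

With such a window, percentile(90) and percentile(99) give statements of the form '90% of reads completed in under X ms, 99% in under Y ms'. Sorting on every query is acceptable for a sketch but would likely be too costly on a hot path.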





[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-26 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238811#comment-13238811
 ] 

Jimmy Xiang commented on HBASE-5606:


@Prakash, could there be other places where a failed delete can cause this issue?

Would it be a cleaner fix to change the async delete to a sync delete?  With a 
sync delete, we can avoid all of these race problems, and the retry will get a 
fresh start each time.
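
The race in question can be modelled in a few lines of Java. This is only a single-threaded sketch under stated assumptions: TaskTableSketch and StaleDeleteDemo are hypothetical stand-ins, not SplitLogManager code, and the generation-guarded delete is shown as one possible alternative to a sync delete, not as what the attached patch does.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the task bookkeeping, keyed by task path. */
class TaskTableSketch {
    private final ConcurrentMap<String, Long> tasks = new ConcurrentHashMap<>();
    private final AtomicLong nextGeneration = new AtomicLong();

    /** Registers a task for the path and returns its generation number. */
    long putUpTask(String path) {
        long generation = nextGeneration.incrementAndGet();
        tasks.put(path, generation);
        return generation;
    }

    /** Unconditional delete: what a late async delete effectively does. */
    void deleteByPath(String path) {
        tasks.remove(path);
    }

    /** Guarded delete: removes the entry only if it is still the generation the caller saw. */
    boolean deleteIfSameGeneration(String path, long generation) {
        return tasks.remove(path, generation);
    }

    boolean hasTask(String path) {
        return tasks.containsKey(path);
    }
}

class StaleDeleteDemo {
    /** Replays the reported sequence and returns true if the retried task survives. */
    static boolean retrySurvives(boolean guarded) {
        TaskTableSketch table = new TaskTableSketch();
        String path = "/hbase/splitlog/example-log";   // made-up path for the example
        long first = table.putUpTask(path);            // first splitting attempt
        // The attempt fails; its delete is queued but has not run yet.
        table.putUpTask(path);                         // the retry re-creates the same path
        if (guarded) {                                 // the queued delete finally runs
            table.deleteIfSameGeneration(path, first);
        } else {
            table.deleteByPath(path);
        }
        return table.hasTask(path);
    }
}
```

Here retrySurvives(false) reproduces the hang-inducing outcome (the new task is gone), while retrySurvives(true) keeps it. A sync delete avoids the problem differently, by guaranteeing that the delete completes before the retry puts up the new task.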

> SplitLogManger async delete node hangs log splitting when ZK connection is 
> lost 
> 
>
> Key: HBASE-5606
> URL: https://issues.apache.org/jira/browse/HBASE-5606
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.92.0
>Reporter: Gopinathan A
>Priority: Critical
> Fix For: 0.92.2
>
> Attachments: 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch, 5606.txt
>
>
> 1. One RS died; the ServerShutdownHandler detected it and started 
> distributed log splitting.
> 2. All tasks failed because the ZK connection was lost, so all the tasks were 
> deleted asynchronously.
> 3. ServerShutdownHandler retried the log splitting.
> 4. The asynchronous deletion from step 2 finally ran, but against the new task.
> 5. This left the SplitLogManager in a hanging state.
> As a result, the .META. region is not assigned for a long time.
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(55413,79):2012-03-14 
> 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89303,79):2012-03-14 
> 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(80417,99):2012-03-14 
> 19:34:31,196 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89456,99):2012-03-14 
> 19:34:32,497 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}





[jira] [Updated] (HBASE-5533) Add more metrics to HBase

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5533:
-

Fix Version/s: 0.94.0

Patch applies almost cleanly to 0.94. Will pull it in.

> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.2, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, TimingOverhead.java, 
> hbase-5533-0.92.patch, hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, 
> hbase5533-0.92-v5.patch, histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files





[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-03-26 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238807#comment-13238807
 ] 

Jean-Daniel Cryans commented on HBASE-3134:
---

+1 on the patch you put up on review board Teruyoshi, can you attach it here 
and grant the license so that I can commit? Thanks!

> [replication] Add the ability to enable/disable streams
> ---
>
> Key: HBASE-3134
> URL: https://issues.apache.org/jira/browse/HBASE-3134
> Project: HBase
>  Issue Type: New Feature
>  Components: replication
>Reporter: Jean-Daniel Cryans
>Assignee: Teruyoshi Zenmyo
>Priority: Minor
>  Labels: replication
> Fix For: 0.94.1
>
> Attachments: 3134-v2.txt, 3134-v3.txt, 3134.txt, HBASE-3134.patch, 
> HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch
>
>
> This jira was initially in the scope of HBASE-2201, but was pushed out since 
> it has low value compared to the required effort (and we want to ship 
> 0.90.0 rather soonish).
> We need to design a way to enable/disable replication streams in a 
> determinate fashion.




[jira] [Commented] (HBASE-5598) Analyse and fix the findbugs reporting by QA and add invalid bugs into findbugs-excludeFilter file

2012-03-26 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238804#comment-13238804
 ] 

Uma Maheswara Rao G commented on HBASE-5598:


I am not much in favor of adding static-analysis tool annotations directly 
into the code. 
In the Hadoop projects (HDFS, MapReduce) we use this exclude filter for 
invalid findbugs warnings, so I proposed it here.

I think we should first decide how we will exclude the invalid bugs; then we 
can start working on the bug fixes directly.

> Analyse and fix the findbugs reporting by QA and add invalid bugs into 
> findbugs-excludeFilter file
> --
>
> Key: HBASE-5598
> URL: https://issues.apache.org/jira/browse/HBASE-5598
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.1, 0.94.0, 0.96.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
>
> There are many findbugs errors reported by HbaseQA. HBASE-5597 is going to 
> up the OK count.
> This may lead to other issues when we refactor the code: if we introduce new 
> valid bugs while removing invalid ones, QA cannot report them.
> So, I would propose adding an exclude filter file for findbugs (for the 
> invalid bugs). If we find any valid ones, we can fix them under this JIRA.





[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238802#comment-13238802
 ] 

Jean-Daniel Cryans commented on HBASE-5623:
---

Funny, I just saw that NPE for the first time in my testing.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and 
> close the old one in HLog#rollWriter() while holding the updateLock, but the 
> other threads doing syncer() call
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against 
> concurrent appends and flush(), but not on the passed writer, which can 
> already have been closed by rollWriter(). In this case, since 
> SequenceFile#Writer.close() sets its out field to null, we get the NPE. 
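
The failure mode described above can be reproduced deterministically in a small Java model. This is a sketch under stated assumptions, not the committed fix: SketchWriter, SketchLog and RollRaceDemo are hypothetical names, the writer only mimics the 'close() nulls the stream' behaviour, and holding the lock during the append is just one way to close the window (the attached patches may do something different).

```java
/** Hypothetical writer whose close() nulls its stream, as described for SequenceFile.Writer. */
class SketchWriter {
    private StringBuilder out = new StringBuilder();

    void append(String edit) {
        out.append(edit);   // throws NullPointerException once close() has run
    }

    void close() {
        out = null;
    }
}

/** Hypothetical log that rolls its writer under a lock. */
class SketchLog {
    private final Object updateLock = new Object();
    private SketchWriter writer = new SketchWriter();

    /** Swaps in a fresh writer and closes the old one, all under updateLock. */
    void rollWriter() {
        synchronized (updateLock) {
            SketchWriter old = writer;
            writer = new SketchWriter();
            old.close();
        }
    }

    /** Returns the writer without taking the lock, as an unguarded caller would see it. */
    SketchWriter currentWriterUnsafe() {
        return writer;
    }

    /** Appends under the same lock, so the writer cannot be closed mid-append. */
    void syncGuarded(String edit) {
        synchronized (updateLock) {
            writer.append(edit);
        }
    }
}

class RollRaceDemo {
    /** Replays 'grab writer, roll, then flush' and reports whether the stale writer throws. */
    static boolean staleReferenceThrows() {
        SketchLog log = new SketchLog();
        SketchWriter stale = log.currentWriterUnsafe();   // reference taken before the roll
        log.rollWriter();                                 // closes the writer we still hold
        try {
            stale.append("edit");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

Taking the lock for every sync serializes syncs against rolls, which may cost throughput under many handlers; the sketch is meant to show the window, not to recommend this particular locking scheme.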


[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238800#comment-13238800
 ] 

Lars Hofhansl commented on HBASE-5623:
--

Thanks for the patch Enis! And thanks for the reviews Stack and Ted.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and
> close the old one in HLog#rollWriter() while holding the updateLock, but the
> other threads doing syncer() call
> {code}
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against
> concurrent appends and flush(), but not on the passed writer, which may
> already have been closed by rollWriter(). In this case, since
> SequenceFile#Writer.close() sets its out field to null, we get the NPE.
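
For illustration, the failure mode described above can be replayed in a toy model. This is only a sketch under assumptions: ToyLogWriter and StaleWriterDemo are hypothetical names, not HBase or Hadoop classes, and the interleaving is replayed on a single thread so that it is deterministic rather than timing dependent.

```java
/** Toy stand-in for a log writer whose close() nulls its buffer (hypothetical, not the Hadoop class). */
final class ToyLogWriter {
    private StringBuilder out = new StringBuilder();

    void append(String edit) {
        out.append(edit); // throws NullPointerException if close() already ran
    }

    void close() {
        out = null; // mirrors the behaviour the report describes for the writer's out field
    }
}

final class StaleWriterDemo {
    /** Replays the problematic interleaving; returns true if the late append hit an NPE. */
    static boolean flushAfterRollThrowsNpe() {
        ToyLogWriter current = new ToyLogWriter();
        ToyLogWriter seenBySyncer = current; // syncer picks up the writer without the lock
        current.close();                      // roller closes the old writer...
        current = new ToyLogWriter();         // ...and installs a fresh one
        try {
            seenBySyncer.append("edit");      // syncer still flushes to the stale reference
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE observed: " + flushAfterRollThrowsNpe());
    }
}
```

The point of the sketch is that the new writer being valid does not help: the flushing side already holds a reference to the closed one.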


[jira] [Updated] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

2012-03-26 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5623:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96.

> Race condition when rolling the HLog and hlogFlush
> --
>
> Key: HBASE-5623
> URL: https://issues.apache.org/jira/browse/HBASE-5623
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 0.94.0
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 
> 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, 
> HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch
>
>
> When doing a ycsb test with a large number of handlers 
> (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
>   at $Proxy1.multi(Unknown Source)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
>   at 
> org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
>   at 
> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
>   at 
> org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
>   at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and
> close the old one in HLog#rollWriter() while holding the updateLock, but the
> other threads doing syncer() call
> {code}
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against
> concurrent appends and flush(), but not on the passed writer, which may
> already have been closed by rollWriter(). In this case, since
> SequenceFile#Writer.close() sets its out field to null, we get the NPE.
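
One way to close a window like this is to have the roll and the flush take the same lock, so a flush can never see a writer that is halfway through being swapped. The sketch below shows that idea only; it is not the committed patch, and GuardedLog, writerLock and stress() are hypothetical names introduced for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class GuardedLog {
    /** Toy writer: close() drops the buffer, so a later append would throw an NPE. */
    static final class Writer {
        private StringBuilder out = new StringBuilder();
        void append(String edit) { out.append(edit); }
        void close() { out = null; }
    }

    private final Object writerLock = new Object(); // assumed lock shared by roll() and flush()
    private Writer writer = new Writer();

    /** Closes the old writer and installs a new one while holding the lock. */
    void roll() {
        synchronized (writerLock) {
            writer.close();
            writer = new Writer();
        }
    }

    /** Reads the writer field and appends under the same lock, so it never sees a closed writer. */
    void flush(String edit) {
        synchronized (writerLock) {
            writer.append(edit);
        }
    }

    /** Runs one roller against several flushers and returns how many flushes hit an NPE. */
    static int stress(int flushers, int iterations) {
        GuardedLog log = new GuardedLog();
        AtomicInteger failures = new AtomicInteger();
        Thread[] threads = new Thread[flushers + 1];
        for (int i = 0; i < flushers; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    try {
                        log.flush("edit");
                    } catch (NullPointerException e) {
                        failures.incrementAndGet();
                    }
                }
            });
        }
        threads[flushers] = new Thread(() -> {
            for (int n = 0; n < iterations; n++) {
                log.roll();
            }
        });
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return failures.get();
    }

    public static void main(String[] args) {
        System.out.println("failed flushes: " + stress(4, 10_000));
    }
}
```

The trade-off is that a coarse lock makes every flush wait while a roll is in progress, which may matter with many handlers; the real fix may well synchronize differently.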


[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-26 Thread Prakash Khemani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238794#comment-13238794
 ] 

Prakash Khemani commented on HBASE-5606:


@Jimmy This is similar to HBASE-5081 w.r.t. what goes wrong: a pending delete
wreaks havoc on the next create. It differs from HBASE-5081 in where the
pending delete is created: in the timeoutMonitor, not at the point where the
task actually fails ...

> SplitLogManger async delete node hangs log splitting when ZK connection is 
> lost 
> 
>
> Key: HBASE-5606
> URL: https://issues.apache.org/jira/browse/HBASE-5606
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 0.92.0
> Reporter: Gopinathan A
> Priority: Critical
> Fix For: 0.92.2
>
> Attachments: 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch, 5606.txt
>
>
> 1. One RS died; the ServerShutdownHandler detected it and started
> distributed log splitting.
> 2. All tasks failed because the ZK connection was lost, so all the tasks
> were deleted asynchronously.
> 3. The ServerShutdownHandler retried the log splitting.
> 4. The asynchronous deletion from step 2 finally took effect, but on the new
> task.
> 5. This left the SplitLogManager in a hanging state.
> As a result, the .META. region was not assigned for a long time.
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(55413,79):2012-03-14 
> 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89303,79):2012-03-14 
> 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(80417,99):2012-03-14 
> 19:34:31,196 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89456,99):2012-03-14 
> 19:34:32,497 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
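
The sequence in the description can be modelled without ZooKeeper at all. The following is a rough sketch under assumptions: PendingDeleteRace is a hypothetical in-memory model in which nodes are identified by path alone and an asynchronous delete only takes effect when drain() is called, which stands in for the connection coming back. The path is a placeholder, not one of the znodes from the logs.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Toy model of path-keyed task nodes with deferred deletes (hypothetical, not the ZooKeeper API). */
final class PendingDeleteRace {
    private final Set<String> nodes = new HashSet<>();
    private final Queue<String> pendingDeletes = new ArrayDeque<>();

    void create(String path) {
        nodes.add(path);
    }

    /** Queues a delete by path; nothing is removed until drain() runs. */
    void deleteAsync(String path) {
        pendingDeletes.add(path);
    }

    /** Applies every queued delete, regardless of which task generation created the node. */
    void drain() {
        String path;
        while ((path = pendingDeletes.poll()) != null) {
            nodes.remove(path);
        }
    }

    boolean exists(String path) {
        return nodes.contains(path);
    }

    /** Replays steps 1 to 5 and reports whether the retried task node is still there. */
    static boolean retriedTaskSurvives() {
        PendingDeleteRace zk = new PendingDeleteRace();
        String task = "/splitlog/example-task"; // placeholder path
        zk.create(task);      // step 1: splitting starts, task node is put up
        zk.deleteAsync(task); // step 2: task fails, delete is issued but stays pending
        zk.create(task);      // step 3: retry puts the task up again under the same path
        zk.drain();           // step 4: the old delete finally lands
        return zk.exists(task); // step 5: false means the new task has vanished
    }

    public static void main(String[] args) {
        System.out.println("retried task survives: " + retriedTaskSurvives());
    }
}
```

Because the delete carries only a path, it cannot tell the retried task from the one it was meant for, and the manager is left waiting on a node that no longer exists.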





[jira] [Updated] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost

2012-03-26 Thread Prakash Khemani (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Khemani updated HBASE-5606:
---

Attachment: 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch

Skip error processing when the getDataSetWatch() call issued from the
SplitLogManager timeoutMonitor fails.
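
As a rough illustration of the policy (not the attached patch itself), the sketch below treats a failed read differently depending on who asked for it. ProbeFailurePolicy, Caller and onGetDataFailed are hypothetical names; the recorded list stands in for marking a task failed and queueing its delete.

```java
import java.util.ArrayList;
import java.util.List;

final class ProbeFailurePolicy {
    /** Who issued the read that failed (hypothetical distinction for this sketch). */
    enum Caller { TASK_PATH, TIMEOUT_MONITOR }

    private final List<String> deletesIssued = new ArrayList<>();

    /** Called when reading a task node fails; only the normal task path escalates. */
    void onGetDataFailed(String path, Caller caller) {
        if (caller == Caller.TIMEOUT_MONITOR) {
            return; // periodic probe: no error processing, a later tick can probe again
        }
        deletesIssued.add(path); // stands in for failing the task and queueing its delete
    }

    List<String> deletesIssued() {
        return deletesIssued;
    }
}
```

With this shape, a connection loss seen only by the periodic probe does not leave a pending delete behind for the next create to run into.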

> SplitLogManger async delete node hangs log splitting when ZK connection is 
> lost 
> 
>
> Key: HBASE-5606
> URL: https://issues.apache.org/jira/browse/HBASE-5606
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 0.92.0
> Reporter: Gopinathan A
> Priority: Critical
> Fix For: 0.92.2
>
> Attachments: 
> 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch, 5606.txt
>
>
> 1. One RS died; the ServerShutdownHandler detected it and started
> distributed log splitting.
> 2. All tasks failed because the ZK connection was lost, so all the tasks
> were deleted asynchronously.
> 3. The ServerShutdownHandler retried the log splitting.
> 4. The asynchronous deletion from step 2 finally took effect, but on the new
> task.
> 5. This left the SplitLogManager in a hanging state.
> As a result, the .META. region was not assigned for a long time.
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(55413,79):2012-03-14 
> 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89303,79):2012-03-14 
> 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up 
> splitlog task at znode 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}
> {noformat}
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(80417,99):2012-03-14 
> 19:34:31,196 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> hbase-root-master-HOST-192-168-47-204.log.2012-03-14"(89456,99):2012-03-14 
> 19:34:32,497 DEBUG 
> org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted 
> /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
> {noformat}




