[
https://issues.apache.org/jira/browse/HBASE-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083461#comment-13083461
]
[email protected] commented on HBASE-4192:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1463/#review1410
-----------------------------------------------------------
Great stuff! Mostly documentation stuff at this point but I think HLog will be
much clearer by the end of this.
src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
<https://reviews.apache.org/r/1463/#comment3221>
isn't there an HRegion.createHRegion that doesn't take the rpc server and
handles passing the null?
src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
<https://reviews.apache.org/r/1463/#comment3220>
make this proper javadoc
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3222>
i think now might be a good time to fill in javadoc for every single
variable in this class (let's use this as an opportunity to clean up HLog a bit)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3223>
We shouldn't hard-code anything like this. It should probably be marked
final and set in the constructor. You can always have a
DEFAULT_DELAYED_RESPONSE_MAX_SIZE or something.
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3224>
final?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3225>
is this the best description for what this does? my understanding is that
there is actually always going to be batching, and it's not that this is really
about making them any more aggressive (in all cases it loops around and
immediately syncs anything pending). this is really about... a non-blocking
append, freeing of the handler thread, and the sync being called and response
returned from a separate thread. (a mouthful but there must be some concise
way to describe this)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3226>
This is all static so there might be issues running multiple HRegionServers
in a single JVM during unit tests?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3227>
since you've kept the old constructors, why did we need to update anything
elsewhere passing in null?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3228>
what does this mean? we should also add some documentation to the default
xml and/or book about the new config params and this feature
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3229>
yeah this is fine, so just don't set it when you declare it and make it
final
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3230>
should this throw an exception? this seems like a bad configuration (user
error) and we should halt instead of continue.
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3232>
this this while() { yield() } the best pattern to use here? Any reason not
to use the typical wait/notify pattern?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3231>
looks like there are double spaces on each line wrap
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3234>
What's the contract of this method? When should this be called? Does it
only append or does it also sync? Is it blocking or non-blocking and what does
it do with the RWCC if i pass it in?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3233>
javadoc still needs to be updated
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3235>
there's a bunch of javadoc on this method. it's unclear when to use this
method vs. the other above it.
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3236>
should be info? and need to add rwcc
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3237>
You can get fancy with javadoc notation by referencing the
classes/interfaces: {@link Delayable} for example.
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3238>
final
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3239>
Any updates to the description of this thread?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3240>
why no longer final?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3241>
comment
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3244>
these comments can all be changed to be the form:
/** Comment about what this variable is */
(then it will be loaded up as javadoc in eclipse and such)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3242>
and is used when? when the server is shutting down?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3243>
what happens if you overrun max? (seems like this feature could use a doc
describing what it does, the trade-offs, and configuration parameters)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3245>
I don't think we want to just do the printStackTrace() here which will go
to stdout... You'll want to LOG.error("Something", e)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3246>
it doesn't always wait. this method deserves a full javadoc
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3247>
maybe maxDelayedResponses would be a better variable name
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3248>
double space
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3249>
convert this to javadoc and maybe add more description... it's not really
the case that "All RPC threads that had delayed Responses are now allowed"...
it's actually the case that it will send the responses for anything that was
delayed and is now synced (some stuff could be delayed but not yet synced, thus
not responded to)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3250>
blocking? non-blocking? this just queues and doesn't actually write to
HLog?
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3251>
When is this called
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3252>
same here. can you add some comments about when this is used
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3253>
javadoc
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3254>
need to update javadoc
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3255>
javadoc
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3257>
don't you want to request a log roll if logRollRequested == true ||
sizeExceeded ? (why do we need sizeExceeded at all? couldn't we set
logRollRequested=true?)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3258>
javadoc
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3259>
javadoc... let's be sure to clearly state the contract of these public
methods
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3261>
this method could also use some doc (i know you didn't add this but would
be very helpful)
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3260>
this looks like a change in behavior? previously the postWALWrite would
happen even if this method returned true... now if it returns true we leave
immediately
src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3262>
so looks like this option is not compatible with coprocessors? or does
this postWALWrite move elsewhere?
this looks like the preWALWrite will get called if batchEntries is true but
the postWALWrite never will
src/main/java/org/apache/hadoop/hbase/util/HMerge.java
<https://reviews.apache.org/r/1463/#comment3263>
necessary change? (here and remaining changes)
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestBatchEntriesLogRolling.java
<https://reviews.apache.org/r/1463/#comment3264>
2011
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestBatchEntriesLogRolling.java
<https://reviews.apache.org/r/1463/#comment3265>
there are some ways to run all the unit tests with a varying set of initial
parameters... look at TestCacheOnWrite on the new HFile v2 patch, it uses
@RunWith(Parameterized.class)
- Jonathan
On 2011-08-11 06:59:28, Vlad Dogaru wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1463/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-08-11 06:59:28)
bq.
bq.
bq. Review request for hbase.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. Changes:
bq.
bq. 1. Add hbase.region.wal.batchentries configuration parameter. If this is
bq. enabled, batch entries to the HLog in a queue.
bq. 2. Use delayed RPCs for sync requests when aggresive batching is enabled.
bq. This frees up RPC handler threads for the duration of the sync.
bq. 3. Pass the RPC server instance all the way to down to HLog. This is
needed
bq. to find out the current remote call, mark it as delayed, and finally
complete
bq. it when the sync is done.
bq. 4. Use the region read-write consistency control to avoid exposing to
bq. RegionScanners edits which have not yet been synced.
bq. 5. Change a few tests which directly create HRegions or HLogs. The
bq. rpcServers passed in are null, HLog falls back to classic RPCs when it has
no
bq. knowledge of the RPC server.
bq. 6. Add TestBatchEntriesLogRolling, which is identical to TestLogRolling,
bq. except that it uses aggressive batching. I'm not sure how to add tests
that
bq. verify the same functionality but don't duplicate code, suggestion are
bq. welcome.
bq.
bq. The new parameter is disabled by default.
bq.
bq.
bq. This addresses bug HBASE-4192.
bq. https://issues.apache.org/jira/browse/HBASE-4192
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 6fb1da7
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java a00b93d
bq. src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
2f86f04
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 83ff7b2
bq. src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
f22fb6e
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
8ec53d3
bq. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 887f736
bq. src/main/java/org/apache/hadoop/hbase/util/HMerge.java 9f0499e
bq. src/main/java/org/apache/hadoop/hbase/util/MetaUtils.java af8d734
bq. src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java bf8004b
bq. src/test/java/org/apache/hadoop/hbase/TestScanMultipleVersions.java
d7baba4
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
36816e8
bq.
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestBatchEntriesLogRolling.java
PRE-CREATION
bq.
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java
de28418
bq.
src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALActionsListener.java
dc43eb2
bq.
src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
381ac90
bq.
bq. Diff: https://reviews.apache.org/r/1463/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. All unit tests run with aggressive batching turned on and off.
bq.
bq.
bq. Thanks,
bq.
bq. Vlad
bq.
bq.
> Optimize HLog for Throughput Using Delayed RPCs
> -----------------------------------------------
>
> Key: HBASE-4192
> URL: https://issues.apache.org/jira/browse/HBASE-4192
> Project: HBase
> Issue Type: New Feature
> Components: wal
> Affects Versions: 0.92.0
> Reporter: Vlad Dogaru
> Priority: Minor
>
> Introduce a new HLog configuration parameter (batchEntries) for more
> aggressive batching of appends. If this is enabled, HLog appends are not
> written to the HLog writer immediately, but batched and written either
> periodically or when a sync is requested. Because sync times become larger,
> they use delayed RPCs to free up RPC handler threads.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira