[
https://issues.apache.org/jira/browse/HBASE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631976#comment-13631976
]
Jean-Daniel Cryans commented on HBASE-8340:
-------------------------------------------
You are right [~sershe], this piece of code was written only with replication's
use case in mind when log compression is turned on. It doesn't support seeking
back either.
bq. If we assume that one next() is enough to be able to use reader.seek, as
the current code would seem to imply, then there's no need for the first seek
to call next() in a loop - it can call next once and then do reader.seek.
I'm not sure what you mean here. If we call next() just once, we only populate
the dictionary with whatever entries were needed to write the first WAL entry,
and we then miss all the other entries.
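
To illustrate the point, here is a minimal sketch of a toy dictionary-compressed
log. It is not HBase's actual WAL format or reader: ToyDictLog, rawSeek and
replayingSeek are hypothetical names, and positions are entry indexes rather
than byte offsets. It only shows why one next() call is not enough before
jumping ahead: later entries can reference dictionary slots that were defined
by the entries skipped over.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a dictionary-compressed log; NOT HBase's actual WAL format. */
final class ToyDictLog {
    /** Writer side: a value is stored literally the first time, then as "#index". */
    static List<String> compress(List<String> values) {
        Map<String, Integer> dict = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String v : values) {
            Integer idx = dict.get(v);
            if (idx == null) {
                dict.put(v, dict.size());
                out.add("=" + v);      // literal: also defines a new dictionary slot
            } else {
                out.add("#" + idx);    // back-reference into the dictionary
            }
        }
        return out;
    }

    /** Reader side: the dictionary only grows as entries are read in order. */
    static final class Reader {
        private final List<String> entries;
        private final List<String> dict = new ArrayList<>();
        private int position = 0;

        Reader(List<String> entries) { this.entries = entries; }

        /** Decodes the entry at the current position, or returns null at end of log. */
        String next() {
            if (position >= entries.size()) return null;
            String e = entries.get(position++);
            if (e.startsWith("=")) {
                String v = e.substring(1);
                dict.add(v);
                return v;
            }
            int idx = Integer.parseInt(e.substring(1));
            if (idx >= dict.size()) {
                throw new IllegalStateException("dictionary miss for index " + idx);
            }
            return dict.get(idx);
        }

        /** Unsafe: jumps without replaying the entries in between. */
        void rawSeek(int pos) { position = pos; }

        /** Forward-only seek: replays every entry up to pos so the dictionary is complete. */
        void replayingSeek(int pos) {
            while (position < pos && next() != null) { }
        }
    }
}
```

With the input rowA, rowB, rowA, rowB, a reader that calls next() once and then
rawSeek(3) fails on the following next(), because slot 1 was never read into
its dictionary. A reader that uses replayingSeek(3) decodes the same entry
correctly. Like the real code, this sketch does not support seeking backwards.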
> WAL compression handling of seeks seems to be either inefficient or incorrect
> -----------------------------------------------------------------------------
>
> Key: HBASE-8340
> URL: https://issues.apache.org/jira/browse/HBASE-8340
> Project: HBase
> Issue Type: Bug
> Reporter: Sergey Shelukhin
>
> In next(...):
> {code}
> if (compressionContext != null && emptyCompressionContext) {
>   emptyCompressionContext = false;
> }
> return ...
> {code}
>
> In seek()
> {code}
> if (compressionContext != null && emptyCompressionContext) {
>   while (next() != null) {
>     if (getPosition() == pos) {
>       emptyCompressionContext = false;
>       break;
>     }
>   }
> ...
> reader.seek(pos);
> {code}
> So, seek will seek the file directly if either any next, or any seek, has
> been called before.
> I am not sure what this code is for, but my best guess is that it is to
> populate the dictionary for compression.
> If it is so, it would seem that one next() call (or even one seek() call)
> would not be enough, and seek must always use next(), otherwise it is
> incorrect.
> If we assume that one next() is enough to be able to use reader.seek, as the
> current code would seem to imply, then there's no need for the first seek to
> call next() in a loop - it can call next once and then do reader.seek.
> Note: even if all of this happens to work because external callers create
> the object, do one seek before any next() calls, and never seek afterwards
> (the only bug-free pattern currently possible with both methods, if I'm not
> mistaken), the code still needs to be tightened to remove the bug potential.