[jira] [Commented] (HDFS-13607) [Edit Tail Fast Path Pt 1] Enhance JournalNode with an in-memory cache of recent edit transactions

Erik Krogen (JIRA) Tue, 05 Jun 2018 12:10:56 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-13607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502335#comment-16502335
 ]


Erik Krogen commented on HDFS-13607:
------------------------------------

Thanks for the reviews [~vagarychen], [~shv], [~csun]!

[~vagarychen]:
I'm not sure what you mean by confusing log warn message. Maybe you are 
thinking that when we log {{capacity}} (doesn't change), we are logging 
{{totalSize}}, which would show an over-capacity value at that time? I am not 
sure what else would be confusing about this log as-is. That being said, I 
think from a code readability standpoint it makes sense if the large items are 
removed before the new item is added, so I refactored this.

[~csun]:
The condition you mentioned is actually triggered before the buffer with 
different version is added, not after. It is {{prevBuf}} (corresponding to 
{{prevTxn}}) which eventually gets added to the {{outputBuffers}}:
{code}
        if (prevBuf != null) { // True except for the first loop iteration
          outputBuffers.add(ByteBuffer.wrap(prevBuf));
          ...
        }
{code}
So since the loop terminates once {{prevTxn}} does not match the correct layout 
version, this {{prevBuf}} will never be added to {{outputBuffers}}. The test 
you mentioned actually verifies this; {{assertTxnCountAndContents}} checks that 
there are no extra transactions returned beyond what it is expected (via the 
{{assertArrayEquals}}, which will fail if there are extra bytes returned). If 
you believe otherwise, can you provide a unit test snippet demonstrating the 
issue? It will definitely be a problem if what you described is true.

[~shv]:
# I removed the use of ConcurrentHashMap in favor of a regular HashMap for the 
{{headerMap}} and also renamed it to {{headerCache}} to make its purpose more 
clear. I don't think we need to move to the {{LightWeight*}} objects given the 
relatively low number of entries each map is expected to have. The layout 
version and header maps should have only a few entries, so they are negligible. 
For {{dataMap}}, there is only one entry per batch of edits. If we assume edits 
are an average of 200 bytes (standard in our environment), and that batches 
contain at least a few hundred edits, this gives us entries which are tens of 
KB in size, so the overhead of the map is very small in comparison. I did, 
however, change both TreeMaps to be defined by the NavigableMap interface to 
allow for more pluggability in the future.
# I would actually consider {{startTxnId}} to be the opposite of 
{{highestTxnId}} (this pair is lowest/highest txn IDs currently in the cache). 
I agree that the presence of both "start" and "initial" is confused, so I have 
renamed {{startTxnId}} to {{lowestTxnId}}. I think this is more clear now: 
"lowest", "highest", "initial".
# Sure, SGTM. Updated.
# Good call, thanks. Updated.


Updating with v003 patch incorporating the comments above.

> [Edit Tail Fast Path Pt 1] Enhance JournalNode with an in-memory cache of 
> recent edit transactions
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13607
>                 URL: https://issues.apache.org/jira/browse/HDFS-13607
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, journal-node
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>            Priority: Major
>         Attachments: HDFS-13607-HDFS-12943.000.patch, 
> HDFS-13607-HDFS-12943.001.patch, HDFS-13607-HDFS-12943.002.patch
>
>
> See HDFS-13150 for full design.
> This JIRA is to add the in-memory cache of recent edit transactions on the 
> JournalNode. This JIRA does not include accesses to this cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (HDFS-13607) [Edit Tail Fast Path Pt 1] Enhance JournalNode with an in-memory cache of recent edit transactions

Reply via email to