[
https://issues.apache.org/jira/browse/HDFS-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775753#comment-13775753
]
Colin Patrick McCabe commented on HDFS-5191:
--------------------------------------------
re: test timeouts, I like to add them at the test level using
@Test(timeout=...), rather than looking at every part of the test.
re: hash table sizes. Any time you insert something into IdentityHashStore,
you use two array slots... one for the key, and another for the value. The
other 2x factor is because the hash table should stay half empty, to avoid the
risk of collisions. In other words, we have a load factor of 0.50. This is
especially important since every element takes up space (we don't use separate
chaining).
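
A rough sketch of the sizing arithmetic described above. The class and method names are hypothetical and this is not the code from the patch. It assumes the bucket count is rounded up to a power of two, and it ignores integer overflow for very large inputs.

```java
// Hypothetical helper, not the actual IdentityHashStore code.
final class IdentityStoreSizing {
    /** Length of the backing Object[] for the given number of expected entries. */
    static int arrayLength(int expectedEntries) {
        // First 2x: load factor of 0.50, so half of the buckets stay empty.
        int buckets = Math.max(1, expectedEntries) * 2;
        // Assumption: round up to a power of two so an index can be masked.
        int pow2 = Integer.highestOneBit(buckets);
        if (pow2 < buckets) {
            pow2 <<= 1;
        }
        // Second 2x: every bucket uses two array slots, key and value.
        return pow2 * 2;
    }

    public static void main(String[] args) {
        // 10 entries -> 20 buckets -> 32 after rounding -> 64 array slots.
        System.out.println(arrayLength(10));
    }
}
```

Before rounding, the array length is simply 4x the expected entry count; the power-of-two rounding can push it higher.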
After thinking about it, I don't think we need the extra +1 element. It seems
that System#identityHashCode provides a well-distributed enough hash that
reducing it modulo a power-of-two table size works well.
IdentityHashStore was necessary because ByteBuffer#hashCode and ByteBuffer#equals
were just very unsuitable for what I was trying to do. And there's no way to
parameterize HashTable to use different hash and equality functions. Another nice advantage of
IdentityHashStore is that it does zero allocations, unless you are growing the
hash table. This might seem like a trivial point, but given how frequently
read is called, it's nice to avoid generating garbage. We learned that when
dealing with the edit log code...
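
To illustrate the points above, here is a minimal sketch of an identity-keyed, open-addressing store. It is not the actual IdentityHashStore from the patch: the class name, method names, and the use of linear probing are assumptions made for illustration, and null keys are not supported. Keys and values share one array, and nothing is allocated on put or get unless the table grows.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not the Hadoop implementation.
final class IdentityStoreSketch<K, V> {
    private Object[] slots; // key at 2*i, value at 2*i + 1
    private int buckets;    // always a power of two
    private int size;

    IdentityStoreSketch(int bucketsPowerOfTwo) {
        buckets = bucketsPowerOfTwo;
        slots = new Object[2 * buckets];
    }

    // Linear probing: returns the bucket holding `key` (compared with ==),
    // or the first empty bucket on its probe path.
    private static int probe(Object[] arr, int nBuckets, Object key) {
        int i = System.identityHashCode(key) & (nBuckets - 1);
        while (arr[2 * i] != null && arr[2 * i] != key) {
            i = (i + 1) & (nBuckets - 1);
        }
        return i;
    }

    void put(K key, V value) {
        if ((size + 1) * 2 > buckets) {
            grow(); // keep the load factor at or below 0.50
        }
        int i = probe(slots, buckets, key);
        if (slots[2 * i] == null) {
            size++;
        }
        slots[2 * i] = key;
        slots[2 * i + 1] = value;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int i = probe(slots, buckets, key);
        return (V) slots[2 * i + 1]; // null if the key is absent
    }

    // The only place that allocates: double the table and reinsert.
    private void grow() {
        Object[] old = slots;
        int newBuckets = buckets * 2;
        Object[] fresh = new Object[2 * newBuckets];
        for (int j = 0; j < old.length; j += 2) {
            if (old[j] != null) {
                int i = probe(fresh, newBuckets, old[j]);
                fresh[2 * i] = old[j];
                fresh[2 * i + 1] = old[j + 1];
            }
        }
        slots = fresh;
        buckets = newBuckets;
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.allocate(8);
        ByteBuffer b = ByteBuffer.allocate(8);
        IdentityStoreSketch<ByteBuffer, String> store = new IdentityStoreSketch<>(4);
        store.put(a, "first");
        store.put(b, "second");
        // a.equals(b) is true, yet the two buffers stay distinct keys.
        System.out.println(a.equals(b) + " " + store.get(a) + " " + store.get(b));
    }
}
```

Two freshly allocated buffers of the same size compare equal under ByteBuffer#equals, so a content-based map would treat them as one key; comparing with == keeps them separate.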
> revisit zero-copy API in FSDataInputStream to make it more intuitive
> --------------------------------------------------------------------
>
> Key: HDFS-5191
> URL: https://issues.apache.org/jira/browse/HDFS-5191
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client, libhdfs
> Affects Versions: HDFS-4949
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-5191-caching.001.patch,
> HDFS-5191-caching.003.patch, HDFS-5191-caching.006.patch,
> HDFS-5191-caching.007.patch, HDFS-5191-caching.008.patch
>
>
> As per the discussion on HDFS-4953, we should revisit the zero-copy API to
> make it more intuitive for new users.