[
https://issues.apache.org/jira/browse/HDFS-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073404#comment-14073404
]
Daryn Sharp commented on HDFS-6709:
-----------------------------------
Questions/comments on the advantages:
* I thought RTTI is per class, not instance? If yes, the savings are
immaterial?
* Using misaligned access may cause processor incompatibility, hurt
performance, and introduce atomicity and CAS problems; concurrent access to
adjacent misaligned memory in the same cache line may be completely unsafe.
* No references can be stored off-heap, only primitives, so how do value types
(non-boxed primitives, correct?) apply? Wouldn't the instance managing the
slab have methods that return the correct primitive?
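To make the last question concrete, here is a minimal sketch of a slab manager
whose accessors return primitives directly. {{RecordSlab}}, its two fields and
its 16-byte layout are hypothetical names chosen for illustration, not taken
from the attached patch. Each field sits at an offset that is a multiple of its
own size relative to the start of the buffer, which is one way to stay clear of
the misaligned-access concerns above, at the cost of some padding.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical fixed-size record slab; names and layout are illustrative only. */
final class RecordSlab {
    // Assumed layout per record: long at offset 0, int at offset 8,
    // then 4 bytes of padding so every record starts on a 16-byte multiple.
    static final int RECORD_SIZE = 16;
    private static final int LONG_FIELD_OFFSET = 0;
    private static final int INT_FIELD_OFFSET = 8;

    private final ByteBuffer buf;

    RecordSlab(int capacity) {
        // Direct buffers live outside the Java heap and start zero-filled.
        buf = ByteBuffer.allocateDirect(capacity * RECORD_SIZE)
                        .order(ByteOrder.nativeOrder());
    }

    private static int base(int id) {
        return id * RECORD_SIZE;
    }

    void put(int id, long longField, int intField) {
        buf.putLong(base(id) + LONG_FIELD_OFFSET, longField);
        buf.putInt(base(id) + INT_FIELD_OFFSET, intField);
    }

    // The manager, not a stored object, hands back the right primitive type.
    long getLongField(int id) {
        return buf.getLong(base(id) + LONG_FIELD_OFFSET);
    }

    int getIntField(int id) {
        return buf.getInt(base(id) + INT_FIELD_OFFSET);
    }
}
```

The sketch is not thread-safe; a real structure would still need locking or
some other visibility guarantee around {{put}} and the getters.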
I think off-heap may be a win in some limited cases, but I'm struggling with
how it will work in practice. Here are some questions to clarify how the
technique would actually be applied:
# OO encapsulation and polymorphism are lost?
# We can't store references anymore so we're reduced to primitives?
# Let's say we used to have a class {{Foo}} with instance fields
{{field1..field4}} of various types. {{FooManager.get(id)}} returns a {{Foo}}
instance. But now an off-heap structure doesn't hold any instantiated {{Foo}}
entries; otherwise there is no GC benefit other than smaller instances to compact.
# Does {{FooManager}} instantiate new {{Foo}} instances every time
{{FooManager.get(id)}} is called? If yes, it generates a tremendous amount of
garbage that defeats the GC benefit of going off heap.
# Does {{FooManager}} try to maintain a limited pool of mutable {{Foo}} objects
for reuse (e.g. via a {{Foo#reinitialize(id, f1..f4)}})? (I've tried this
technique elsewhere with degraded performance, but maybe there's a good way to
do it.)
# If no {{Foo}} entries are allowed:
## Does {{FooManager}} have methods for every data member that used to be
encapsulated by {{Foo}}? I.e., {{FooManager.getField$N(id)}}? We'd have to
make N calls, probably within a critical section?
## Will APIs change from {{doSomething(Foo foo, String msg, boolean flag)}} to
{{doSomething(Long fooId, int fooField1, long fooField2, boolean fooField3,
long fooField4, String msg, boolean flag)}}?
## If we add another field, do we go back and update all the APIs again?
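The access patterns asked about in the list above can be sketched side by side.
This is only an illustration under assumptions: the field types, the 32-byte
record layout and the method {{load}} are invented here, and nothing below is
taken from the attached patch. Option A is one getter per field, option B
allocates a new {{Foo}} on every {{get(id)}}, and option C refills a
caller-supplied {{Foo}} via {{reinitialize}}.

```java
import java.nio.ByteBuffer;

/** Mutable value holder so a single instance can be reused (hypothetical). */
final class Foo {
    long id;
    int field1;
    long field2;
    boolean field3;
    long field4;

    Foo reinitialize(long id, int f1, long f2, boolean f3, long f4) {
        this.id = id;
        this.field1 = f1;
        this.field2 = f2;
        this.field3 = f3;
        this.field4 = f4;
        return this;
    }
}

/** Stores Foo records off-heap; the layout is an assumption for this sketch. */
final class FooManager {
    // Assumed offsets: int, long, one byte for the boolean, long; padded to 32 bytes.
    private static final int F1 = 0, F2 = 8, F3 = 16, F4 = 24, RECORD_SIZE = 32;

    private final ByteBuffer slab;

    FooManager(int capacity) {
        slab = ByteBuffer.allocateDirect(capacity * RECORD_SIZE);
    }

    void put(int id, int f1, long f2, boolean f3, long f4) {
        int base = id * RECORD_SIZE;
        slab.putInt(base + F1, f1);
        slab.putLong(base + F2, f2);
        slab.put(base + F3, (byte) (f3 ? 1 : 0));
        slab.putLong(base + F4, f4);
    }

    // Option A: one accessor per field; callers make one call per field they need.
    int getField1(int id) {
        return slab.getInt(id * RECORD_SIZE + F1);
    }

    long getField2(int id) {
        return slab.getLong(id * RECORD_SIZE + F2);
    }

    // Option B: a fresh Foo per lookup; simple API, but every call allocates.
    Foo get(int id) {
        return load(id, new Foo());
    }

    // Option C: fill a caller-supplied Foo so lookups allocate nothing.
    Foo load(int id, Foo reuse) {
        int base = id * RECORD_SIZE;
        return reuse.reinitialize(id,
                slab.getInt(base + F1),
                slab.getLong(base + F2),
                slab.get(base + F3) != 0,
                slab.getLong(base + F4));
    }
}
```

With option C the existing {{doSomething(Foo foo, ...)}} signatures could stay
as they are, but the reused {{Foo}} is only valid until the next {{load}} on
the same instance, so callers must not retain it or share it across threads.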
> Implement off-heap data structures for NameNode and other HDFS memory
> optimization
> ----------------------------------------------------------------------------------
>
> Key: HDFS-6709
> URL: https://issues.apache.org/jira/browse/HDFS-6709
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-6709.001.patch
>
>
> We should investigate implementing off-heap data structures for NameNode and
> other HDFS memory optimization. These data structures could reduce latency
> by avoiding the long GC times that occur with large Java heaps. We could
> also avoid per-object memory overheads and control memory layout a little bit
> better. This also would allow us to use the JVM's "compressed oops"
> optimization even with really large namespaces, if we could get the Java heap
> below 32 GB for those cases. This would provide another performance and
> memory efficiency boost.