[ 
https://issues.apache.org/jira/browse/HDFS-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073404#comment-14073404
 ] 

Daryn Sharp commented on HDFS-6709:
-----------------------------------

Questions/comments on the advantages:
* I thought RTTI is per class, not instance?  If yes, the savings are 
immaterial?
* Using misaligned access may cause processor incompatibility, hurt 
performance, and introduce atomicity and CAS problems.  Concurrent access to 
adjacent misaligned memory in the same cache line may be completely unsafe.
* No references, only primitives can be stored off-heap, so how do value types 
(non-boxed primitives, correct?) apply?  Wouldn't the instance managing the 
slab have methods that return the correct primitive?
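
To make the last bullet concrete, here is a minimal sketch of what I assume a 
slab manager would look like.  {{OffHeapSlab}}, the record layout, and the field 
names are hypothetical, not taken from the attached patch.  Each field offset 
is a multiple of the field's size relative to the start of the buffer, which 
avoids the misaligned accesses questioned above as long as the buffer itself 
starts on an 8-byte boundary.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical slab of fixed-size records held in a direct (off-heap) buffer.
// Assumed layout per 16-byte record:
//   offset  0: long blockId
//   offset  8: int  replication
//   offset 12: int  flags
final class OffHeapSlab {
    private static final int RECORD_SIZE = 16;
    private static final int BLOCK_ID_OFF = 0;
    private static final int REPLICATION_OFF = 8;
    private static final int FLAGS_OFF = 12;

    private final ByteBuffer buf;

    OffHeapSlab(int capacity) {
        buf = ByteBuffer.allocateDirect(capacity * RECORD_SIZE)
                        .order(ByteOrder.nativeOrder());
    }

    // Writes one record using absolute (index-based) puts.
    void put(int index, long blockId, int replication, int flags) {
        int base = index * RECORD_SIZE;
        buf.putLong(base + BLOCK_ID_OFF, blockId);
        buf.putInt(base + REPLICATION_OFF, replication);
        buf.putInt(base + FLAGS_OFF, flags);
    }

    // Accessors return unboxed primitives; no per-record object exists on the heap.
    long getBlockId(int index) {
        return buf.getLong(index * RECORD_SIZE + BLOCK_ID_OFF);
    }

    int getReplication(int index) {
        return buf.getInt(index * RECORD_SIZE + REPLICATION_OFF);
    }

    int getFlags(int index) {
        return buf.getInt(index * RECORD_SIZE + FLAGS_OFF);
    }
}
```

The sketch is not thread-safe; callers would still need their own locking 
around multi-field reads and writes.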

I think off-heap may be a win in some limited cases, but I'm struggling with 
how it will work in practice.  Here are some thoughts for clarification on the 
actual application of the technique:
# OO encapsulation and polymorphism are lost?
# We can't store references anymore so we're reduced to primitives?
# Let's say we used to have a class {{Foo}} with instance fields 
{{field1..field4}} of various types.  {{FooManager.get(id)}} returns a {{Foo}} 
instance.  But now an off-heap structure doesn't have any instantiated {{Foo}} 
entries; otherwise there is no GC benefit other than smaller instances to 
compact.
# Does {{FooManager}} instantiate new {{Foo}} instances every time 
{{FooManager.get(id)}} is called?  If yes, it generates a tremendous amount of 
garbage that defeats the GC benefit of going off heap.
# Does {{FooManager}} try to maintain a limited pool of mutable {{Foo}} objects 
for reuse (e.g. via a {{Foo#reinitialize(id, f1..f4)}})?  (I've tried this 
technique elsewhere with degraded performance, but maybe there's a good way to 
do it.)
# If no {{Foo}} entries are allowed:
## Does {{FooManager}} have methods for every data member that used to be 
encapsulated by {{Foo}}?  I.e., {{FooManager.getField$N(id)}}?  We'll probably 
have to make N calls within a critical section?
## Will APIs change from {{doSomething(Foo foo, String msg, boolean flag)}} to 
{{doSomething(Long fooId, int fooField1, long fooField2, boolean fooField3, 
long fooField4, String msg, boolean flag)}}?
## If we add another field, do we go back and update all the APIs again?
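
The three alternatives asked about above (a fresh {{Foo}} per call, a reusable 
mutable {{Foo}}, or per-field accessors only) can be put side by side in a 
rough sketch.  Everything here is an assumption for illustration: the {{int}} 
id used as a record index, the record layout, and the field types (borrowed 
from the {{doSomething}} signature above).  It is not how the patch does it.

```java
import java.nio.ByteBuffer;

// Plain holder standing in for the Foo discussed above; field types are assumed.
final class Foo {
    int id;
    int field1;
    long field2;
    boolean field3;
    long field4;

    // Overwrites this instance in place so a pooled Foo can be reused.
    Foo reinitialize(int id, int f1, long f2, boolean f3, long f4) {
        this.id = id;
        field1 = f1;
        field2 = f2;
        field3 = f3;
        field4 = f4;
        return this;
    }
}

final class FooManager {
    // Assumed layout: longs first so each offset is a multiple of the field size.
    private static final int FIELD2_OFF = 0;   // long
    private static final int FIELD4_OFF = 8;   // long
    private static final int FIELD1_OFF = 16;  // int
    private static final int FIELD3_OFF = 20;  // byte holding a boolean
    private static final int RECORD_SIZE = 24;

    private final ByteBuffer slab;

    FooManager(int capacity) {
        slab = ByteBuffer.allocateDirect(capacity * RECORD_SIZE);
    }

    void put(int id, int f1, long f2, boolean f3, long f4) {
        int base = id * RECORD_SIZE;
        slab.putLong(base + FIELD2_OFF, f2);
        slab.putLong(base + FIELD4_OFF, f4);
        slab.putInt(base + FIELD1_OFF, f1);
        slab.put(base + FIELD3_OFF, (byte) (f3 ? 1 : 0));
    }

    // Per-field accessors: no Foo is ever created, but callers make N calls.
    int getField1(int id) { return slab.getInt(id * RECORD_SIZE + FIELD1_OFF); }
    long getField2(int id) { return slab.getLong(id * RECORD_SIZE + FIELD2_OFF); }
    boolean getField3(int id) { return slab.get(id * RECORD_SIZE + FIELD3_OFF) != 0; }
    long getField4(int id) { return slab.getLong(id * RECORD_SIZE + FIELD4_OFF); }

    // Fresh instance per call: keeps the old API but creates short-lived garbage.
    Foo get(int id) {
        return get(id, new Foo());
    }

    // Reusable instance: the caller passes in a Foo to be overwritten.
    Foo get(int id, Foo reuse) {
        return reuse.reinitialize(id, getField1(id), getField2(id),
                                  getField3(id), getField4(id));
    }
}
```

Note that adding a {{field5}} touches the layout constants, {{put}}, 
{{reinitialize}}, and both {{get}} variants, which is the maintenance cost the 
last question is getting at.  The sketch also does no locking.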


> Implement off-heap data structures for NameNode and other HDFS memory 
> optimization
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-6709
>                 URL: https://issues.apache.org/jira/browse/HDFS-6709
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-6709.001.patch
>
>
> We should investigate implementing off-heap data structures for NameNode and 
> other HDFS memory optimization.  These data structures could reduce latency 
> by avoiding the long GC times that occur with large Java heaps.  We could 
> also avoid per-object memory overheads and control memory layout a little bit 
> better.  This also would allow us to use the JVM's "compressed oops" 
> optimization even with really large namespaces, if we could get the Java heap 
> below 32 GB for those cases.  This would provide another performance and 
> memory efficiency boost.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
