Re: [eclipse-incubator-e4-dev] Interesting article on modelling with prototype-based leanings

Ed Merks Wed, 22 Oct 2008 05:10:32 -0700

Markus,

Comments below.


Markus Kohler wrote:

Hi,
Thanks for the comments. I fully agree with you. Regarding byte code: as soon as you generate code, of course you can get into trouble with code size.

Regardless of how you produce the code, lots of it can get you into trouble. Generators will definitely help you get into trouble more quickly! That's why my first advice in yesterday's blog was to consider generating nothing:


   
http://ed-merks.blogspot.com/2008/10/hand-written-and-generated-code-never.html

I was also suggesting that folks take byte code into consideration when deciding on their generation patterns.

I have seen applications where permsize would need to be set to 1 GByte because of code bloat caused by code generation.

The world is full of obscenely large models/schemas and our tools make it trivial to generate code far beyond what any human would ever consider reasonable to write by hand...
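
As a rough sketch of how one might check how much room loaded code takes in a running JVM, the standard java.lang.management beans report the number of loaded classes and the usage of the non-heap memory pools. The class name CodeFootprint is hypothetical, pool names differ between JVMs (PermGen on older Sun JVMs, Metaspace on newer ones), and none of this is specific to Eclipse or EMF.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Eclipse or EMF.
class CodeFootprint {

    // Number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        ClassLoadingMXBean classLoading = ManagementFactory.getClassLoadingMXBean();
        return classLoading.getLoadedClassCount();
    }

    // Bytes currently used in each non-heap pool, keyed by the JVM-specific pool name.
    static Map<String, Long> nonHeapPoolUsage() {
        Map<String, Long> usage = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue;
            }
            MemoryUsage current = pool.getUsage();
            if (current != null) { // getUsage() may return null for an invalid pool
                usage.put(pool.getName(), current.getUsed());
            }
        }
        return usage;
    }

    public static void main(String[] args) {
        System.out.println("Loaded classes: " + loadedClassCount());
        nonHeapPoolUsage().forEach(
                (name, bytes) -> System.out.println(name + ": " + bytes + " bytes"));
    }
}
```

Running this before and after loading a large generated model would give a first impression of the constant cost of the code, though a heap dump gives a far more detailed picture.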

I think it would be interesting to get an overview of how much memory the code needs in Eclipse.

Yeah, I wonder how big all the Galileo stuff would be if you installed every last thing...

Hmm, let's see whether I find some time to get this done this week. If someone already has a reasonably large heap dump of Eclipse that could be shared with me, let me know.

Regarding code generation, it's also not well known that at least the Sun JVM generates code at runtime for dynamic proxies. You will find a lot of $Proxy<some number> classes.

I suppose they all must do that because it does produce new java.lang.Class instances...
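
A minimal sketch of the effect, using only java.lang.reflect.Proxy from the JDK: creating a dynamic proxy yields an instance of a class that did not exist at compile time. The class ProxyClassDemo and the interface Greeter are hypothetical names used only for illustration; the package part of the generated class name varies between JVM versions.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxyClassDemo {

    // Hypothetical interface used only for this illustration.
    interface Greeter {
        String greet(String name);
    }

    // Creates a dynamic proxy and returns the runtime-generated class behind it.
    static Class<?> generatedProxyClass() {
        InvocationHandler handler = (proxy, method, args) -> "Hello, " + args[0];
        Greeter greeter = (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
        // getClass() is final in Object, so it is not routed through the handler.
        return greeter.getClass();
    }

    public static void main(String[] args) {
        Class<?> proxyClass = generatedProxyClass();
        System.out.println("Generated class: " + proxyClass.getName());
        System.out.println("Is proxy class: " + Proxy.isProxyClass(proxyClass));
    }
}
```

Each distinct combination of class loader and interface list gets its own generated class, which is where the many $Proxy entries in a heap dump come from.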



Regards,
Markus

On Wed, Oct 22, 2008 at 11:30 AM, Ed Merks <[EMAIL PROTECTED]> wrote:


    Guys,

    Comments below.


    Markus Kohler wrote:

    Hi Michael,
    Thanks for the info. Yes, there are ways to minimize the overhead,
    and IMHO in practice a naive implementation of this pattern just
    has too much overhead.

    Yes, hash maps are just about the worst case of memory footprint
    you can imagine, especially given that most implementations use
    instances of Map.Entry to cause bloat in addition to the large index.


    I know at least one real-world example where the memory usage of
    a software component using this pattern could be reduced by a
    factor of 10.

    The only potential upside of the naive pattern might be huge
    sparsely populated instances.  I.e., you have 1000 features but
    only two or three tend to be set on average.
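
To make the trade-off concrete, here is a small sketch of the naive pattern under discussion. MapBackedRecord is a hypothetical name, and the comments describe java.util.HashMap in general terms rather than measured numbers.

```java
import java.util.HashMap;
import java.util.Map;

// Naive "properties pattern": every instance carries its own HashMap.
// Hypothetical class for illustration only.
class MapBackedRecord {

    // Each populated property costs a Map.Entry object in addition to
    // the map's internal table, the key reference and the value itself.
    private final Map<String, Object> properties = new HashMap<>();

    Object get(String name) {
        return properties.get(name);
    }

    void set(String name, Object value) {
        properties.put(name, value);
    }

    // Only properties that were actually set take up entries, which is
    // the one case where the map can win: many possible properties,
    // very few populated.
    int populatedCount() {
        return properties.size();
    }

    public static void main(String[] args) {
        MapBackedRecord record = new MapBackedRecord();
        record.set("name", "Ada");
        System.out.println(record.get("name"));
        System.out.println(record.get("email")); // unset properties read as null
        System.out.println(record.populatedCount());
    }
}
```

A class with one field per property pays for every declared property on every instance, but pays nothing for entries, keys or a hash table, which is why it tends to be much smaller unless instances are very sparse.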

    People sometimes claim that memory is so cheap that these kinds of
    optimizations don't really matter.

    Sometimes I make the silly claim that Java doesn't scale because
    although my hardware has 4G I can't have a heap anywhere close to
    2G in size.  The cheap memory claim is just silly.

    I don't believe in this, just because if you use 10x more memory
    per user, your scalability will most likely be limited by the
    memory usage.  Which basically means you will need more machines
    to serve the same number of users, just because you didn't care
    that much about memory usage.

    It's just a stupid claim.


    We had a discussion here about "bloat" lately and my
    understanding is, that this topic is becoming more important
    because e4 will support a multi user environment (please correct
    me if I'm wrong).

    A lot of that talk was about bloat in the byte code and also about
    static data that can never be garbage collected, but instance size
    is quite important too.

    I've been prototyping techniques for significantly reducing the
    size of EObjectImpl.  Perhaps by as much as 50% or more...  In my
    opinion, every byte saved is a byte earned. :-P


    In such a multi-user environment the main concern is the amount
    of memory you need per user, because as you increase the number
    of users at some point in time the memory usage will be dominated
    by the objects that are needed per user.  Therefore, if we talk
    about bloat I think that duplicated code might not be the biggest
    problem, but rather duplicated data, especially data duplicated
    per user.

    I think they all add up.  Often people are surprised by the byte
    code as an issue because it's not an issue that scales, but rather
    is a constant.  I recall a case where folks changed their EMF
    generation feature delegation pattern from the normal one to the
    less time-efficient Reflective delegation pattern.  They also
    changed the GenPackages to use Initialize by Loading.  They had
    *huge* models that generally were used only during
    initialization.  The reduction in byte code resulted in a huge
    improvement in startup time and a huge reduction in "retained
    memory", while the performance loss for data access and the
    increased memory footprint of the instances had no negative
    impact.  This was an excellent example of the opposite of what you
    might expect and a great reminder that measurements speak louder
    than mental exercises and abstract thinking.

    IMHO the only approach that can avoid bloat is therefore to
    carefully design which data can be shared between users and which
    data needs to be there per user.  I think it would make sense to
    constantly monitor the memory usage using automatic tests.  The
    Eclipse Memory Analyzer could be used for this kind of memory
    usage test.
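
As a very crude stand-in for such an automated check (it does not use the Eclipse Memory Analyzer; it only reads java.lang.Runtime), one could estimate the heap retained per object and compare it against a budget in a test. MemoryBudgetCheck and its methods are hypothetical names, System.gc() is only a hint to the JVM, and the result is an approximation that varies between JVMs.

```java
import java.util.function.IntFunction;

// Hypothetical helper for illustration; a rough estimate, not a precise measurement.
class MemoryBudgetCheck {

    // Rough reading of the used heap after asking the JVM to collect garbage.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        System.gc();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    // Builds `count` objects, keeps them reachable, and divides the growth
    // in used heap by the number of objects.
    static long estimateBytesPerObject(IntFunction<Object> factory, int count) {
        Object[] keep = new Object[count];
        long before = usedHeapBytes();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.apply(i);
        }
        long after = usedHeapBytes();
        // Use the array after measuring so the objects stay reachable until here.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("factory returned null");
        }
        return (after - before) / count;
    }

    public static void main(String[] args) {
        long perObject = estimateBytesPerObject(i -> new byte[1024], 20000);
        System.out.println("Approx. bytes per object: " + perObject);
    }
}
```

A test could fail the build when the estimate for a per-user object graph exceeds an agreed budget; for anything beyond a smoke test, analysing a real heap dump is the more reliable route.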

    I so totally agree.  Measure, measure, measure again.  Measure
    everything.  And when it comes to performance measurement,
    remember that the observer often affects the observed and that,
    unfortunately, different JREs and different JIT implementations
    have a huge impact on performance; often more than the
    optimizations you might be trying to achieve with the changes
    you make.


    Regards,
    Markus



    On Wed, Oct 22, 2008 at 8:34 AM, Michael Scharf <[EMAIL PROTECTED]> wrote:

        Hi Markus,


        > I once did some calculations for a simple HashMap implementation
        > versus just using instance variables. See my old blog at
        > http://www.sdn.sap.com/irj/sdn/weblogs?blog=/pub/wlg/5163

        interesting post.

    Yes, I thought both posts were interesting.

        EMF is something in between.

    Almost like a panacea. :-P

        If you use
        generated classes (fixed properties), the overhead is 4
        additional object attributes.

    A little worse than that, but I'm working on it in my copious
    spare time.

        In the case of dynamic EMF you
        are much better off than with HashMaps,

    It's always much better than HashMaps, even for dynamic.  And the
    performance is better as well.

        because the attributes
        are stored in an array and the key (EStructuralFeature) has
        an index into that array (I am sure Ed can give some
        numbers here).

    I think Eric confirmed that an EObject.eGet(feature) is twice as
    fast as HashMap.get(key), and we even have
    InternalEObject.eGet(featureID) which is faster yet...

        So, with EMF you have the choice
        between dynamic and fixed properties and you can
        mix both approaches.....

    In the sense you're using here, the set of properties is fixed;
    it's just a question of whether individual fields are allocated
    per feature or an array of slots is allocated to hold all the
    features.
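
The two storage choices can be sketched as follows. This is an illustration of the idea only, not EMF's actual implementation: SketchType, SketchFeature, SlotObject and GeneratedPoint are hypothetical names and are far simpler than EClass, EStructuralFeature or EObjectImpl.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical feature descriptor: a name plus a fixed index into the slot array.
final class SketchFeature {
    final String name;
    final int index;

    SketchFeature(String name, int index) {
        this.name = name;
        this.index = index;
    }
}

// Hypothetical type descriptor: an ordered set of features.
// Features must be added before any instance is created.
final class SketchType {
    private final List<SketchFeature> features = new ArrayList<>();

    SketchFeature addFeature(String name) {
        SketchFeature feature = new SketchFeature(name, features.size());
        features.add(feature);
        return feature;
    }

    int featureCount() {
        return features.size();
    }
}

// Dynamic-style instance: one array of slots, addressed by feature index.
// A lookup is a plain array access; no hashing and no Map.Entry objects.
final class SlotObject {
    private final Object[] slots;

    SlotObject(SketchType type) {
        this.slots = new Object[type.featureCount()];
    }

    Object get(SketchFeature feature) {
        return slots[feature.index];
    }

    void set(SketchFeature feature, Object value) {
        slots[feature.index] = value;
    }
}

// Generated-style instance: one field per feature, no descriptor lookup at all.
final class GeneratedPoint {
    int x;
    int y;
}
```

In both variants the set of properties is fixed by the type; the difference is only whether the slots live in dedicated fields or in a shared array addressed through the feature's index.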


        Unfortunately EMF is not good at delegating non-existing
        properties to another instance.

    That's not quite true either. :-P

    EMF supports the same type of thing as XML Schema's wildcards.  So
    you can have a property just like <xsd:any>.  Other models
    (<schema>s) can then declare global elements and those global
    elements (properties of the document root of the corresponding
    EPackage) can be used as properties on the object with the
    wildcard property.

        Just two weeks ago I
        worked with a colleague on an extension of EMF that
        allows this (in fact it adds a kind of aspects (AOP) to
        EMF that allows interception of the set/get methods).

        > http://steve-yegge.blogspot.com/2008/10/universal-design-pattern.html


        Pretty interesting article but quite long -- I started reading
        but after 30 min I decided to "fast read" the rest...

    Yes, I'm not sure I agree with the overall outlook.  Often people
    see differences where I see commonalities.  For example, I see
    little significant difference between UML and XML Schema for the
    purpose of this article.  They're both modeling languages, each
    with a few features the other doesn't have, but modeling languages
    nevertheless.




        Michael


            Hi all,
            I agree that's an interesting post. But Steve IMHO
            doesn't point out that the main problem with this
            approach is that it can have a high memory overhead.
            I once did some calculations for a simple HashMap
            implementation versus just using instance variables. See
            my old blog at
            http://www.sdn.sap.com/irj/sdn/weblogs?blog=/pub/wlg/5163

            Regards,
            Markus

            On Mon, Oct 20, 2008 at 5:44 PM, Simon Kaegi <[EMAIL PROTECTED]> wrote:

http://steve-yegge.blogspot.com/2008/10/universal-design-pattern.html



               _______________________________________________
               eclipse-incubator-e4-dev mailing list
               https://dev.eclipse.org/mailman/listinfo/eclipse-incubator-e4-dev




            
