Re: [Jruby-devel] Profiling work - memory

Charles O Nutter Sat, 08 Jul 2006 15:17:44 -0700

On 7/8/06, Thomas E Enebo <[EMAIL PROTECTED]> wrote:

  When I originally changed SourcePosition for RDT and jEdit
folks I had planned on leaving our impl at the single SourcePosition
per line (that is all Ruby cares about for a runtime).  After changing
SourcePosition I thought of some ways to get the memory usage much lower
for SourcePosition (I wrote about this some time ago but I believe the idea
was):
1. Make Node impl SourcePosition which saves one Object reference [4 bytes]
2. Make line number a derived thing which calls back on position
   to the LexerSource newline table [2 x 4 bytes].  Line lookups
   would go from O(1) to O(log n) in int accesses.
3. Consider splitting start and end line into bit operations and
   storing them into a single int [4 bytes].  This one is most
   dubious to me.  This obviously limits precision to less than int
   (Ruby uses int in C so this potentially would make JRuby have less
    precision -- what is the largest byte offset you have seen in
    Ruby code?)
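
To make item 2 concrete, here is a minimal sketch of deriving a line number from an offset by binary search over a sorted table of newline offsets. The names (NewlineTable, lineOf) are hypothetical and are not JRuby's actual LexerSource API, and the sketch indexes by char offset rather than byte offset to keep it short.

```java
import java.util.Arrays;

// Hypothetical sketch; not JRuby's LexerSource.
final class NewlineTable {
    // Offsets of every '\n' in the source, in ascending order.
    private final int[] newlineOffsets;

    NewlineTable(String source) {
        int[] tmp = new int[source.length()];
        int count = 0;
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) == '\n') {
                tmp[count++] = i;
            }
        }
        newlineOffsets = Arrays.copyOf(tmp, count);
    }

    // Zero-based line containing the offset, i.e. the number of
    // newlines strictly before it. O(log n) in the number of lines.
    int lineOf(int offset) {
        int idx = Arrays.binarySearch(newlineOffsets, offset);
        // Found: the offset is a newline, which belongs to the line it ends.
        // Not found: binarySearch returns -(insertionPoint) - 1.
        return idx >= 0 ? idx : -idx - 1;
    }

    public static void main(String[] args) {
        NewlineTable table = new NewlineTable("a = 1\nb = 2\nputs a + b\n");
        System.out.println(table.lineOf(0));  // 0
        System.out.println(table.lineOf(6));  // 1
        System.out.println(table.lineOf(12)); // 2
    }
}
```

With a table like this, a position only needs to keep its offsets; the start and end line fields are looked up on demand, so the cost moves from two stored ints per position to one shared int[] per source file.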

I have a patch for this third one I can toss out (it's at home). I wouldn't be concerned about line having a problem (>65k lines??) but I'm a little sketchy about offset (since I'm pretty sure I've seen some .rb files >65k bytes).
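
For anyone following along, the bit packing in #3 could look roughly like the sketch below. This is not the patch mentioned above; the class name PackedLines and the even 16/16 split are assumptions made for illustration, and the range check shows where the 65535 limit comes from.

```java
// Hypothetical sketch; not the actual JRuby patch.
final class PackedLines {
    // Largest value that fits in 16 unsigned bits.
    static final int MAX = 0xFFFF;

    // Stores startLine in the high 16 bits and endLine in the low 16 bits.
    static int pack(int startLine, int endLine) {
        if (startLine < 0 || startLine > MAX || endLine < 0 || endLine > MAX) {
            throw new IllegalArgumentException("line does not fit in 16 bits");
        }
        return (startLine << 16) | endLine;
    }

    // Unsigned shift so a start line above 32767 does not come back negative.
    static int startLine(int packed) {
        return packed >>> 16;
    }

    static int endLine(int packed) {
        return packed & MAX;
    }

    public static void main(String[] args) {
        int packed = pack(10, 12);
        System.out.println(startLine(packed)); // 10
        System.out.println(endLine(packed));   // 12
    }
}
```

The same trick applied to byte offsets would reject any file longer than 65535 bytes, which is the concern with offsets above.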

The Node < Position idea would be a good one, especially if we can reduce the size of positioning info. Also, I see that name is stored in SourcePosition for the name of the file. I assume this is a single object and not a copy for each, yes?

You'll have to explain #2 to me a bit more.

All this will have bearing on compilation as well (how to handle stack traces in compiled code) and on pre-parsing (pre-parsing more of the libraries may eat up more memory up front).

  The nice thing about this stuff is later on once RDT and friends
start embedding our interpreter (not just AST) then they still can
without needing to hack stuff.

  Also some of my newline consolidation was left out for 0.9.0 since it
meant updating all my sourceposition tests.  That should kill off some
extra nodes.

  The other thing we talked about is that we have CharSequence and
StringBuffer for every string right now.  I suspect only creating a
StringBuffer for duration of mutable operations may still yield close
performance while halving the memory cost (of course backing string with
byte also seems like a decent space winner).  That may have quite an impact
on char[] profile.  I wonder if anyone has a byte String and StringBuffer
impl?
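
  A rough sketch of the "buffer only while mutating" idea follows. The class LazyMutableString and its mutate method are hypothetical and are not how RubyString is written; the sketch uses StringBuilder and a lambda-friendly Consumer for brevity.

```java
import java.util.function.Consumer;

// Hypothetical sketch; not JRuby's RubyString.
final class LazyMutableString {
    // Only the immutable form is retained between mutations.
    private String value;

    LazyMutableString(String initial) {
        this.value = initial;
    }

    // Applies a batch of edits to a temporary buffer, then drops the buffer
    // so it does not stay live alongside the string.
    void mutate(Consumer<StringBuilder> edits) {
        StringBuilder buffer = new StringBuilder(value);
        edits.accept(buffer);
        value = buffer.toString();
    }

    @Override
    public String toString() {
        return value;
    }

    public static void main(String[] args) {
        LazyMutableString s = new LazyMutableString("foo");
        s.mutate(b -> b.append("bar").reverse());
        System.out.println(s); // raboof
    }
}
```

  The trade-off is a copy in and a copy out per batch of mutations, so whether performance stays close depends on how often strings are mutated in small steps.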

Yeah, I think we had these discussions offline, but for those out in the ether, consider this: Ruby uses strings for all IO...you read files into strings, read sockets into strings...strings strings strings. The problem with JRuby is that our string uses char[] internally because Java is all UTF-16. That means for all those byte operations and byte streams we're basically using double the memory necessary. Ouch.
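
To put numbers on the doubling, here is a small sketch comparing the payload size of the same text held as UTF-16 code units and as UTF-8 bytes. The class and method names are made up for illustration, and the figures count only the character data, not object or array headers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for comparing payload sizes; not part of JRuby.
final class StringFootprint {
    // Bytes needed when each char is stored as a 2-byte UTF-16 code unit.
    static int utf16Bytes(String s) {
        return s.length() * Character.BYTES;
    }

    // Bytes needed when the same text is encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String line = "puts 'hello'";
        System.out.println(utf16Bytes(line)); // 24
        System.out.println(utf8Bytes(line));  // 12
    }
}
```

The factor of two holds for ASCII-heavy data such as most Ruby source and protocol traffic; text dominated by characters that need three UTF-8 bytes would not see the same saving.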

I have been playing with, and emailing back and forth with Tim Bray about, his Ustr package, a string implementation that uses UTF-8 internally. It's slow when converting UTF-8 to UTF-16 and back, naturally, but for operations that remain UTF-8 it's actually just as fast as Java's UTF-16 string...and faster in some cases since it's a mutable null-terminated string. The idea I had was that Ustr could be used to back RubyString so that JRuby actually uses UTF-8 internally; it would still support full Unicode, like we want, but we wouldn't be so wasteful about bytes when doing IO. Tim asked me not to share the code until he works out licensing and makes it officially public/open, but he said he'd commit to improving it and building it out if we decided to use it. I think it's worth some study.

yyparse is pretty transient in the memory department, no? Once we are
done parsing, that memory gets reclaimed.

The profiling I did referred to retained memory; I can look at the profiling information more to see for sure.

  Just more fuel for the memory fire....

-Tom

On Sat, 08 Jul 2006, Charles O Nutter defenestrated me:
>
>    This is an early snapshot from an ongoing gem run, same as the
>    previous CPU profiling. At this point only 34MB of objects were live.
>    This is with my ObjectSpace impl in place. I will run another profile
>    without it to see how it changes.
>    By method...
>    - ObjectSpace is a BIG winner in the memory department. At the time I
>    pulled off a snapshot, ObjectSpace.add was responsible for allocating
>    24% of all memory in use.
>    - Second place goes to SourcePosition.getPosition, which is
>    responsible for 6% of memory usage.
>    - DefaultRubyParser.yyparse is third with 4%. There are a number of
>    other parse-related methods within and without this chain of 1-2%
>    each. Parsing and building the AST eats up a substantial chunk of
>    memory which would be partially eliminated by pre-parsing and
>    completely eliminated by compilation.
>    By type...
>    - char[] accounted for 32% of memory in use. Consider that all strings
>    in Ruby could be byte[] and we could cut memory usage in this scenario
>    by 16% with a non-char[] RubyString.
>    - ObjectSpace$WeakReferenceListNode instances took up another 22% of
>    memory. ObjectSpace is just a bitch, no matter how you slice it.
>    - SourcePosition instances took up 7%
>    --
>    Charles Oliver Nutter @ [1]headius.blogspot.com
>    JRuby Developer @ [2]www.jruby.org
>    Application Architect @ [3]www.ventera.com
>
> References
>
>    1. http://headius.blogspot.com/
>    2. http://www.jruby.org/
>    3. http://www.ventera.com/

> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642

> _______________________________________________
> Jruby-devel mailing list
> Jruby-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/jruby-devel

--
+ http://www.tc.umn.edu/~enebo +---- mailto:[EMAIL PROTECTED] ----+
| Thomas E Enebo, Protagonist  | "Luck favors the prepared    |
|                              |  mind." -Louis Pasteur       |

--
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com
