That's a big unknown for us. It does not seem to be heavily IO-driven, since using Nailgun significantly speeds up "pure load" scenarios. For example:
INIT OF JRUBY ALONE

system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5
                                 user     system   total    real
in-process `jruby `              0.043000 0.000000 0.043000 ( 0.027000)
in-process `jruby `              0.045000 0.000000 0.045000 ( 0.045000)
in-process `jruby `              0.018000 0.000000 0.018000 ( 0.018000)
in-process `jruby `              0.014000 0.000000 0.014000 ( 0.014000)
in-process `jruby `              0.014000 0.000000 0.014000 ( 0.014000)

INIT OF JRUBY PLUS -rubygems

system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 -rubygems
                                 user     system   total    real
in-process `jruby -rubygems`     0.193000 0.000000 0.193000 ( 0.177000)
in-process `jruby -rubygems`     0.085000 0.000000 0.085000 ( 0.085000)
in-process `jruby -rubygems`     0.085000 0.000000 0.085000 ( 0.085000)
in-process `jruby -rubygems`     0.071000 0.000000 0.071000 ( 0.071000)
in-process `jruby -rubygems`     0.076000 0.000000 0.076000 ( 0.076000)

...PLUS require 'activerecord'

system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 "-rubygems -e \"require 'activerecord'\""
                                                          user     system   total    real
in-process `jruby -rubygems -e "require 'activerecord'"`  0.192000 0.000000 0.192000 ( 0.176000)
in-process `jruby -rubygems -e "require 'activerecord'"`  0.087000 0.000000 0.087000 ( 0.087000)
in-process `jruby -rubygems -e "require 'activerecord'"`  0.087000 0.000000 0.087000 ( 0.087000)
in-process `jruby -rubygems -e "require 'activerecord'"`  0.069000 0.000000 0.069000 ( 0.069000)
in-process `jruby -rubygems -e "require 'activerecord'"`  0.078000 0.000000 0.078000 ( 0.078000)

Note how much startup improves for subsequent runs in the -rubygems and require 'activerecord' cases. If it were solely IO-bound, we wouldn't see that much improvement.
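For reference, the in-process timing loop can be sketched roughly like this. This is a hypothetical reconstruction, not the actual contents of bench_jruby_init.rb: the method name bench_init is invented, and the expensive runtime-boot work is stubbed out with busywork. The point is just that timing the same boot repeatedly in one process lets warmup effects show up as faster later iterations.

```ruby
require 'benchmark'

# Hypothetical sketch of an init benchmark: run the same "boot" work N
# times in the same process, reporting each iteration separately so that
# warmup effects appear as later iterations getting faster.
def bench_init(iterations, label, &boot)
  iterations.times do
    Benchmark.bm(30) do |bm|
      bm.report("in-process `#{label}`", &boot)
    end
  end
end

# Stand-in for actually booting a runtime; the real script would
# construct and tear down an embedded JRuby runtime here.
bench_init(5, "jruby") { 10_000.times { |i| i.to_s } }
```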
Startup time issues are a combination of factors:

* IO, including filesystem searching and the actual read of the file
* Parsing and AST building
* The JVM being cold; our parser, interpreter, and core classes are all
  running at their slowest
* Internal caches getting vigorously flushed at boot, since so many
  methods and constants are being created

My parallelizing patch helps the first three, but it didn't make a big
difference in the actual execution of commands like "rake test" in a
Rails app. I'm going to poke at startup a bit more today and see if I can
figure out how much of the time in "rake test" is *actually* booting
versus execution.

- Charlie

On Mon, Oct 24, 2011 at 11:56 PM, Andrew Cholakian <and...@andrewvc.com> wrote:
> I'm wondering how much of the issue is IO and how much is CPU time
> required to parse. Would it be easiest to just do a quick scan for
> module dependencies and cache all the files ASAP, then parse serially?
> I'm not sure if it'd be possible to do a quick parse for just 'require'.
>
> On Mon, Oct 24, 2011 at 9:47 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
>>
>> I was thinking about the case below, and I think that this is an
>> interesting idea, but I'm wondering how you would resolve certain
>> difficulties. Imagine:
>>
>> require 'ALib'
>> a = 10 + 2
>> require 'BLib'
>> b = a / 2
>>
>> where ALib is a lot of random stuff, then:
>>
>> class Fixnum
>>   def +(other)
>>     self * other
>>   end
>> end
>>
>> and BLib is a lot of random stuff, then:
>>
>> class Fixnum
>>   def /(other)
>>     self * other * other
>>   end
>> end
>>
>> How would you know how to resolve these various pieces? I guess you
>> mention eager interpreting and then a cache, but given that any module
>> can change any other module's functionality, you would have to keep
>> track of everything that you eagerly interpreted, and possibly go back
>> depending on what your module declares.
>> How else would you know that a module that doesn't depend on any other
>> modules is going to actually execute in a radically different way
>> because of another module that you have included? The only way I can
>> think of would be if the thread executing any given piece of code kept
>> track of the calls that it made and where, and then went back to the
>> earliest piece it had to in the case that anything was rewritten... but
>> then you could imagine an even more convoluted case where module A
>> changes an earlier piece of module B such that it changes how a later
>> piece of itself works... and so on.
>>
>> Perhaps this is incoherent, but I think the question is how you deal
>> with the fact that separately running pieces of code can change the
>> fundamental underlying state of the world.
>>
>> 2011/10/24 Charles Oliver Nutter <head...@headius.com>
>>>
>>> Nahi planted an interesting seed on Twitter... what if we could
>>> parallelize parsing of Ruby files when loading a large application?
>>>
>>> At a naive level, parallelizing the parse of an individual file is
>>> tricky to impossible; the parser state is very much straight-line. But
>>> perhaps it's possible to parallelize loading of many files?
>>>
>>> I started playing with parallelizing calls to the parser, but that
>>> doesn't really help anything; every call to the parser blocks waiting
>>> for it to complete, and the contents are not interpreted until after
>>> that point. That means that "require" lines remain totally opaque,
>>> preventing us from proactively starting threaded parses of additional
>>> files. But therein lies the opportunity: what if load/require requests
>>> were done as Futures, require/load lines were eagerly interpreted by
>>> submitting load/require requests to a thread pool, and child requires
>>> could be loading and parsing at the same time as the parent file...
>>> without conflicting.
>>>
>>> In order to do this, I think we would need to make the following
>>> modifications:
>>>
>>> * LoadService would need to expose Future-based versions of "load" and
>>> "require". The initial file loaded as the "main" script would be
>>> synchronous, but subsequent requires and loads could be shunted to a
>>> thread pool.
>>> * The parser would need to initiate an eager load+parse of files
>>> encountered in require-like and load-like lines. This load+parse would
>>> encompass filesystem searching plus content parsing, so all the heavy
>>> lifting of booting a file would be pushed into the thread pool.
>>> * Somewhere (perhaps in LoadService) we would maintain an LRU cache
>>> mapping file paths to ASTs. The cache would contain Futures; getting
>>> the actual parsed library would then simply be a matter of Future.get,
>>> allowing many of the load+parses to be done asynchronously.
>>>
>>> For a system like Rails, where there might be hundreds of files
>>> loaded, this could definitely improve startup performance.
>>>
>>> Thoughts?
>>>
>>> - Charlie
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe from this list, please visit:
>>>
>>> http://xircles.codehaus.org/manage_email
>>>
>>
>
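For illustration, the Future-backed cache described in the quoted proposal might be sketched in plain Ruby like this. The class name LoadCache and the parse stub are invented stand-ins; the real implementation would live in JRuby's LoadService, use a java.util.concurrent thread pool with real Futures, and do actual filesystem search plus parsing where this sketch just sleeps and returns a placeholder string.

```ruby
require 'thread'

# Hypothetical sketch: a cache mapping file paths to "futures" (plain Ruby
# threads here) that search for and parse a file in the background. A
# require-like call later blocks only if that file's parse hasn't finished.
class LoadCache
  def initialize
    @mutex = Mutex.new
    @futures = {}  # path => Thread whose value is the parsed result
  end

  # Eagerly kick off load+parse for a path (called when the parser sees a
  # require-like line). Idempotent: the first submission for a path wins.
  def submit(path)
    @mutex.synchronize do
      @futures[path] ||= Thread.new { parse(path) }
    end
  end

  # Get the parsed result, waiting on the background parse if necessary.
  def get(path)
    submit(path).value
  end

  private

  # Stand-in for filesystem search + parsing to an AST.
  def parse(path)
    sleep 0.01  # simulate IO and parse work
    "AST(#{path})"
  end
end

cache = LoadCache.new
%w[a.rb b.rb c.rb].each { |f| cache.submit(f) }  # parses run concurrently
puts cache.get("b.rb")  # prints "AST(b.rb)"
```

The key property is that submit returns immediately while get is the synchronization point, so sibling requires overlap their IO and parse work instead of serializing it.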