So here's one discovery... I turned on the JVM's sampling profiler (the --sample flag to JRuby) while running "rake test" and discovered that it launches *four* JVM processes. Seriously?
If they're all booting Rails, it's no wonder "rake test" takes forever. I'm looking into it now.

- Charlie

On Tue, Oct 25, 2011 at 10:36 AM, Charles Oliver Nutter <head...@headius.com> wrote:
> That's a big unknown for us. It does not seem to be heavily IO-driven,
> since using nailgun does help "pure load" scenarios speed up
> significantly. For example:
>
> INIT OF JRUBY ALONE
>
> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5
>                            user     system      total        real
> in-process `jruby `    0.043000   0.000000   0.043000 (  0.027000)
> in-process `jruby `    0.045000   0.000000   0.045000 (  0.045000)
> in-process `jruby `    0.018000   0.000000   0.018000 (  0.018000)
> in-process `jruby `    0.014000   0.000000   0.014000 (  0.014000)
> in-process `jruby `    0.014000   0.000000   0.014000 (  0.014000)
>
> INIT OF JRUBY PLUS -rubygems
>
> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 -rubygems
>                                     user     system      total        real
> in-process `jruby -rubygems`    0.193000   0.000000   0.193000 (  0.177000)
> in-process `jruby -rubygems`    0.085000   0.000000   0.085000 (  0.085000)
> in-process `jruby -rubygems`    0.085000   0.000000   0.085000 (  0.085000)
> in-process `jruby -rubygems`    0.071000   0.000000   0.071000 (  0.071000)
> in-process `jruby -rubygems`    0.076000   0.000000   0.076000 (  0.076000)
>
> ...PLUS require 'activerecord'
>
> system ~/projects/jruby $ jruby bench/bench_jruby_init.rb 5 "-rubygems -e \"require 'activerecord'\""
>                                                                  user     system      total        real
> in-process `jruby -rubygems -e "require 'activerecord'"`    0.192000   0.000000   0.192000 (  0.176000)
> in-process `jruby -rubygems -e "require 'activerecord'"`    0.087000   0.000000   0.087000 (  0.087000)
> in-process `jruby -rubygems -e "require 'activerecord'"`    0.087000   0.000000   0.087000 (  0.087000)
> in-process `jruby -rubygems -e "require 'activerecord'"`    0.069000   0.000000   0.069000 (  0.069000)
> in-process `jruby -rubygems -e "require 'activerecord'"`    0.078000   0.000000   0.078000 (  0.078000)
>
> Note how much startup improves for subsequent runs in the -rubygems
> and -r activerecord cases. If it were solely IO-bound, we wouldn't see
> that much improvement.
>
> Startup time issues are a combination of factors:
>
> * IO, including filesystem searching and the actual read of the file
> * Parsing and AST building
> * The JVM being cold; our parser, interpreter, and core classes are all
>   running at their slowest
> * Internal caches getting vigorously flushed at boot, since there are so
>   many methods and constants being created
>
> My parallelizing patch helps the first three but didn't make a big
> difference in actual execution of commands like "rake test" in a Rails
> app. I'm going to poke at startup a bit more today and see if I can
> figure out how much time in "rake test" is *actually* booting versus
> execution.
>
> - Charlie
>
> On Mon, Oct 24, 2011 at 11:56 PM, Andrew Cholakian <and...@andrewvc.com> wrote:
>> I'm wondering how much of the issue is IO and how much is CPU time required
>> to parse. Would it be easiest to just do a quick scan for module
>> dependencies and cache all the files ASAP, then parse serially? I'm not sure
>> if it'd be possible to do a quick parse for just 'require'.
>>
>> On Mon, Oct 24, 2011 at 9:47 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
>>>
>>> I was thinking about the case below, and I think that this is an
>>> interesting idea, but I'm wondering how you would resolve certain
>>> difficulties.
>>> Imagine:
>>>
>>>   require 'ALib'
>>>   a = 10 + 2
>>>   require 'BLib'
>>>   b = a / 2
>>>
>>> where ALib is a lot of random stuff, then:
>>>
>>>   class Fixnum
>>>     def +(other)
>>>       self * other
>>>     end
>>>   end
>>>
>>> and BLib is a lot of random stuff, then:
>>>
>>>   class Fixnum
>>>     def /(other)
>>>       self * other * other
>>>     end
>>>   end
>>>
>>> How would you know how to resolve these various pieces? I guess you
>>> mention eager interpreting and then a cache, but given that any module
>>> can change any other module's functionality, you would have to keep
>>> track of everything that you eagerly interpreted, and possibly go back
>>> depending on what your module declares. How else would you know that a
>>> module that doesn't depend on any other modules is going to actually
>>> execute in a radically different way because of another module that you
>>> have included? The only way I can think of would be if the thread
>>> executing any given piece of code kept track of the calls that it made
>>> and where, and then went back to the earliest piece it had to in the
>>> case that anything was rewritten... but then you could imagine an even
>>> more convoluted case where module A changes an earlier piece of module B
>>> such that it changes how a later piece of itself works... and so on.
>>>
>>> Perhaps this is incoherent, but I think the question is how you deal
>>> with the fact that separately running pieces of code can change the
>>> fundamental underlying state of the world.
>>>
>>> 2011/10/24 Charles Oliver Nutter <head...@headius.com>
>>>>
>>>> Nahi planted an interesting seed on Twitter... what if we could
>>>> parallelize parsing of Ruby files when loading a large application?
>>>>
>>>> At a naive level, parallelizing the parse of an individual file is
>>>> tricky to impossible; the parser state is very much straight-line. But
>>>> perhaps it's possible to parallelize loading of many files?
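Jonathan's ALib/BLib scenario above can be made concrete. This sketch uses a hypothetical stand-in class `Num` instead of reopening `Fixnum` (reopening `Fixnum`/`Integer` would break arithmetic for the whole interpreter), but the ordering hazard is the same: if the `require 'BLib'` line were eagerly *interpreted* (not just parsed) before the assignment between the two requires ran, `a` would be computed with the wrong operator definition.

```ruby
# Stand-in numeric class; the names Num, load_alib, load_blib are
# illustrative, not from the thread's actual code.
class Num
  attr_reader :v
  def initialize(v); @v = v; end
  def +(o); Num.new(v + o.v); end   # ordinary addition
  def /(o); Num.new(v / o.v); end   # ordinary division
end

# Stand-in for "require 'ALib'": redefines + as multiplication
def load_alib
  Num.class_eval do
    def +(o); Num.new(v * o.v); end
  end
end

# Stand-in for "require 'BLib'": redefines / as self*other*other
def load_blib
  Num.class_eval do
    def /(o); Num.new(v * o.v * o.v); end
  end
end

load_alib                      # require 'ALib'
a = Num.new(10) + Num.new(2)   # 10 * 2 = 20 under ALib's +
load_blib                      # require 'BLib'
b = a / Num.new(2)             # 20 * 2 * 2 = 80 under BLib's /
puts a.v   # => 20
puts b.v   # => 80
```

Run `load_blib` one line earlier and `b` changes, which is why the proposal in this thread parallelizes only the load+parse work and leaves interpretation order alone.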
>>>>
>>>> I started playing with parallelizing calls to the parser, but that
>>>> doesn't really help anything; every call to the parser blocks waiting
>>>> for it to complete, and the contents are not interpreted until after
>>>> that point. That means that "require" lines remain totally opaque,
>>>> preventing us from proactively starting threaded parses of additional
>>>> files. But there lies the opportunity: what if load/require requests
>>>> were done as Futures, require/load lines were eagerly interpreted by
>>>> submitting load/require requests to a thread pool, and child requires
>>>> could be loading and parsing at the same time as the parent
>>>> file... without conflicting?
>>>>
>>>> In order to do this, I think we would need to make the following
>>>> modifications:
>>>>
>>>> * LoadService would need to expose Future-based versions of "load"
>>>>   and "require". The initial file loaded as the "main" script would be
>>>>   synchronous, but subsequent requires and loads could be shunted to a
>>>>   thread pool.
>>>> * The parser would need to initiate eager load+parse of files
>>>>   encountered in require-like and load-like lines. This load+parse
>>>>   would encompass filesystem searching plus content parsing, so all
>>>>   the heavy lifting of booting a file would be pushed into the thread
>>>>   pool.
>>>> * Somewhere (perhaps in LoadService) we would maintain an LRU cache
>>>>   mapping from file paths to ASTs. The cache would contain Futures;
>>>>   getting the actual parsed library would then simply be a matter of
>>>>   Future.get, allowing many of the load+parses to be done
>>>>   asynchronously.
>>>>
>>>> For a system like Rails, where there might be hundreds of files
>>>> loaded, this could definitely improve startup performance.
>>>>
>>>> Thoughts?
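The cache-of-Futures idea in the proposal above can be sketched in plain Ruby. All names here (`ParseCache`, `eager_parse`, `fetch`) are hypothetical; JRuby's real LoadService and parser are Java, and this stands in the filesystem search plus AST build with a trivial placeholder. Threads play the role of Futures, and `Thread#value` plays the role of `Future.get`.

```ruby
# A sketch of an LRU-style cache mapping file paths to "futures" that
# hold parse results. Eviction here is FIFO on insertion order, a crude
# approximation of LRU, which is fine for illustration.
class ParseCache
  def initialize(capacity = 128)
    @capacity = capacity
    @futures  = {}      # path => Thread acting as a future
    @lock     = Mutex.new
  end

  # Kick off an asynchronous load+parse; this is what the parser would
  # call when it sees a require-like line. Returns immediately.
  def eager_parse(path, &parser)
    @lock.synchronize do
      return if @futures.key?(path)
      @futures.delete(@futures.keys.first) if @futures.size >= @capacity
      @futures[path] = Thread.new { parser.call(path) }
    end
  end

  # Synchronous get, the moral equivalent of Future.get: blocks until
  # the background parse finishes, starting one inline if none exists.
  def fetch(path, &parser)
    future = @lock.synchronize do
      @futures[path] ||= Thread.new { parser.call(path) }
    end
    future.value
  end
end

# Toy "parser": in JRuby this would be search + read + AST construction.
parse = ->(path) { "AST(#{path})" }

cache = ParseCache.new
cache.eager_parse("blib.rb", &parse)   # parsing starts in the background
cache.fetch("alib.rb", &parse)         # no eager parse was started; done on demand
puts cache.fetch("blib.rb", &parse)    # => AST(blib.rb), likely already finished
```

The key property, matching the proposal: only parsing runs concurrently. The results are inert ASTs, so interpretation still happens in require order and Jonathan's open-class concern doesn't arise.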
>>>>
>>>> - Charlie

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email