Re: [jruby-dev] Improving load time by parallelizing load/parse?

Andrew Cholakian Mon, 24 Oct 2011 21:56:33 -0700

I'm wondering how much of the issue is IO and how much is CPU time required
to parse. Would it be easiest to just do a quick scan for module
dependencies and cache all the files ASAP, then parse serially? I'm not sure
if it'd be possible to do a quick parse for just 'require'.
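
A rough sketch of that quick scan, assuming requires are written as literal strings at the start of a line. The names scan_requires, resolve and prefetch are made up for illustration and are not part of JRuby; anything built dynamically (require File.join(...), autoload, and so on) would be missed by a regex like this.

```ruby
# Sketch only: find literal `require '...'` lines with a regex and pre-read
# the matching files so that later parsing does not wait on disk.
REQUIRE_LINE = /^\s*require\s+['"]([^'"]+)['"]/

# Return the feature names of literal requires found in a source string.
def scan_requires(source)
  source.scan(REQUIRE_LINE).flatten
end

# Hypothetical helper: resolve a feature name against a list of directories.
def resolve(feature, load_path)
  load_path.each do |dir|
    candidate = File.join(dir, "#{feature}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end

# Walk the require graph breadth-first, caching raw file contents by path.
def prefetch(entry_path, load_path)
  cache = {}
  queue = [entry_path]
  until queue.empty?
    path = queue.shift
    next if cache.key?(path)
    cache[path] = File.read(path)
    scan_requires(cache[path]).each do |feature|
      found = resolve(feature, load_path)
      queue << found if found
    end
  end
  cache
end
```

This only warms a cache of file contents; parsing and execution would still happen serially and in the original order.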


On Mon, Oct 24, 2011 at 9:47 PM, Jonathan Coveney <jcove...@gmail.com> wrote:

> I was thinking about the case below, and I think that this is an
> interesting idea, but I'm wondering how you would resolve certain
> difficulties. Imagine:
>
> require 'ALib'
> a = 10+2
> require 'BLib'
> b=a/2
>
> where ALib is a lot of random stuff, then:
> class Fixnum
>   def +(other)
>     self*other
>   end
> end
>
> and BLib is a lot of random stuff, then:
> class Fixnum
>   def /(other)
>     self*other*other
>   end
> end
>
> How would you know how to resolve these various pieces? I guess you mention
> eager interpreting and then a cache, but given that any module can change
> any other module's functionality, you would have to keep track of everything
> that you eagerly interpreted, and possibly go back depending on what your
> module declares. How else would you know that a module that doesn't depend
> on any other modules is going to actually execute in a radically different
> way because of another module that you have included? The only way I can
> think of would be if the thread executing any given piece of code kept track
> of the calls that it made and where, and then went back to the earliest
> piece it had to in the case that anything was rewritten...but then you could
> imagine an even more convoluted case where module A changes an earlier piece
> of module B such that it changes how a later piece of itself works...and so
> on.
>
> Perhaps this is incoherent, but I think the key question is how you deal
> with the fact that separately running pieces of code can change the
> fundamental underlying state of the world.
>
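
The ordering problem described above can be reproduced in a few lines. This is only an illustration: Num is a hypothetical stand-in for Fixnum so the example does not redefine core arithmetic, and the second class body plays the role of what a require of ALib would do.

```ruby
# Sketch only: Num is a hypothetical stand-in for Fixnum.
class Num
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def +(other)
    Num.new(value + other.value)
  end
end

# The same expression, evaluated at two different points in time.
EXPRESSION = -> { (Num.new(10) + Num.new(2)).value }

BEFORE_PATCH = EXPRESSION.call

# Stand-in for loading ALib: reopen the class and change what + means.
class Num
  def +(other)
    Num.new(value * other.value)
  end
end

AFTER_PATCH = EXPRESSION.call

puts "before: #{BEFORE_PATCH}, after: #{AFTER_PATCH}"
```

The result depends on whether the patch has been executed yet, which is why execution order has to be preserved even if reading and parsing are done ahead of time.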
>
> 2011/10/24 Charles Oliver Nutter <head...@headius.com>
>
>> Nahi planted an interesting seed on Twitter...what if we could
>> parallelize parsing of Ruby files when loading a large application?
>>
>> At a naive level, parallelizing the parse of an individual file is
>> tricky to impossible; the parser state is very much straight-line. But
>> perhaps it's possible to parallelize loading of many files?
>>
>> I started playing with parallelizing calls to the parser, but that
>> doesn't really help anything; every call to the parser blocks waiting
>> for it to complete, and the contents are not interpreted until after
>> that point. That means that "require" lines remain totally opaque,
>> preventing us from proactively starting threaded parses of additional
>> files. But there lies the opportunity: what if load/require requests
>> were done as Futures, require/load lines were eagerly interpreted by
>> submitting load/require requests to a thread pool, and child requires
>> could be loading and parsing at the same time as the parent
>> file...without conflicting?
>>
>> In order to do this, I think we would need to make the following
>> modifications:
>>
>> * LoadService would need to expose Future-based versions of "load"
>> and "require". The initial file loaded as the "main" script would be
>> synchronous, but subsequent requires and loads could be shunted to a
>> thread pool.
>> * The parser would need to initiate eager load+parse of files
>> encountered in require-like and load-like lines. This load+parse would
>> encompass filesystem searching plus content parsing, so all the heavy
>> lifting of booting a file would be pushed into the thread pool.
>> * Somewhere (perhaps in LoadService) we would maintain an LRU cache
>> mapping from file paths to ASTs. The cache would contain Futures;
>> getting the actual parsed library would then simply be a matter of
>> Future.get, allowing many of the load+parses to be done
>> asynchronously.
>>
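
The cache-of-Futures idea in the list above might look roughly like this in plain Ruby. It is a sketch under several assumptions, not JRuby's LoadService: ParseCache is a hypothetical name, a Thread stands in for a Future (Thread#value plays the role of Future.get), Ripper.sexp stands in for the real parser, and there is neither a bounded thread pool nor LRU eviction.

```ruby
require 'ripper'

# Sketch only: a cache of in-flight parses keyed by file path.
class ParseCache
  def initialize
    @futures = {}
    @lock = Mutex.new
  end

  # Start parsing path in the background unless a parse is already cached.
  # Returns the Thread acting as the future for that path.
  def prefetch(path)
    @lock.synchronize do
      @futures[path] ||= Thread.new { Ripper.sexp(File.read(path)) }
    end
  end

  # Block until the parse of path has finished and return its result.
  def get(path)
    prefetch(path).value
  end
end
```

A loader could call prefetch as soon as it sees a require line and call get only when the file actually has to be executed, so the execution order stays the same while the read and parse overlap with other work.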
>> For a system like Rails, where there might be hundreds of files
>> loaded, this could definitely improve startup performance.
>>
>> Thoughts?
>>
>> - Charlie
>>
>
