Re: [jruby-dev] Improving load time by parallelizing load/parse?

Jonathan Coveney Mon, 24 Oct 2011 21:47:21 -0700

I was thinking about the case below, and I think that this is an interesting
idea, but I'm wondering how you would resolve certain difficulties. Imagine:


require 'ALib'
a = 10+2
require 'BLib'
b=a/2

where ALib is a lot of random stuff, then:
class Fixnum
  def +(other)
    self*other
  end
end

and BLib is a lot of random stuff, then:
class Fixnum
  def /(other)
    self*other*other
  end
end

How would you know how to resolve these various pieces? I guess you mention
eager interpreting and then a cache, but given that any module can change
any other module's functionality, you would have to keep track of everything
that you eagerly interpreted, and possibly go back depending on what your
module declares. How else would you know that a module that doesn't depend
on any other modules is going to actually execute in a radically different
way because of another module that you have included? The only way I can
think of would be if the thread executing any given piece of code kept track
of the calls that it made and where, and then went back to the earliest
piece it had to in the case that anything was rewritten...but then you could
imagine an even more convoluted case where module A changes an earlier piece
of module B such that it changes how a later piece of itself works...and so
on.
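
To make the ordering concern concrete, here is a minimal sketch. It is an illustration under assumptions, not anything from JRuby: `fresh_num` and `run` are made-up helpers, and an anonymous wrapper class stands in for Fixnum so that real integer arithmetic is not patched. Each "step" is either a library being evaluated or a line of the main script, and the same steps run in two different orders give different answers.

```ruby
# Hypothetical stand-in for Fixnum: a fresh class per run, so the
# patches applied in one run do not leak into the next.
def fresh_num
  Class.new do
    attr_reader :v

    def initialize(v)
      @v = v
    end

    def +(other)
      self.class.new(v + other.v)
    end

    def /(other)
      self.class.new(v / other.v)
    end
  end
end

# Run the four steps of the example script in the given order and
# return the final value of b.
def run(order)
  num = fresh_num
  env = {}
  steps = {
    # "require 'ALib'": reopens the class and redefines +
    alib: -> { num.class_eval { define_method(:+) { |o| self.class.new(v * o.v) } } },
    # a = 10+2
    a:    -> { env[:a] = num.new(10) + num.new(2) },
    # "require 'BLib'": reopens the class and redefines /
    blib: -> { num.class_eval { define_method(:/) { |o| self.class.new(v * o.v * o.v) } } },
    # b = a/2
    b:    -> { env[:b] = env[:a] / num.new(2) }
  }
  order.each { |name| steps.fetch(name).call }
  env[:b].v
end

in_order  = run(%i[alib a blib b]) # program order
reordered = run(%i[a alib blib b]) # "a = 10+2" evaluated before ALib finished
puts "program order: #{in_order}, reordered: #{reordered}"
```

In program order `a` is computed with the patched `+`; if the main script's first line is evaluated before ALib has been applied, `a` is computed with the original `+` and everything downstream changes.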

Perhaps this is incoherent, but I think the question is how you deal with
the fact that separately running pieces of code can change the fundamental
underlying state of the world.
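
If only the parsing ran ahead and evaluation stayed in program order, the patches would presumably still land in the order the script asks for. A rough sketch of that reading of the proposal quoted below, with several assumptions: the "files" are in-memory strings in a made-up `SOURCES` hash, `parse_all` and `load_in_order` are hypothetical names, `Ripper.sexp` from the standard library stands in for the real parser, and `Thread#value` plays the role of `Future.get`. This says nothing about how LoadService would actually do it.

```ruby
require 'ripper'

# Hypothetical in-memory "files" so the sketch is self-contained.
SOURCES = {
  'a_lib' => 'log << :a_lib',
  'b_lib' => 'log << :b_lib'
}.freeze

# Kick off one parse per file immediately. Parsing only builds a tree
# and does not run the code, so these threads do not touch shared state.
def parse_all(sources)
  sources.map { |name, src| [name, Thread.new { Ripper.sexp(src) }] }.to_h
end

# Evaluate the files strictly in the requested order, on the calling
# thread, waiting on each parse only when its file is actually needed.
def load_in_order(names, sources)
  parses = parse_all(sources)
  log = []
  names.each do |name|
    # Thread#value blocks until the parse is done; Ripper.sexp returns
    # nil when the source has a syntax error.
    raise SyntaxError, name if parses.fetch(name).value.nil?
    eval(sources.fetch(name), binding) # side effects happen here, in order
  end
  log
end

p load_in_order(%w[a_lib b_lib], SOURCES)
```

The sketch throws the parsed tree away and re-reads the source for `eval`; a real implementation would evaluate the cached AST instead. It also does nothing about discovering which files will be required, which is the part that needs the eager interpretation of require lines.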

2011/10/24 Charles Oliver Nutter <head...@headius.com>

> Nahi planted an interesting seed on Twitter...what if we could
> parallelize parsing of Ruby files when loading a large application?
>
> At a naive level, parallelizing the parse of an individual file is
> tricky to impossible; the parser state is very much straight-line. But
> perhaps it's possible to parallelize loading of many files?
>
> I started playing with parallelizing calls to the parser, but that
> doesn't really help anything; every call to the parser blocks waiting
> for it to complete, and the contents are not interpreted until after
> that point. That means that "require" lines remain totally opaque,
> preventing us from proactively starting threaded parses of additional
> files. But there lies the opportunity: what if load/require requests
> were done as Futures, require/load lines were eagerly interpreted by
> submitting load/require requests to a thread pool, and child requires
> could be loading and parsing at the same time as the parent
> file...without conflicting?
>
> In order to do this, I think we would need to make the following
> modifications:
>
> * LoadService would need to expose Future-based versions of "load"
> and "require". The initial file loaded as the "main" script would be
> synchronous, but subsequent requires and loads could be shunted to a
> thread pool.
> * The parser would need to initiate eager load+parse of files
> encountered in require-like and load-like lines. This load+parse would
> encompass filesystem searching plus content parsing, so all the heavy
> lifting of booting a file would be pushed into the thread pool.
> * Somewhere (perhaps in LoadService) we would maintain an LRU cache
> mapping from file paths to ASTs. The cache would contain Futures;
> getting the actual parsed library would then simply be a matter of
> Future.get, allowing many of the load+parses to be done
> asynchronously.
>
> For a system like Rails, where there might be hundreds of files
> loaded, this could definitely improve startup performance.
>
> Thoughts?
>
> - Charlie
