On Sun, Jan 12, 2014 at 6:48 PM, Daniel Farina <[email protected]> wrote:
>
> On Sun, Jan 12, 2014 at 6:40 PM, Joe Van Dyk <[email protected]> wrote:
> > from "perf top":
> >
> > 75.38%  [kernel]               [k] hypercall_page
> > 3.39%  python2.7              [.] 0x179be3
> > 3.18%  libc-2.15.so           [.] 0x140931
> > 2.86%  libcrypto.so.1.0.0     [.] 0x66dbf
> > 2.78%  python2.7              [.] PyEval_EvalFrameEx
> > 1.77%  liblzo2.so.2.0.0       [.] lzo1x_decompress_safe
> > 1.02%  [kernel]               [k] copy_user_generic_string
> > 0.73%  liblzo2.so.2.0.0       [.] lzo_adler32
> > 0.59%  python2.7              [.] _PyObject_GenericGetAttrWithDict
> > 0.28%  libc-2.15.so           [.] epoll_ctl
>
> Yeah, this is the kind of trace I don't know what to do about.  I
> suspect it's from making too many syscalls, but I'm not sure, and
> cursory search doesn't give me a slam dunk either, roughly the same as
> last time.  Someone who knows Xen a bit is going to have to divine
> something or I have to put aside a block of time to figure out what it
> means, which is why I didn't go ahead and Just Fix It in the past.
>
> Another profitable use of time might be to run the program through one
> of the Python-level profilers.  Lately I've been thinking I should
> commit "hidden" (as to not clutter the help messages) '--debug'-family
> options, including that one.
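
On the profiler idea quoted above, here is a minimal sketch of what such a hidden option might look like. It assumes an argparse-based command line. The flag name `--debug-profile` and the `work` function are made up for illustration and are not WAL-E's actual interface.

```python
import argparse
import cProfile
import io
import pstats


def build_parser():
    parser = argparse.ArgumentParser(prog="example-tool")
    # help=argparse.SUPPRESS keeps the flag out of the --help output.
    # The flag name is hypothetical.
    parser.add_argument("--debug-profile", action="store_true",
                        help=argparse.SUPPRESS)
    return parser


def work():
    # Stand-in for the real command body (hypothetical).
    return sum(i * i for i in range(10000))


def run(argv):
    args = build_parser().parse_args(argv)
    if not args.debug_profile:
        return work(), None
    profiler = cProfile.Profile()
    result = profiler.runcall(work)
    out = io.StringIO()
    # Report the ten most expensive entries by cumulative time.
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()
```

Calling `run(["--debug-profile"])` returns the result along with a text report, while `--help` does not list the flag.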

In this general vein, one of my colleagues -- Greg Stark -- made a
suggestion: dump gevent and use threads, since WAL-E has a modest
amount of concurrency and the extra memory bloat of a handful of
threads is probably not a big deal.

He suggested this to get more parallel execution of I/O with other
stuff going on in Python, under the model that the "threading" module
spawns real threads that contend on the GIL, but that GIL contention
is not so important because most of the time the GIL is released
while Python is stuck in read()/write() or whatever.  In addition, he
pointed out that gevent is a weird dependency (it sure is) and that
it'd be nice to do without it, which I agree with (although the size
of the behavioral difference probably means it won't make the nascent
0.7, still in alpha).  But, more relevant to this case: should the
number of syscalls indeed be the problem, as I am guessing here, an
additional optimization on the table is that blocking I/O can cut the
number of syscalls quite dramatically.
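
To make the threading model concrete, here is a minimal sketch using only the standard "threading" module and plain blocking file I/O. The names `copy_blocking` and `copy_all` are invented for illustration, and local files stand in for whatever WAL-E actually reads and writes. This is not a proposed patch.

```python
import threading


def copy_blocking(src, dst, results, chunk_size=1 << 20):
    # Plain blocking read()/write().  CPython releases the GIL while
    # each call waits in the kernel, so other threads keep running.
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
    # Each thread stores under its own key in the shared dict.
    results[src] = copied


def copy_all(pairs):
    # One real OS thread per (src, dst) pair; fine for a handful of
    # transfers, which is the modest concurrency assumed here.
    results = {}
    threads = [threading.Thread(target=copy_blocking, args=(s, d, results))
               for s, d in pairs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each loop iteration costs roughly one read and one write syscall, with no readiness polling in between.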

Finally, it looks tractable to do this.  WAL-E's reliance on gevent is
somewhat detailed, but gevent also tends to copy threading constructs
whenever it can.  So, I suspect a port will take some labor, but
perhaps not too much, and maybe less than some of the new storage
backends that went into 0.7.
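
As a rough illustration of how closely the constructs line up, here is a sketch of thread-based stand-ins for gevent.spawn and gevent.joinall. It assumes call sites mostly use that spawn/join pattern and then read the greenlet's `value` or `exception`. The `Task` class is hypothetical, and WAL-E's real gevent usage is more detailed than this.

```python
import threading


class Task(object):
    """A thread that keeps its target's return value or exception."""

    def __init__(self, fn, *args):
        self.value = None
        self.exception = None
        self._thread = threading.Thread(target=self._run, args=(fn, args))

    def _run(self, fn, args):
        # threading.Thread discards return values, so capture them here.
        try:
            self.value = fn(*args)
        except Exception as e:
            self.exception = e

    def start(self):
        self._thread.start()

    def join(self):
        self._thread.join()


def spawn(fn, *args):
    # Rough counterpart of gevent.spawn(fn, *args).
    task = Task(fn, *args)
    task.start()
    return task


def joinall(tasks):
    # Rough counterpart of gevent.joinall(greenlets), without timeouts.
    for task in tasks:
        task.join()
    return tasks
```

The main visible difference at the call site is that a plain thread has no return value, hence the small wrapper.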

I don't have time to do this right now, but if someone is interested
in performance work, well...
