On Thu, Dec 10, 2015 at 12:52:53PM -0800, John Johansen wrote:
> commit 7bb90276fd110c3da605e8633dc7a7d220f2ed26
> Author: John Johansen <[email protected]>
> Date:   Sat Oct 17 00:28:46 2015 -0700
>
>     parser: add basic support for parallel compiles and loads
>
>     This adds basic support for parallel compiles. It uses a fork()/wait()
>     model due to the parser's current dependence on global variables and
>     structures. It has been set up in a manner similar to how cilk handles
>     multithreading, to make it easy to port to a managed thread model once
>     the parser removes the dependence on global compute structures in the
>     backend.
>
>     This patch adds two new command line flags:
>       -j or --jobs, which follows the make syntax of specifying parallel jobs
>         currently defaults to -jauto
>         -j8 or --jobs=8 allows for 8 parallel jobs
>         -j or -jauto or --jobs=auto sets the jobs to the # of cpus
>         -j*4 or --jobs=*4 sets the jobs to # of cpus * 4
>         -j*1 is equivalent to -jauto
>
>       --max_jobs=n allows setting a hard cap on the number of jobs
>         that can be specified by --jobs.
>         It defaults to the number of processors in the system * 8. It
>         supports the "auto" and "max" keywords, and using *n for a
>         multiple of the available cpus.
>
>     Additionally, the -d flag has been modified to take an optional
>     parameter, and --debug=jobs will output debug information for the
>     job control logic.
>
>     In light testing on one machine the job control logic provides a nice
>     performance boost. On an x86 test machine with 60 profiles in the
>     /etc/apparmor.d/ directory, for the command
>       time apparmor_parser -QT /etc/apparmor.d/
>
>     old (equivalent of -j1):
>       real 0m10.968s
>       user 0m10.888s
>       sys  0m0.088s
>
>     ubuntu parallel load using xargs:
>       real 0m8.003s
>       user 0m21.680s
>       sys  0m0.216s
>
>     -j:
>       real 0m6.547s
>       user 0m17.900s
>       sys  0m0.132s
>
>     Signed-off-by: John Johansen <[email protected]>
I've just started playing around with this, so not a full review. I did
notice one thing, that was discussed a bit in the previous thread:
> >> +#define work_spawn(WORK, RESULT)				\
> >> +do {								\
> >> +	if (jobs_max == 1) {					\
> >> +		/* no parallel work so avoid fork() overhead */	\
> >> +		RESULT(WORK);					\
> >> +		break;						\
> >> +	}							\
> >
> > I'm not sure I like this optimization; even though it looks correct, I
> > fear it would complicate future maintenance efforts. (If we want to use
> > work_spawn to work on a single item, then call work_sync_one(), the whole
> > thing will exit because of an error return from wait().) If it really
> > saves a ton of time maybe it's worth the extra maintenance effort but I
> > think even the tiniest of environments will still benefit from two jobs.
> >
> Do you really think this one little condition, tucked away, is that much
> more maintenance? The way it is set up, it is completely transparent to the
> rest of the code. That is, we can delete it and the rest of the code doesn't
> need to change.
>
> Notice this isn't about working on a single item, like work_sync_one()
> would be for; this is the case where only 1 job can run (jobs_max == 1).
> We may still be processing 50 profiles, they are just all done sequentially.
>
> So atm if you have jobs_max > 1 and only pass it a single profile, the
> parser does fork and process that job in a separate forked child.
Note that there is one behavioral difference by doing so: if a specific
policy compilation fails in such a way that exit() is called, in the
-j1 case it stops there, but in the -jN where N > 1 case it continues
on attempting to load additional policies. That's a behavioral change
that I don't think users would expect from switching the number of
threads used.
--
Steve Beattie
<[email protected]>
http://NxNW.org/~steve/
--
AppArmor mailing list
[email protected]
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/apparmor
