Re: [Help-glpk] Multithreading/parallelization

Reginald Beardsley Sat, 15 Dec 2012 06:57:48 -0800

While we wait to hear from Andrew, I made a quick assessment.  glpk is under 
100k lines w/ under 600 static declarations in the src directory.  I did not 
look in the w32 & w64 directories.


I consider that large but comfortable, particularly given the high quality of 
the code.  I've found myself the sole support for much larger codes that were 
poorly written by many hands.  Relative to past experience, such as porting 
500k lines of FORTRAN from VMS to Unix, this looks pretty simple.  I've also 
dealt w/ running old non-reentrant FORTRAN codes in a large seismic processing 
system by loading and unloading named COMMON in a wrapper, so the FORTRAN codes 
did not require modification.  I don't have any experience w/ threads per se, 
but that's a minor detail relative to a project like this.

If Andrew is amenable to an attempt to make glpk multithreaded, I'll print the 
source and start reading code.  That will take some time, but it may save a lot 
of effort.  In particular I want to study the possible options for doing this 
w/ minimum effort.  The fact that the code is the work of a single, disciplined 
hand offers the possibility of this being rather less work than if many people 
had worked on it.

A possible solution of particular interest is making each of the non-reentrant 
items an array indexed by thread number.  I won't know if that's possible until 
I read the code, but it might yield a very elegant solution.  Other 
possibilities will suggest themselves in due course once I understand the 
internal structure.
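
As a rough illustration of the indexing-by-thread idea, here is a minimal C 
sketch.  It is not taken from glpk: lib_state, lib_accumulate, run_demo and 
MAX_THREADS are hypothetical names, and the "library" state is a stand-in for 
what would otherwise be file-scope static variables.  It assumes POSIX threads 
and that every caller can supply its own thread number.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 4   /* assumed fixed upper bound on thread count */

/* Hypothetical stand-in for state that was formerly held in plain
   file-scope statics (shared by every thread, hence non-reentrant). */
typedef struct {
    int    call_count;
    double last_value;
} lib_state;

/* One slot per thread number replaces the single shared copy. */
static lib_state state[MAX_THREADS];

/* Hypothetical library routine: it touches only the caller's slot,
   so threads never read or write each other's state. */
static double lib_accumulate(int tid, double x)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    lib_state *s = &state[tid];
    s->call_count++;
    s->last_value += x;
    return s->last_value;
}

static void *worker(void *arg)
{
    int tid = *(int *)arg;
    for (int i = 0; i < 1000; i++)
        lib_accumulate(tid, (double)(tid + 1));
    return NULL;
}

/* Start MAX_THREADS workers, wait for them; returns 0 on success. */
int run_demo(void)
{
    pthread_t th[MAX_THREADS];
    int ids[MAX_THREADS];

    for (int t = 0; t < MAX_THREADS; t++) {
        state[t].call_count = 0;
        state[t].last_value = 0.0;
        ids[t] = t;
    }
    for (int t = 0; t < MAX_THREADS; t++)
        if (pthread_create(&th[t], NULL, worker, &ids[t]) != 0)
            return -1;
    for (int t = 0; t < MAX_THREADS; t++)
        if (pthread_join(th[t], NULL) != 0)
            return -1;
    return 0;
}
```

Calling run_demo() from main() leaves each slot holding only its own thread's 
counts, with no locking needed.  The cost of this layout is that the thread 
number has to reach every routine that touches the state, which is the part 
that depends on how the real code is structured.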

For many years I supported the Seismic Unix package from the Center for Wave 
Phenomena at the Colorado School of Mines.  It's 360k+ lines w/ 400+ programs 
written by many hands using key=value parameter input.  A continual problem w/ 
that package was the lack of any error checking for typos in parameter names.  
I had experimented w/ a fix, but never implemented it because it would have 
taken several months to modify all the programs in the package.  Then one day 
much later I woke up w/ an important insight into the calling structure of the 
package and did the entire job in under two hours from start to finish.  Now 
unrecognized command line parameters are reported to the user, so the user no 
longer discovers the error only after a long run, when looking at the results.
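
One way such a check can be done centrally, sketched in C under stated 
assumptions: every program fetches its parameters through one shared lookup 
routine, the routine marks each argument it matches, and a final pass reports 
whatever was never asked for.  This is a guess at the general shape only, not 
the Seismic Unix code; params_init, params_get and params_report_unused are 
hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 64   /* assumed cap on command line arguments */

static const char *args[MAX_ARGS];
static int used[MAX_ARGS];
static int nargs;

/* Record the key=value arguments (argv[0] is the program name). */
void params_init(int argc, const char **argv)
{
    nargs = 0;
    for (int i = 1; i < argc && nargs < MAX_ARGS; i++) {
        args[nargs] = argv[i];
        used[nargs] = 0;
        nargs++;
    }
}

/* Return the value text for key, or NULL if absent.  Every matching
   argument is marked as used; the last occurrence supplies the value. */
const char *params_get(const char *key)
{
    assert(key != NULL);
    size_t n = strlen(key);
    const char *val = NULL;
    for (int i = 0; i < nargs; i++) {
        if (strncmp(args[i], key, n) == 0 && args[i][n] == '=') {
            used[i] = 1;
            val = args[i] + n + 1;
        }
    }
    return val;
}

/* Report each argument no lookup ever matched; return how many. */
int params_report_unused(FILE *out)
{
    int bad = 0;
    for (int i = 0; i < nargs; i++) {
        if (!used[i]) {
            fprintf(out, "unrecognized parameter: %s\n", args[i]);
            bad++;
        }
    }
    return bad;
}
```

A program would call params_init() once, fetch what it needs w/ params_get(), 
and call params_report_unused() before starting the long run.  Because the 
bookkeeping lives in the shared lookup routine, the individual programs need 
little or no change.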

In general, Harley's proposal seems to me the way to go.  That way if we get 
into trouble, it's easy to back up and start over.  If my idea about indexing 
by thread is viable, we could actually do the whole package in one go.  I'd 
like to keep the number of people working on this as small as practical: 2-3 is 
pretty much ideal, and certainly no more than 5.

However, the most important consideration is a good fit w/ Andrew's intentions 
and desires.  I wouldn't expect him to accept something like this until he'd 
seen it, but I don't want to start if he has a fundamental objection to the 
project.

Have Fun!
Reg

