I just discovered a fork of GNU Parallel: https://github.com/stephen-fralich/parallel-sql/
It saves into PostgreSQL. If GNU Parallel should have an --sql option, it should be more general than that.

It would be obvious to use a DBURL to specify which driver, username, password, and database to use. The most obvious design would be a table containing the columns from --joblog plus the arguments. For some uses it would also make sense to store stderr+stdout. So I am thinking of:

    --sql mysql://user:pass@host/db/table

If the table does not exist: Create it.

But should there be an option to not store stderr+stdout? And if so: Should that be the default? If saving is forced, then you can always just >/dev/null the output from the job.

I can definitely see uses of being able to run 1000000 simulations with 10 different variables and then be able to easily get the output of the jobs where variable A is odd and > variable B (or similar).

What should happen if the user uses variable names that are the same as the header of --joblog (e.g. Seq or stdout)?

It would also be handy if you could change the status of a job to 'not-run' (which could be represented with exit status -2), so you could change this while GNU Parallel was running or add new jobs. You could then have workers that took jobs out of a database table:

    forever parallel --sql mysql://user:pass@host/db/table

And a master node that submitted jobs to the table:

    parallel --dry-run --sql mysql://user:pass@host/db/table the_job ::: the args

--dry-run with --sql should set the status to 'not-run'.

But that would also require some sort of handling of timeout: worker-2 has started job-seq-4 3 seconds ago, and should not be considered timed out, thus no other worker should take that job.

GNU Parallel will not depend on the DBD packages being installed, but will only use them when the user asks for the driver. So in package speak it should probably 'suggest' the DBD packages.

Ideas? Suggestions? Observations?

/Ole
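
To make the DBURL and worker/timeout ideas a bit more concrete, here is a minimal sketch. It is in Python rather than GNU Parallel's own Perl, and it uses SQLite from the standard library so it runs without a database server. Most names are assumptions made up for illustration: the table name jobs, the argument column V1, the status value -1 for 'running', and the helpers parse_dburl, create_table, submit and claim_job. Only the --joblog column names and -2 for 'not-run' come from the discussion above.

```python
import sqlite3
from urllib.parse import urlsplit

NOT_RUN = -2   # 'not-run' status, as proposed in the post
RUNNING = -1   # assumption: marks a job that some worker has claimed


def parse_dburl(dburl):
    """Split driver://user:pass@host/db/table into its parts."""
    u = urlsplit(dburl)
    database, _, table = u.path.lstrip("/").partition("/")
    return {"driver": u.scheme, "user": u.username, "password": u.password,
            "host": u.hostname, "database": database, "table": table}


def create_table(conn):
    """Create the table if missing: --joblog columns, one argument, output."""
    conn.execute("""CREATE TABLE IF NOT EXISTS jobs (
        Seq INTEGER PRIMARY KEY, Host TEXT, Starttime REAL, JobRuntime REAL,
        Send INTEGER, Receive INTEGER, Exitval INTEGER, Signal INTEGER,
        Command TEXT, V1 TEXT, Stdout TEXT, Stderr TEXT)""")


def submit(conn, command, arg):
    """What the master node would do: add a job with status 'not-run'."""
    conn.execute("INSERT INTO jobs (Exitval, Command, V1) VALUES (?, ?, ?)",
                 (NOT_RUN, command, arg))


def claim_job(conn, worker, now, timeout):
    """Claim the lowest Seq that is not-run, or running but timed out.

    Returns the claimed Seq, or None if no job is available.
    The connection must be in autocommit mode (isolation_level=None),
    because the transaction is opened and closed explicitly here.
    """
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before selecting
    row = conn.execute(
        "SELECT Seq FROM jobs"
        " WHERE Exitval = ? OR (Exitval = ? AND Starttime < ?)"
        " ORDER BY Seq LIMIT 1",
        (NOT_RUN, RUNNING, now - timeout)).fetchone()
    if row is not None:
        conn.execute(
            "UPDATE jobs SET Exitval = ?, Host = ?, Starttime = ? WHERE Seq = ?",
            (RUNNING, worker, now, row[0]))
    conn.execute("COMMIT")
    return None if row is None else row[0]
```

A worker would call claim_job in a loop and write Exitval, JobRuntime and the output back when the job finishes. With a timeout of 60 seconds, a job claimed 3 seconds ago is skipped by other workers, and only becomes claimable again once its Starttime is older than the timeout. MySQL or PostgreSQL would need their own locking (this sketch relies on SQLite's BEGIN IMMEDIATE), so treat it as an outline of the idea rather than a design.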