Re: Parallel Digest, Vol 63, Issue 5

Rick Leir Thu, 30 Jul 2015 09:47:28 -0700

The Unix 'way' of simple tools connected by pipes seems good to me, and
Parallel is perhaps a bit complex for that model. Sorry if this sounds like a
complaint; Parallel is working great for me.


I use Parallel to launch a Perl program, which writes to a DB. There are
other options too.
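
A minimal sketch of that pattern, written in Python with SQLite rather than
Perl so it is self-contained. The script name record.py, the results table,
and the process() placeholder are assumptions made up for illustration, not
the actual program:

```python
import sqlite3
import sys


def record_result(db_path, arg, result):
    """Append one (arg, result) row, creating the table on first use."""
    # timeout: wait for the lock if another job is writing at the same time
    conn = sqlite3.connect(db_path, timeout=30)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS results (arg TEXT, result TEXT)"
        )
        conn.execute(
            "INSERT INTO results (arg, result) VALUES (?, ?)", (arg, result)
        )
    conn.close()


def process(arg):
    """Placeholder for the real per-argument work (hypothetical)."""
    return arg.upper()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Expected call: python3 record.py DB_PATH ARG
    db_path, arg = sys.argv[1], sys.argv[2]
    record_result(db_path, arg, process(arg))
```

It could then be launched once per argument with something like:

  parallel python3 record.py results.db ::: a b c
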
Rick

On Thu, Jul 30, 2015 at 12:00 PM, <parallel-requ...@gnu.org> wrote:

> Send Parallel mailing list submissions to
>         parallel@gnu.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.gnu.org/mailman/listinfo/parallel
> or, via email, send a message with subject or body 'help' to
>         parallel-requ...@gnu.org
>
> You can reach the person managing the list at
>         parallel-ow...@gnu.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Parallel digest..."
>
>
> Today's Topics:
>
>    1. SQL save mode for GNU Parallel? (Ole Tange)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 30 Jul 2015 14:26:34 +0200
> From: Ole Tange <ta...@gnu.org>
> To: "parallel@gnu.org" <parallel@gnu.org>
> Cc: Stephen Fralich <s...@uw.edu>
> Subject: SQL save mode for GNU Parallel?
> Message-ID:
>         <CA+4vN7xMnunACOgrCMLWXNR_hn1OwWi20=
> opcr+wmzxy_ke...@mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> I just discovered a fork of GNU Parallel:
> https://github.com/stephen-fralich/parallel-sql/
>
> It saves into PostgreSQL.
>
> If GNU Parallel should have an --sql option, it should be more general
> than that. It would be obvious to use a DBURL to specify which driver,
> username, password, and database to use.
>
> The most obvious would be having a table containing the columns from
> --joblog and the arguments. For some uses it would also make sense to
> have the stderr+stdout.
>
> So I am thinking of:
>
>   --sql mysql://user:pass@host/db/table
>
> If the table does not exist: Create it.
>
> But should there be an option to not store stderr+stdout? And if so:
> Should that be default? If saving is forced, then you can always just
> >/dev/null the output from the job.
>
> I can definitely see uses of being able to run 1000000 simulations
> with 10 different variables and then be able to easily get the output
> of the jobs where variable A is odd and > variable B (or similar).
>
> What should happen if the user uses variable names that are the same
> as the header of --joblog (e.g. Seq or stdout)?
>
> It would also be handy if you could change the status of a job to
> 'not-run' (which could be represented with exit status -2), so you
> could change this while GNU Parallel was running or add new jobs.
>
> You could then have workers that took jobs out of a database table:
>
>   forever parallel --sql mysql://user:pass@host/db/table
>
> And a master node that submitted jobs to the table:
>
>   parallel --dry-run --sql mysql://user:pass@host/db/table the_job ::: the args
>
> --dry-run with --sql should put status to 'not-run'.
>
> But that would also require some sort of timeout handling: if worker-2
> started job-seq-4 3 seconds ago, the job should not be considered
> timed out, and so no other worker should take it.
>
> GNU Parallel will not depend on the DBD packages being installed, but
> will only use them when the user asks for the driver. So in package
> speak it should probably 'suggest' the DBD-packages.
>
> Ideas? Suggestions? Observations?
>
>
> /Ole
>
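
To make the table idea in the quoted message concrete, here is a rough sketch
using Python's sqlite3 module. It assumes a table holding a few of the
--joblog columns (Seq, Exitval, Command) next to the job's input variables and
its stdout. The variable names A and B, the "sim" command, and the function
names are invented for illustration; nothing here is what an --sql option
would actually create:

```python
import sqlite3


def create_jobs_table(conn):
    """Create a table with some --joblog columns plus the job's variables."""
    conn.execute(
        """
        CREATE TABLE jobs (
            Seq     INTEGER PRIMARY KEY, -- job sequence number
            Exitval INTEGER,             -- exit status of the job
            Command TEXT,                -- command line that was run
            A       INTEGER,             -- first input variable (hypothetical)
            B       INTEGER,             -- second input variable (hypothetical)
            Stdout  TEXT                 -- captured standard output
        )
        """
    )


def record_job(conn, seq, exitval, command, a, b, stdout):
    """Store one finished job."""
    with conn:
        conn.execute(
            "INSERT INTO jobs VALUES (?, ?, ?, ?, ?, ?)",
            (seq, exitval, command, a, b, stdout),
        )


def odd_a_greater_than_b(conn):
    """Return (A, B, Stdout) for the jobs where A is odd and A > B."""
    return conn.execute(
        "SELECT A, B, Stdout FROM jobs "
        "WHERE A % 2 = 1 AND A > B ORDER BY A, B"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_jobs_table(conn)
    seq = 0
    for a in range(1, 5):
        for b in range(1, 5):
            seq += 1
            # Stand-in for a real simulation: the "output" is just a * b.
            record_job(conn, seq, 0, f"sim {a} {b}", a, b, str(a * b))
    for row in odd_a_greater_than_b(conn):
        print(row)
```

Keeping the variables in their own columns is what makes the "A is odd and
> B" style of selection a one-line query.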


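The worker and timeout idea in the quoted message could look roughly like the
sketch below, again in Python with sqlite3. The value -2 for 'not-run' comes
from the proposal; the RUNNING sentinel, the Worker and Started columns, and
the function names are assumptions added for illustration:

```python
import sqlite3
import time

NOT_RUN = -2  # from the proposal: exit status -2 marks a job as not run
RUNNING = -1  # assumption: a second sentinel for jobs a worker has claimed


def create_queue(conn):
    """Create a job table that workers can claim rows from."""
    conn.execute(
        "CREATE TABLE jobs (Seq INTEGER PRIMARY KEY, Exitval INTEGER, "
        "Command TEXT, Worker TEXT, Started REAL)"
    )


def submit(conn, command):
    """What the submitting node would do: add a job marked as not run."""
    with conn:
        conn.execute(
            "INSERT INTO jobs (Exitval, Command) VALUES (?, ?)",
            (NOT_RUN, command),
        )


def claim_job(conn, worker, timeout_s, now=None):
    """Claim one job that is not run, or whose claim is older than timeout_s.

    Returns the Seq of the claimed job, or None if nothing is available.
    """
    now = time.time() if now is None else now
    claimable = "(Exitval = ? OR (Exitval = ? AND Started < ?))"
    params = (NOT_RUN, RUNNING, now - timeout_s)
    with conn:
        row = conn.execute(
            f"SELECT Seq FROM jobs WHERE {claimable} ORDER BY Seq LIMIT 1",
            params,
        ).fetchone()
        if row is None:
            return None
        # The UPDATE repeats the condition, so if another worker claimed the
        # row in between, no row matches and rowcount is 0.
        cur = conn.execute(
            "UPDATE jobs SET Exitval = ?, Worker = ?, Started = ? "
            f"WHERE Seq = ? AND {claimable}",
            (RUNNING, worker, now, row[0]) + params,
        )
        return row[0] if cur.rowcount == 1 else None
```

With a 60 second timeout, a job claimed 3 seconds ago is skipped by other
workers, while one claimed 100 seconds ago is handed out again.
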

-- 
Rick Leir
Developer, Canadiana.ca
