[task #3195] Introduce priority management in BibSched

noreply [Samuele Kaplun] Tue, 24 Jun 2008 08:55:55 +0200

This is an automated notification sent by LCG Savannah.
It relates to:
                task #3195, project CDS Invenio


==============================================================================
 LATEST MODIFICATIONS of task #3195:
==============================================================================

Update of task #3195 (project cdsware):

                  Status:                    None => Done                   
        Percent Complete:                      0% => 100%                   
             Open/Closed:                    Open => Closed                 


==============================================================================
 OVERVIEW of task #3195:
==============================================================================

URL:
  <http://savannah.cern.ch/task/?3195>

                 Summary: Introduce priority management in BibSched
                 Project: CDS Invenio
            Submitted by: jeanyves
            Submitted on: 2006-03-27 09:29
         Should Start On: 2007-03-01 00:00
   Should be Finished on: 2008-07-01 00:00
                Category: BibSched
                Priority: 6
                  Status: Done
                 Privacy: Public
        Percent Complete: 100%
             Assigned to: skaplun
             Open/Closed: Closed
         Discussion Lock: Any
                  Effort: 0.00

    _______________________________________________________


Make it possible to launch a high-priority task that interrupts the
currently running task and delays other tasks of lower priority.

    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: 2008-04-24 09:29              By: Tibor Simko <simko>
1) We don't really need to run many parallel tasks in an auto mode in
a *generic* way, taking into account various possible task arguments
as you described.  This could result in a more complex (and therefore
more fragile) task logic.  A simpler (and therefore more robust)
solution should be enough here: all the knowledge about which tasks
can run in parallel and which cannot is maintained solely within
bibsched, irrespective of task arguments.  That is, tasks {A,B} and
{A,C} can always run in parallel, but never {B,C}.

If we need to run the same task {A,A} in parallel in auto mode with
different arguments, e.g. to run the fulltext indexer alongside the
metadata indexer, operating on different tables, then we can instead
introduce a concept of a different task name (bibindex1, bibindex2) to
distinguish these cases.  (But see also the multi node setup task
#3194, since a single node is already strained enough by Apache,
MySQL, and the running indexer, even if it has several cores.)
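
To illustrate the name-based rule from point 1, here is a minimal
sketch, not actual bibsched code.  The table PARALLEL_OK and the
function can_run_in_parallel are hypothetical names; the task names
A, B, C are the placeholders used above.

```python
# Hypothetical whitelist of task-name pairs that may run side by side.
# Pairs are unordered, so each one is stored as a frozenset.
PARALLEL_OK = {
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
}


def can_run_in_parallel(name_a, name_b):
    """Return True if the two task names are whitelisted to run together.

    Only task names are consulted; task arguments are ignored on purpose.
    """
    return frozenset({name_a, name_b}) in PARALLEL_OK


print(can_run_in_parallel("A", "B"))  # whitelisted pair
print(can_run_in_parallel("B", "C"))  # not whitelisted
```

In this sketch {A,A} is rejected because it is not in the table, which
is why a second instance would need its own name (bibindex1,
bibindex2) and its own table entry.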

2) The task priority management is a different task though, and can be
implemented independently.  (And first, because it is more urgent to
do, and probably easier too.)  The task priority is needed in order to
allow an urgent bibupload process to interrupt any other bibupload
that might be running.  This is best implemented via a new optional
task priority option, because otherwise it's hard to distinguish
between an urgent small maintenance upload and a less-urgent big
maintenance upload submitted by the same cataloguer.  Only the
cataloguer who submits these jobs can assign priorities; we cannot
guess them programmatically from the task arguments.

(Although we could start by using the task originator, that is, if a
bibupload process entered the queue and was submitted by bibedit, it
goes immediately; if it was submitted by bibreformat, it can wait in
the queue, not interrupting anything, not even bibindex.  This would
cover a vast majority of requirements already, except the case cited
above, with several maintenance tasks submitted by the same
cataloguer.  Which is why it's better to introduce this new priority
option, and alter bibedit to submit tasks with the highest priority by
default, etc.)
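
The combination described in point 2 (an explicit optional priority,
with the task originator as a fallback) could look roughly like the
sketch below.  Everything here is an assumption made for
illustration: the Task class, the numeric priority values, the
per-originator defaults, and the rule that only a strictly higher
priority interrupts.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical fallback priorities keyed by who submitted the task.
DEFAULT_PRIORITY_BY_ORIGINATOR = {
    "bibedit": 10,     # interactive edits should go immediately
    "bibreformat": 0,  # can wait in the queue
}


@dataclass
class Task:
    name: str
    originator: str
    priority: Optional[int] = None  # explicit priority, if the submitter gave one


def effective_priority(task):
    """Prefer the explicit priority; otherwise fall back on the originator."""
    if task.priority is not None:
        return task.priority
    return DEFAULT_PRIORITY_BY_ORIGINATOR.get(task.originator, 0)


def should_interrupt(new_task, running_task):
    """A new task interrupts only if its priority is strictly higher."""
    return effective_priority(new_task) > effective_priority(running_task)
```

With these assumed values, a small upload submitted with priority 5
would interrupt a big upload submitted with priority 1 by the same
cataloguer, while a bibupload coming from bibreformat would not
interrupt anything.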


-------------------------------------------------------
Date: 2008-04-24 07:12              By: Samuele Kaplun <skaplun>
At the same time, we could add the possibility of running tasks
concurrently. An abstract idea for this could be:
each running task should update not only its current status and progress, but
also an abstract "task runtime description".
E.g. bibindex, once started, can declare which indexes/records it is working
on (at the beginning this could be any index/any record, but after the first
analysis it can update its status definitively) and go to sleep. If nothing
else needs to run, bibsched wakes it up and lets it go. If another instance
of, e.g., bibindex happens to be working, but on a different set of
records/indexes, then the sleeping bibindex can be woken up and allowed to
work concurrently. If there is a bibupload correct instance that should run,
it can do the same: it discovers which records it is going to work on, and
if these records are not also being indexed/corrected etc. by other tasks,
it can be woken up.
Bibsched would just require a checker function that checks whether two tasks
can run concurrently and decides accordingly.
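
A rough sketch of such a checker function, under stated assumptions:
the RuntimeDescription class and its fields are hypothetical, None
stands for "any record" or "any index", and two tasks are taken to
conflict only when they may touch both a common record and a common
index.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class RuntimeDescription:
    # None means "not known yet", i.e. potentially any record / any index.
    records: Optional[FrozenSet[int]] = None
    indexes: Optional[FrozenSet[str]] = None


def _may_overlap(a, b):
    """Two sets may overlap if either is unknown or they share an element."""
    if a is None or b is None:
        return True
    return not a.isdisjoint(b)


def can_run_concurrently(d1, d2):
    """Allow concurrency unless both records and indexes may overlap."""
    return not (
        _may_overlap(d1.records, d2.records)
        and _may_overlap(d1.indexes, d2.indexes)
    )
```

A task that has not yet finished its first analysis keeps the default
description (any record, any index) and so conflicts with everything
until it narrows its description down.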





    _______________________________________________________

Carbon-Copy List:

CC Address                          | Comment
------------------------------------+-----------------------------
1576                                | -COM-
2195                                | -COM-
1574                                | -SUB-




==============================================================================

This item URL is:
  <http://savannah.cern.ch/task/?3195>

_______________________________________________
  Message sent via/by LCG Savannah
  http://savannah.cern.ch/
