What are the benefits of the current design?
I think one job is enough. Couldn't a single JobInstance handle this? After
preparing the data required by Spark, JobInstance could pass that data
directly to the relevant methods of the SparkSubmitJob class.
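Roughly, the consolidation I have in mind would look like this (the names
below are purely hypothetical, not Griffin code):

    public class SingleJobSketch {

        // One job does both steps instead of scheduling a second job.
        public void execute() {
            String preparedConfig = prepareDataForSpark(); // what JobInstance does today
            submitWhenReady(preparedConfig);               // what SparkSubmitJob does today
        }

        String prepareDataForSpark() {
            // resolve #YYYYMMdd#/#HH# placeholders and the like
            return "dt=20190305 AND hour=01";
        }

        void submitWhenReady(String preparedConfig) {
            // predicate check + Livy submission would go here
        }
    }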




On 03/5/2019 10:15, Kevin Yao <[email protected]> wrote:
Hi,
JobInstance and SparkSubmitJob both implement the Job interface.
JobInstance is mainly used to set the source and predicate partitions, that
is, to split the source data or predicate paths into several parts and get
each part's start timestamp. For example, when creating a measure, you
configure the *where* field in the *dt=#YYYYMMdd# AND hour=#HH#* format and
the predicate *path* field in the */dt=#YYYYMMdd#/hour=#HH#/_DONE* format.
After JobInstance is executed, the *where* value becomes *dt=20190305 AND
hour=01* and the *path* value becomes */dt=20190305/hour=00/_DONE* (the
values are just samples). In short, it converts some of the configuration in
the measure into data that is directly usable by the SparkSubmitJob
predicate check and the Spark calculation.
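To make that conversion concrete, here is a minimal sketch of the
placeholder substitution, assuming the job's trigger time is
2019-03-05 01:00; the class and method names are illustrative, not
Griffin's actual code:

    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class PlaceholderDemo {

        // Replace #YYYYMMdd#/#HH# tokens with values derived from the
        // job's trigger timestamp.
        static String resolve(String template, Date triggerTime) {
            String dt = new SimpleDateFormat("yyyyMMdd").format(triggerTime);
            String hour = new SimpleDateFormat("HH").format(triggerTime);
            return template.replace("#YYYYMMdd#", dt).replace("#HH#", hour);
        }

        public static void main(String[] args) {
            // In Griffin the timestamp comes from the scheduler trigger;
            // here we just use "now".
            Date triggerTime = new Date();
            System.out.println(resolve("dt=#YYYYMMdd# AND hour=#HH#", triggerTime));
            System.out.println(resolve("/dt=#YYYYMMdd#/hour=#HH#/_DONE", triggerTime));
        }
    }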

SparkSubmitJob is mainly used to run the predicate, i.e. to check whether
the data to be calculated is ready. If it is ready, the measure
configuration converted by JobInstance is submitted to Spark via Livy.
Otherwise, the predicate check is retried a certain number of times
according to your configuration (the default is 12 times).
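Again as an illustrative sketch (the names below are assumptions, not
Griffin's real API), the check-then-submit behavior can be pictured like
this:

    public class PredicateDemo {

        interface Predicator {
            // e.g. check that the _DONE file exists on HDFS
            boolean predicate();
        }

        static boolean checkAndSubmit(Predicator predicator, int maxAttempts) {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                if (predicator.predicate()) {
                    submitToLivy();
                    return true;
                }
                // In Griffin the next check is rescheduled at a configured
                // interval; a plain loop keeps this sketch short.
            }
            return false; // data never became ready within maxAttempts checks
        }

        static void submitToLivy() {
            // In Griffin this POSTs the converted measure config to Livy,
            // which launches the Spark job.
            System.out.println("submitting measure to Spark via Livy");
        }

        public static void main(String[] args) {
            checkAndSubmit(() -> true, 12); // 12 mirrors the default retry count
        }
    }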


Thanks,
Kevin


On Mon, Mar 4, 2019 at 11:31 AM 大鹏 <[email protected]> wrote:





I don't know the purpose of the two scheduling tasks in Griffin,
JobInstance and SparkSubmitJob. What is the connection between them?
