Hi,

JobInstance and SparkSubmitJob both implement the Job interface. JobInstance is mainly used to set up the source and predicate partitions, that is, it splits the source data or predicate paths into several parts and computes the start timestamp of each part. For example, when creating a measure, you configure the *where* field in the *dt=#YYYYMMdd# AND hour=#HH#* format and the predicate *path* field in the */dt=#YYYYMMdd#/hour=#HH#/_DONE* format. After JobInstance executes, the *where* value becomes *dt=20190305 AND hour=01* and the *path* value becomes */dt=20190305/hour=00/_DONE* (the values are just samples). In short, it converts part of the measure's configuration into directly usable values for SparkSubmitJob's predicate check and the Spark calculation.
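To make the conversion concrete, here is a minimal sketch of the placeholder substitution described above. The `fill` helper is purely illustrative (it is not Griffin's actual API); it assumes each `#...#` token is a date pattern to be rendered from the partition's start timestamp:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderDemo {

    // Hypothetical helper: replace every #pattern# token in the template
    // with the timestamp formatted by that pattern.
    public static String fill(String template, LocalDateTime ts) {
        Matcher m = Pattern.compile("#([^#]+)#").matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Griffin-style patterns use YYYY for the year; Java's
            // DateTimeFormatter expects yyyy, so normalize it here.
            String fmt = m.group(1).replace("YYYY", "yyyy");
            m.appendReplacement(sb, ts.format(DateTimeFormatter.ofPattern(fmt)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2019, 3, 5, 1, 0);
        // prints: dt=20190305 AND hour=01
        System.out.println(fill("dt=#YYYYMMdd# AND hour=#HH#", ts));
        // prints: /dt=20190305/hour=01/_DONE
        System.out.println(fill("/dt=#YYYYMMdd#/hour=#HH#/_DONE", ts));
    }
}
```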
SparkSubmitJob is mainly used to check, via the predicate, whether the data to be calculated is ready. If it is ready, the measure configuration converted by JobInstance is submitted to Spark through Livy. Otherwise, the predicate check is retried a configurable number of times (the default is 12).

Thanks,
Kevin

On Mon, Mar 4, 2019 at 11:31 AM 大鹏 <[email protected]> wrote:
> I don't know the purpose of the two scheduling tasks in Griffin:
> JobInstance and SparkSubmitJob, what is the connection between them?
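For illustration, the predicate-then-submit behaviour can be sketched as below. The `Predicate` interface and `checkAndSubmit` method are hypothetical names, not Griffin's actual classes, and in the real service the job is re-fired by the scheduler between attempts rather than looping in place:

```java
public class PredicateRetryDemo {

    // Illustrative stand-in for a data-readiness predicate (e.g. a _DONE file check).
    interface Predicate {
        boolean isReady();
    }

    // Sketch of the retry behaviour: check the predicate up to maxRetries times
    // and report true as soon as it passes (at which point the converted measure
    // configuration would be submitted to Spark via Livy).
    public static boolean checkAndSubmit(Predicate done, int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            if (done.isReady()) {
                return true; // data is ready: would submit the job now
            }
            // the real job waits for the scheduler to re-fire it here
        }
        return false; // gave up after maxRetries attempts (default 12)
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        boolean submitted = checkAndSubmit(() -> ++calls[0] >= 3, 12);
        // prints: true after 3 checks
        System.out.println(submitted + " after " + calls[0] + " checks");
    }
}
```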
