[ 
https://issues.apache.org/jira/browse/GRIFFIN-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wan kun updated GRIFFIN-259:
----------------------------
    Description: 
Each DQJob contains a sequence of DQ steps and runs them one by one 
(with the foldLeft function).

We can use multiple threads to run the steps that do not depend on each other.

For example:

In a DQBatchJob, an accuracyExpr produces four steps: *__missRecords*, 
*__missCount*, *__totalCount*, and *accu*.

The *__missCount* and *__totalCount* steps can run at the same time.

 

SeqDQStep then only needs to contain the root steps, not the steps they 
depend on.

If each step knows its dependency steps, we can run the step itself as soon 
as they are ready.

 

 
{code:java}
Running step : 
accu
| |---__missCount
| | |---__missRecords
| |---__totalCount

Running step : 
__missCount
| |---__missRecords

Running step : 
__missRecords

Running step : 
__totalCount
 
{code}
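The scheduling idea behind the tree above could be sketched roughly as follows. This is only an illustration in plain Java, not Griffin code: the `Step` class, the `schedule` method, and the two-thread pool are hypothetical names and choices made for the example, and each step body is reduced to recording its own name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelSteps {
    // Hypothetical stand-in for a DQ step: a name plus the steps it depends on.
    static final class Step {
        final String name;
        final List<Step> deps;
        Step(String name, Step... deps) {
            this.name = name;
            this.deps = Arrays.asList(deps);
        }
    }

    // Builds one future per step. A step is submitted to the pool only after
    // all of its dependency futures have completed; independent steps overlap.
    static CompletableFuture<Void> schedule(Step step,
                                            Map<String, CompletableFuture<Void>> scheduled,
                                            List<String> finished,
                                            Executor pool) {
        CompletableFuture<Void> existing = scheduled.get(step.name);
        if (existing != null) {
            return existing; // a shared dependency is scheduled only once
        }
        CompletableFuture<?>[] deps = step.deps.stream()
                .map(d -> schedule(d, scheduled, finished, pool))
                .toArray(CompletableFuture[]::new);
        // allOf() with no dependencies is already complete, so leaf steps start at once.
        CompletableFuture<Void> future = CompletableFuture.allOf(deps)
                .thenRunAsync(() -> finished.add(step.name), pool);
        scheduled.put(step.name, future);
        return future;
    }

    // Runs the four accuracy steps from the example and returns their completion order.
    static List<String> runAccuracyExample() {
        Step missRecords = new Step("__missRecords");
        Step missCount = new Step("__missCount", missRecords);
        Step totalCount = new Step("__totalCount");
        Step accu = new Step("accu", missCount, totalCount);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<String> finished = Collections.synchronizedList(new ArrayList<>());
        try {
            // Only the root step is handed over; its dependencies are discovered from it.
            schedule(accu, new HashMap<>(), finished, pool).join();
        } finally {
            pool.shutdown();
        }
        return finished;
    }

    public static void main(String[] args) {
        System.out.println(runAccuracyExample());
    }
}
```

In this sketch only the root step `accu` is passed in, which mirrors the point that SeqDQStep needs just the root steps. The relative order of `__totalCount` and the `__missRecords`/`__missCount` chain can differ between runs, but `accu` always finishes last.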
 


> Running measure transformStep with multiple threads
> ---------------------------------------------------
>
>                 Key: GRIFFIN-259
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-259
>             Project: Griffin
>          Issue Type: Improvement
>            Reporter: wan kun
>            Priority: Major
>
> Each DQJob contains a sequence of DQ steps and runs them one by one 
> (with the foldLeft function).
> We can use multiple threads to run the steps that do not depend on each 
> other.
> For example:
> In a DQBatchJob, an accuracyExpr produces four steps: *__missRecords*, 
> *__missCount*, *__totalCount*, and *accu*.
> The *__missCount* and *__totalCount* steps can run at the same time.
>  
> SeqDQStep then only needs to contain the root steps, not the steps they 
> depend on.
> If each step knows its dependency steps, we can run the step itself as soon 
> as they are ready.
>  
>  
> {code:java}
> Running step : 
> accu
> | |---__missCount
> | | |---__missRecords
> | |---__totalCount
> Running step : 
> __missCount
> | |---__missRecords
> Running step : 
> __missRecords
> Running step : 
> __totalCount
>  
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
