I chatted with Chathura.

We can use Spark to aggregate the data grouped by user and task-id and save
it to a SQL database. Then the UI can run a SQL query to fetch the data for
a specific task-id.
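
As a rough illustration of that flow, here is a minimal sketch using Python's
sqlite3 module as a stand-in for the actual SQL database; the table and column
names are hypothetical, and in practice the summarized rows would be written by
the Spark aggregation job rather than inserted by hand.

```python
import sqlite3

# In-memory DB stands in for the real RDBMS the Spark job writes to.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_summary (user TEXT, task_id TEXT, avg_duration REAL)"
)

# Pretend these summarized rows were produced by the Spark aggregation.
conn.executemany(
    "INSERT INTO task_summary VALUES (?, ?, ?)",
    [("alice", "t1", 12.5), ("bob", "t1", 8.0), ("alice", "t2", 3.2)],
)

# UI-side lookup: a parameterized query for one task-id.
rows = conn.execute(
    "SELECT user, avg_duration FROM task_summary "
    "WHERE task_id = ? ORDER BY user",
    ("t1",),
).fetchall()
print(rows)  # [('alice', 12.5), ('bob', 8.0)]
```

The point is that the expensive aggregation runs once in the batch job, while
the per-task lookup is a cheap indexed query the UI can issue on demand.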

Thanks
Srinath

On Thu, Oct 1, 2015 at 1:21 PM, Anjana Fernando <[email protected]> wrote:

> Hi Chathura,
>
> The only way you can pass a parameter to a query in a script is to use
> a UDF; the docs explain how to do this. But I'm wondering whether that
> would be the right approach here. Since these are scheduled batch
> scripts, they will most probably take some time to start executing and
> finish, so I'm not sure it's practical for a user to set these
> parameters from a UI. For example, it cannot be used in a dashboard,
> where results are expected quickly. You may also want to check out the
> indexing functionality: you can use a static query for the batch
> operation, and index the resulting summarized data as it is inserted,
> so you can quickly look it up by time range and so on. It is also
> possible to bypass Spark SQL altogether using the aggregate features of
> our indexing functionality.
>
> @Gimantha, is [1] the only documentation we have on the indexing
> aggregation features? If so, please update it to be more comprehensive.
> It would be better if we could show side-by-side solutions: how we do
> aggregates in SQL, and the comparable approach with our indexing
> features.
>
> [1]
> https://docs.wso2.com/display/DAS300/Retrieving+Aggregated+Values+of+Given+Records+via+JS+API
>
> Cheers,
> Anjana.
>
> On Wed, Sep 30, 2015 at 10:43 PM, Chathura Ekanayake <[email protected]>
> wrote:
>
>> The process monitoring graphs in [1] were proposed to give some level
>> of top-to-bottom analysis. For example, a business analyst may first
>> identify slow-performing processes using graph 2. He can then analyze
>> the bottleneck tasks of those slow processes using graph 10, which he
>> has to generate for each slow process. He can further analyze the
>> users who performed bottleneck tasks frequently by generating graph 11
>> for each slow task. Therefore, the ability to execute parameterized
>> queries is critical for these process monitoring features.
>>
>> a.) Is that possible on the DAS side?
>>>
>>> eg: SELECT processDefinitionId, COUNT(processInstanceId) AS
>>> processInstanceCount, AVG(duration) AS avgExecutionTime FROM
>>> BPMNProcessInstances WHERE date BETWEEN "fromDate" AND "toDate" GROUP
>>> BY processDefinitionId;
>>> (here fromDate and toDate are variables that need to be passed at
>>> runtime)
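
For what it's worth, a query like the one above can take fromDate and toDate
as bound parameters when executed through a SQL driver, rather than being
baked into a static script. A minimal sketch, with sqlite3 standing in for
the data store and made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BPMNProcessInstances ("
    "processDefinitionId TEXT, processInstanceId TEXT, "
    "duration REAL, date TEXT)"
)
conn.executemany(
    "INSERT INTO BPMNProcessInstances VALUES (?, ?, ?, ?)",
    [
        ("order", "i1", 10.0, "2015-09-01"),
        ("order", "i2", 20.0, "2015-09-15"),
        ("claim", "i3", 5.0, "2015-10-05"),  # falls outside the range below
    ],
)

# fromDate / toDate supplied at runtime as bound parameters.
from_date, to_date = "2015-09-01", "2015-09-30"
rows = conn.execute(
    "SELECT processDefinitionId, COUNT(processInstanceId), AVG(duration) "
    "FROM BPMNProcessInstances WHERE date BETWEEN ? AND ? "
    "GROUP BY processDefinitionId",
    (from_date, to_date),
).fetchall()
print(rows)  # [('order', 2, 15.0)]
```

This shows only the generic driver-level binding; whether DAS's Spark SQL
scripts expose an equivalent mechanism is exactly the question in this thread.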
>>>
>>> b.) If not, we can store the summarized data with the primary and
>>> secondary filters mentioned in [1] on DAS, and then fetch it through
>>> the DAS REST API by passing appropriate parameters.
>>>
>>
>> Isuru, I think approach (b) does not scale. There can be hundreds of
>> processes and thousands of tasks (across all processes), so it is not
>> practical to pre-compute data for all graphs.
>>
>> The ability to execute parameterized queries, or to provide queries
>> at runtime through an API, would help solve this problem.
>>
>> [1]
>> https://docs.google.com/a/wso2.com/spreadsheets/d/1pQAK6x4-rL-hQA7-NOaoT2llyjxv_nfc_vUarwUr74w/edit?usp=sharing
>>
>> Regards,
>> Chathura
>>
>
>
> --
> *Anjana Fernando*
> Senior Technical Lead
> WSO2 Inc. | http://wso2.com
> lean . enterprise . middleware
>



-- 
============================
Srinath Perera, Ph.D.
   http://people.apache.org/~hemapani/
   http://srinathsview.blogspot.com/
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
