I chatted with Chathura. We can use Spark to aggregate the data, grouped by user and task-id, and save it to a SQL DB. Then we can use a SQL query (called from the UI) to get the data for a specific task-id.
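[Editor's note: the summarize-then-query approach above can be sketched as follows. This is a minimal illustration using sqlite3 in place of the actual Spark job and RDBMS; the table and column names are hypothetical, not the DAS schema.]

```python
# Sketch of "aggregate in a batch job, then serve parameterized lookups from SQL".
# sqlite3 stands in for the RDBMS the Spark script would write to; all
# table/column names here are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_events (user TEXT, task_id TEXT, duration REAL)")
conn.executemany(
    "INSERT INTO task_events VALUES (?, ?, ?)",
    [("alice", "t1", 10.0), ("alice", "t1", 20.0),
     ("bob", "t1", 30.0), ("bob", "t2", 5.0)],
)

# Step 1: the scheduled batch job (Spark in the real setup) aggregates by
# user and task id with a static query, and persists the summary table.
conn.execute("""
    CREATE TABLE task_summary AS
    SELECT user, task_id, COUNT(*) AS n, AVG(duration) AS avg_duration
    FROM task_events GROUP BY user, task_id
""")

# Step 2: the UI issues a cheap parameterized query against the summary,
# binding the task id at runtime instead of baking it into the batch script.
rows = conn.execute(
    "SELECT user, n, avg_duration FROM task_summary WHERE task_id = ? ORDER BY user",
    ("t1",),
).fetchall()
print(rows)  # [('alice', 2, 15.0), ('bob', 1, 30.0)]
```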
Thanks,
Srinath

On Thu, Oct 1, 2015 at 1:21 PM, Anjana Fernando <[email protected]> wrote:

> Hi Chathura,
>
> The only way you can pass a parameter to a query as such in a script would
> be to use a UDF; the docs explain how to do this. But I'm wondering whether
> this would be appropriate here. Since these are scheduled batch scripts,
> they will most probably take some time to start executing and finish. So a
> user setting these parameters from the UI may not be practical; for
> example, it cannot be used in a dashboard, where results are expected
> quickly. You may also want to check out the indexing functionality, where
> you can most probably use a static query for the batch operation, and when
> inserting the resultant summarized data, index it so you can quickly look
> it up using time ranges and so on. There is also a possibility to bypass
> Spark SQL altogether using the aggregates features in our indexing
> functionality.
>
> @Gimantha, is [1] the only documentation we have on the indexing
> aggregation features? If so, please update it to be more comprehensive. It
> would be better to show side-by-side solutions: how we do aggregates in
> SQL, and the comparable approach using our indexing features.
>
> [1]
> https://docs.wso2.com/display/DAS300/Retrieving+Aggregated+Values+of+Given+Records+via+JS+API
>
> Cheers,
> Anjana.
>
> On Wed, Sep 30, 2015 at 10:43 PM, Chathura Ekanayake <[email protected]>
> wrote:
>
>> The process monitoring graphs in [1] were proposed to give some level of
>> top-to-bottom analysis. For example, a business analyst may first identify
>> slow-performing processes using graph number 2. Then he can analyze the
>> bottleneck tasks of those slow processes from graph number 10, where he
>> has to generate graph 10 for each slow process. Then he can further
>> analyze the users who performed bottleneck tasks frequently by generating
>> graph number 11 for each slow task.
>> Therefore, the ability to execute parameterized queries is critical for
>> these process monitoring features.
>>
>> a.) Is that possible on the DAS side?
>>>
>>> eg: SELECT processDefinitionId, COUNT(processInstanceId) AS
>>> processInstanceCount, AVG(duration) AS avgExecutionTime FROM
>>> BPMNProcessInstances WHERE date BETWEEN "fromDate" AND "toDate" GROUP
>>> BY processDefinitionId;
>>> (here fromDate and toDate are variables that need to be passed at
>>> runtime)
>>>
>>> b.) If not, we can store the summarized data with the primary and
>>> secondary filters mentioned in [1] on DAS, and then fetch it through
>>> the DAS REST API by passing appropriate parameters.
>>
>> Isuru, I think approach (b) does not scale. There can be hundreds of
>> processes and thousands of tasks (across all processes). Therefore, it is
>> not practical to pre-compute data for all graphs.
>>
>> The ability to execute parameterized queries, or to provide queries at
>> runtime through an API, would help solve this problem.
>>
>> [1]
>> https://docs.google.com/a/wso2.com/spreadsheets/d/1pQAK6x4-rL-hQA7-NOaoT2llyjxv_nfc_vUarwUr74w/edit?usp=sharing
>>
>> Regards,
>> Chathura
>
> --
> *Anjana Fernando*
> Senior Technical Lead
> WSO2 Inc. | http://wso2.com
> lean . enterprise . middleware

--
============================
Srinath Perera, Ph.D.
http://people.apache.org/~hemapani/
http://srinathsview.blogspot.com/
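[Editor's note: when the summarized data lives in an ordinary RDBMS, as proposed at the top of this thread, Chathura's query (a) can be parameterized with standard SQL bind parameters. The following is a minimal sqlite3 sketch with hypothetical data, not the DAS/Spark API.]

```python
# Sketch: query (a) with fromDate/toDate supplied at runtime via bind
# parameters, against an ordinary RDBMS (sqlite3 here; data is hypothetical).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE BPMNProcessInstances (
        processDefinitionId TEXT, processInstanceId TEXT,
        duration REAL, date TEXT)
""")
conn.executemany(
    "INSERT INTO BPMNProcessInstances VALUES (?, ?, ?, ?)",
    [
        ("claims", "i1", 120.0, "2015-09-01"),
        ("claims", "i2", 180.0, "2015-09-15"),
        ("orders", "i3", 60.0, "2015-10-02"),  # outside the queried range
    ],
)

# fromDate and toDate are bound at query time, so the same statement can
# serve any date range chosen in the UI.
rows = conn.execute(
    """
    SELECT processDefinitionId,
           COUNT(processInstanceId) AS processInstanceCount,
           AVG(duration) AS avgExecutionTime
    FROM BPMNProcessInstances
    WHERE date BETWEEN :fromDate AND :toDate
    GROUP BY processDefinitionId
    """,
    {"fromDate": "2015-09-01", "toDate": "2015-09-30"},
).fetchall()
print(rows)  # [('claims', 2, 150.0)]
```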
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
