Hi all,

With the new changes it is possible to do Aggregate function by grouping
according to this document[1].

It was worked well with the given scenarios. But we are facing new issues
with above API because we cannot do aggregate functions, grouping by
multiple Attribute. The next issue is that we cannot define other extra
attribute value we need with Aggregate values.

Let's consider this example,

say we have SQL statement like this,

Select api,Sum(requestCount) from MyTable group by api.

So the equivalent REST API request is,

{
 "tableName":"MyTable",
 "groupByField":"api",
 "aggregateFields":[
   {
     "fieldName":"requestCount",
     "aggregate":"SUM",
     "alias":"sum_req"
   }
 ]
}

Next consider this SQL query.

Select api,version,time,user,Sum(requestCount) from MyTable group by
api,version

Here we need extract attributes time and user. And we group by both api and
version attributes.
But the issue is we cannot request extra attribute time and user with
aggregated value. And not possible to group by multiple attributes.

I tried with  "groupByField":"api,version" but did not work.

So can you have a looking to the above issue and provide some suggestions?


[1]
https://docs.wso2.com/display/DAS300/Retrieving+Aggregated+Values+of+Given+Records+via+REST+API

Thanks and Regards.

On Wed, Aug 5, 2015 at 12:51 PM, Anjana Fernando <[email protected]> wrote:

> Hi Rukshan,
>
> Let's have a F2F chat on this and decide what we can do. We need to start
> from the requirements and go from there.
>
> Cheers,
> Anjana.
>
> On Tue, Aug 4, 2015 at 10:21 PM, Rukshan Premathunga <[email protected]>
> wrote:
>
>> Hi all,
>>
>> We are currently working on showing APIM stat data on dashboard, taking
>> summarised data directly from DAS using its REST API. In that case, we are
>> not depend on any other RDBMS to analyse our data. APIM send event stream
>> to DAS and sparkSQL script hosted on DAS run periodically to generate
>> Summarised data.We use DAS REST API to fetch these summarised data to APIM
>> dashboard.
>>
>> But we are facing some difficulties in Memory and Processing performance,
>>
>>    - APIM need to do some analytics on Summarised data. But DAS REST API
>>    is not capable to do such analytics.
>>    - To analyse data, all the records has to fetch from DAS to APIM.
>>    - But it could be a network waste and memory overflow.
>>    - APIM need extra processing time to analyse data and display on
>>    dashboard.
>>
>> Because of the above issues we cannot take the full advantage of the DAS
>> using its REST API. So that we are looking a mechanism to use DAS REST API
>> effectively, such that doing some Analytics on DAS table data using REST
>> API.
>>
>> Can you please suggest a solution or improvement to this scenario?
>>
>> Thanks and Regards.
>>
>>
>>
>> On Wed, Jul 29, 2015 at 1:34 PM, Rukshan Premathunga <[email protected]>
>> wrote:
>>
>>> Hi sll,
>>>
>>> @Gimantha Can you please provide an example or docs for calculate sum by
>>> groups using facet.
>>>
>>>
>>> Thanks and Regards.
>>>
>>> On Tue, Jul 28, 2015 at 4:52 PM, Rukshan Premathunga <[email protected]>
>>> wrote:
>>>
>>>> Hi Joseph,
>>>>
>>>>
>>>>    - In APIM store we need 4 types of statistical data. those are,
>>>>    - api usage per application
>>>>       - top users for application
>>>>       - API Usage from Resource Path
>>>>       - Faulty Invocations per Application
>>>>
>>>> each of the above scenarios stats are fetched using this query *SELECT
>>>> * FROM `API_REQUEST_SUMMARY` WHERE time BETWEEN "fromDate" AND "toDate";*
>>>>
>>>> Using DAS REST api we can fulfill this requirement.
>>>>
>>>>
>>>>    - APIM publisher there are 7 scenarios. 4 of them are similar to
>>>>    the above query and it is possible with the existing REST API. But below
>>>>    queries are not simple as above,
>>>>       - api usage - *SELECT API, API_VERSION, VERSION, APIPUBLISHER,
>>>>       USERID, SUM(TOTAL_REQUEST_COUNT) AS TOTAL_REQUEST_COUNT, CONTEXT FROM
>>>>       `API_REQUEST_SUMMARY` WHERE time BETWEEN "fromDate" AND "toDate" 
>>>> GROUP BY
>>>>       API, API_VERSION, USERID, VERSION, APIPUBLISHER, CONTEXT ORDER BY
>>>>       TOTAL_REQUEST_COUNT DESC;*
>>>>       - API Usage by Destination* :- SELECT
>>>>       
>>>> api,version,apiPublisher,context,destination,SUM(total_request_count) as
>>>>       total_request_count FROM `columnFamily` WHERE time BETWEEN 
>>>> "fromDate" AND
>>>>       "toDate" GROUP BY api,version,apiPublisher,context,destination;*
>>>>       - Faulty Invocations:-* SELECT
>>>>       api,version,apiPublisher,context,SUM(total_fault_count) as
>>>>       total_fault_count FROM  `columnFamily` WHERE time BETWEEN "fromDate" 
>>>> AND
>>>>       "toDate" GROUP BY api,version,apiPublisher,context;*
>>>>
>>>>
>>>>    - Above are the Queries currently using. Queries may get optimised
>>>>    in future and new stats are yet to added.
>>>>
>>>>
>>>> Thanks  and Regards.
>>>>
>>>>
>>>> On Tue, Jul 28, 2015 at 3:34 PM, Joseph Fonseka <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Rukshan
>>>>>
>>>>> Shall we create a list of scenarios of APIM statistical needs then we
>>>>> can cross check each of them with DAS if they can be implemented with the
>>>>> available functionality.
>>>>>
>>>>> Regards
>>>>> Jo
>>>>>
>>>>> On Tue, Jul 28, 2015 at 2:13 PM, Gimantha Bandara <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Searching and getting summation by grouping is possible. Note: Even
>>>>>> in grouping, if we consider your example,
>>>>>>
>>>>>> SELECT api,version,apiPublisher,context,SUM(total_fault_count) as
>>>>>> total_fault_count FROM  `table` WHERE time BETWEEN fromDate AND toDate
>>>>>> GROUP BY api,version,apiPublisher,context;
>>>>>>
>>>>>> api, version, apiPublisher, context is a multi-field grouping. So in
>>>>>> one  "api" value there can be several "versions".. so on. By simply using
>>>>>> your given SQL query, we can get all the records grouped by executing a
>>>>>> single query. Since facets are basically designed to implement drill-down
>>>>>> functionalities, you cannot get SUMs of all the groups at once. You will
>>>>>> first get SUM (total_fault_count) for all the "apis", then you drilldown
>>>>>> one more group down (lets say you selected  api called "api1", there you
>>>>>> will have several "versions" under "api1". Now you will see
>>>>>> SUM(total_fault_count) for all the "version" and so on. If  you want to 
>>>>>> get
>>>>>> the summation by groups for drilldown, then facets can be used. Facets
>>>>>> features cannot be used to get SUM(total_fault_count) for all the 
>>>>>> groupings
>>>>>> at once. You will have to call facet apis multiple times for that.
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 28, 2015 at 1:44 PM, Rukshan Premathunga <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Thanks for the feedbacks.
>>>>>>>
>>>>>>> We cannot say these queries are bounded. In further these queries
>>>>>>> are tend to be optimised by introducing more constraints.
>>>>>>> Also, is there any possibilities that searching and getting
>>>>>>> Summation by grouping?
>>>>>>>
>>>>>>> Thanks and Regards.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 28, 2015 at 1:24 PM, Sinthuja Ragendran <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Gimantha,
>>>>>>>>
>>>>>>>> Thanks for the clarification.
>>>>>>>>
>>>>>>>> AFAIU APIM spark queries will be anyhow executed and summarized
>>>>>>>> into DAS tables, and the above mentioned sample query is something 
>>>>>>>> going to
>>>>>>>> be executed against the summarized data dynamically according to user's
>>>>>>>> input. For example, if the use slides the time-range between some 
>>>>>>>> values in
>>>>>>>> APIM, the the gadgets in the APIM dashboard needs to be updated with 
>>>>>>>> SUM of
>>>>>>>> requests, AVG, etc by filtering the summarized data further for the 
>>>>>>>> given
>>>>>>>> time range. I don't think issuing an SparkSQL query dynamically for 
>>>>>>>> such
>>>>>>>> dynamic changes is viable solution.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sinthuja.
>>>>>>>>
>>>>>>>> On Tue, Jul 28, 2015 at 1:15 PM, Gimantha Bandara <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> [Adding Niranda]
>>>>>>>>>
>>>>>>>>> Hi Sinduja/Rukshan,
>>>>>>>>>
>>>>>>>>> Yes this can be achieved using facets, There are several facets
>>>>>>>>> related APIs. One is to get the Facet record count. Default behavior 
>>>>>>>>> is to
>>>>>>>>> return the number of matching records given the Facet array. We can
>>>>>>>>> override the default behavior by defining a score function. So each 
>>>>>>>>> record
>>>>>>>>> will represent the value of the score function. So id we query for the
>>>>>>>>> facet count, it will simply return the sum of score function values. 
>>>>>>>>> This
>>>>>>>>> behavior is similar to SUM aggregate. But for Aggregates like AVG we 
>>>>>>>>> need
>>>>>>>>> the current number of records to calculate the average. We will need 
>>>>>>>>> to
>>>>>>>>> make two REST calls ( to get the record count with default score 
>>>>>>>>> function
>>>>>>>>> "1" and to get the SUM another record count with the score function
>>>>>>>>> "aggregating field") since aggregates like AVG are not supported.
>>>>>>>>>
>>>>>>>>> As Gihan said, I think Spark SQL will more suitable for this
>>>>>>>>> scenario. It support basic aggregates and if a custom function 
>>>>>>>>> required, we
>>>>>>>>> can write a UDF.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> On Tue, Jul 28, 2015 at 11:27 AM, Sinthuja Ragendran <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> One of the main feature of DAS is being decoupled from underline
>>>>>>>>>> datastore, basically an user should be able to configure RDBMS, 
>>>>>>>>>> cassandra,
>>>>>>>>>> Hbase/HDFS datastore, the client applications nor analyzation 
>>>>>>>>>> scripts are
>>>>>>>>>> not necessarily need to change.
>>>>>>>>>>
>>>>>>>>>> In APIM 1.9.x and before, we stored the summarized data back into
>>>>>>>>>> RDBMS to be supported with both BAM and DAS. But when we are moving 
>>>>>>>>>> forward
>>>>>>>>>> with APIM, IMHO we need to optimize the design based on DAS.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 28, 2015 at 11:02 AM, Rukshan Premathunga <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> We hope to implement a REST API for WSO2 API manager to expose
>>>>>>>>>>> APIM statistic data. For that we evaluated WSO2 DAS REST API to 
>>>>>>>>>>> check
>>>>>>>>>>> whether we can use it.
>>>>>>>>>>> Here are the use cases of the APIM REST API and evaluation
>>>>>>>>>>> result of DAS REST API.
>>>>>>>>>>>
>>>>>>>>>>> *Motivation*:
>>>>>>>>>>>
>>>>>>>>>>> Currently we use DAS/BAM to generate summarised data and they
>>>>>>>>>>> are stored in an RDBMS.
>>>>>>>>>>> Next, APIM fetch data from the RDBMS and display it on the
>>>>>>>>>>> dashboard. APIM has implemented Client which fetch the statistic 
>>>>>>>>>>> data from
>>>>>>>>>>> the RDBMS, which is used by the UI to show statistics on the 
>>>>>>>>>>> dashboard.
>>>>>>>>>>>
>>>>>>>>>>> But we have a new requirement to providing these statistics for
>>>>>>>>>>> a bill generating application. Since application is separate, we 
>>>>>>>>>>> are facing
>>>>>>>>>>> problems of expose these statistic data to the application.
>>>>>>>>>>> As a solution we suggest implementing REST API, which can be
>>>>>>>>>>> used to support above two scenarios (i.e. Current APIM dashboard 
>>>>>>>>>>> and the
>>>>>>>>>>> billing application).
>>>>>>>>>>>
>>>>>>>>>>> Other advantages.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    - Implementing stat dashboard using REST API
>>>>>>>>>>>       - Client side data processing getting reduced
>>>>>>>>>>>       - Possibilities to expose statistic data to others
>>>>>>>>>>>       - Possible to implement appropriate security model
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *DAS REST API*
>>>>>>>>>>>
>>>>>>>>>>> DAS also contain the REST api where we can use it to expose
>>>>>>>>>>> analytics data of the DAS data analytics layer(DAL). Currently we 
>>>>>>>>>>> use DAS
>>>>>>>>>>> to generate summarised data of the APIM statistics.
>>>>>>>>>>>
>>>>>>>>>>> But summarised data are saved in RDBMS since it was easy to
>>>>>>>>>>> fetch data. But with DAS they have provide the REST API to access 
>>>>>>>>>>> DAL data.
>>>>>>>>>>> Because of this if we used DAS rest api, we can generate and
>>>>>>>>>>> store summarized data on DAS without using RDBMS.
>>>>>>>>>>>
>>>>>>>>>>> This will reduce the APIM configuration for statistics and
>>>>>>>>>>> central point of DATA storing and processing.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Other advantages.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    - Reduce cost to configure RDBMS
>>>>>>>>>>>       - APIM will depend only on DAS
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *Issues with DAS REST API*
>>>>>>>>>>>
>>>>>>>>>>> Even though DAS provide the REST API to fetch DAL data, query on
>>>>>>>>>>> DAL data is limited. It provide search mechanism with apache 
>>>>>>>>>>> Lucene, but it
>>>>>>>>>>> is insufficient for APIM requirements.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here are some query examples we used on mysql.
>>>>>>>>>>>
>>>>>>>>>>> SELECT api,version,apiPublisher,context,SUM(total_fault_count)
>>>>>>>>>>> as total_fault_count FROM  `table` WHERE time BETWEEN fromDate AND 
>>>>>>>>>>> toDate
>>>>>>>>>>> GROUP BY api,version,apiPublisher,context;
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here there is 3 main sub functions and queries
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    - Time between- searching possible with apache Lucene range
>>>>>>>>>>>       search[1]
>>>>>>>>>>>       - Sum - Not possible to calculate the summation
>>>>>>>>>>>       - Group by - impossible to grouping and make the
>>>>>>>>>>>       calculation like SUM, AVG
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> AFAIR you can use FACET type here which connects necssary fields
>>>>>>>>>> api,version,apiPublisher,context together, and then also you can 
>>>>>>>>>> pass a
>>>>>>>>>> function which is applied on the FACET search result. @Gimantha, 
>>>>>>>>>> please
>>>>>>>>>> confirm whether it's possible in DAS. - I know we can get the total 
>>>>>>>>>> rows
>>>>>>>>>> count of facet search, but I'm not very sure about passing the 
>>>>>>>>>> operation to
>>>>>>>>>> be done on the search result.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Sinthuja.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> With all of these advantages we hope to use to DAS
>>>>>>>>>>> REST API without implementing APIM own REST API. But to take the 
>>>>>>>>>>> advantages
>>>>>>>>>>> of the REST API we have to have a REST API with above queries 
>>>>>>>>>>> supported.
>>>>>>>>>>>
>>>>>>>>>>> Can you please provide a proper way to handle these scenarios?
>>>>>>>>>>> Any suggestions are appreciated.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Range
>>>>>>>>>>> Searches
>>>>>>>>>>>
>>>>>>>>>>> Thanks and Regards.
>>>>>>>>>>> --
>>>>>>>>>>> Rukshan Chathuranga.
>>>>>>>>>>> Software Engineer.
>>>>>>>>>>> WSO2, Inc.
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Architecture mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> *Sinthuja Rajendran*
>>>>>>>>>> Associate Technical Lead
>>>>>>>>>> WSO2, Inc.:http://wso2.com
>>>>>>>>>>
>>>>>>>>>> Blog: http://sinthu-rajan.blogspot.com/
>>>>>>>>>> Mobile: +94774273955
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Gimantha Bandara
>>>>>>>>> Software Engineer
>>>>>>>>> WSO2. Inc : http://wso2.com
>>>>>>>>> Mobile : +94714961919
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Sinthuja Rajendran*
>>>>>>>> Associate Technical Lead
>>>>>>>> WSO2, Inc.:http://wso2.com
>>>>>>>>
>>>>>>>> Blog: http://sinthu-rajan.blogspot.com/
>>>>>>>> Mobile: +94774273955
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Rukshan Chathuranga.
>>>>>>> Software Engineer.
>>>>>>> WSO2, Inc.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Gimantha Bandara
>>>>>> Software Engineer
>>>>>> WSO2. Inc : http://wso2.com
>>>>>> Mobile : +94714961919
>>>>>>
>>>>>> _______________________________________________
>>>>>> Architecture mailing list
>>>>>> [email protected]
>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> --
>>>>> *Joseph Fonseka*
>>>>> WSO2 Inc.; http://wso2.com
>>>>> lean.enterprise.middleware
>>>>>
>>>>> mobile: +94 772 512 430
>>>>> skype: jpfonseka
>>>>>
>>>>> * <http://lk.linkedin.com/in/rumeshbandara>*
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Rukshan Chathuranga.
>>>> Software Engineer.
>>>> WSO2, Inc.
>>>>
>>>
>>>
>>>
>>> --
>>> Rukshan Chathuranga.
>>> Software Engineer.
>>> WSO2, Inc.
>>>
>>
>>
>>
>> --
>> Rukshan Chathuranga.
>> Software Engineer.
>> WSO2, Inc.
>>
>
>
>
> --
> *Anjana Fernando*
> Senior Technical Lead
> WSO2 Inc. | http://wso2.com
> lean . enterprise . middleware
>



-- 
Rukshan Chathuranga.
Software Engineer.
WSO2, Inc.
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture

Reply via email to