Hi Gihan,

Yes, I am referring to the full incremental analytics we are planning.
What I mean is that we will have to extend the Data Layer to add a metadata
column to the table in order to hold metadata fields. Then we can add a
field like "processed_query1" to the table when an incremental query like
"query1" is added.
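
A minimal sketch of this per-query metadata column idea, using an in-memory
list of dicts as a stand-in for a Data Layer table. The function names
(add_incremental_query, run_incremental) and the row layout are hypothetical
and are not part of any DAS API:

```python
# Hypothetical in-memory stand-in for a Data Layer table; not a DAS API.
from typing import Callable, Dict, List

Row = Dict[str, object]


def add_incremental_query(table: List[Row], query_name: str) -> str:
    """Add a boolean metadata column for the query, defaulting to False."""
    column = f"processed_{query_name}"
    for row in table:
        row.setdefault(column, False)
    return column


def run_incremental(table: List[Row], query_name: str,
                    process: Callable[[Row], None]) -> int:
    """Process only rows not yet marked for this query, then mark them."""
    column = f"processed_{query_name}"
    handled = 0
    for row in table:
        if row.get(column, False):
            continue  # already seen by this query in an earlier run
        process(row)
        row[column] = True
        handled += 1
    return handled


table = [{"id": 1, "x": 10}, {"id": 2, "x": 20}]
add_incremental_query(table, "query1")
seen: List[object] = []
first = run_incremental(table, "query1", lambda r: seen.append(r["id"]))
table.append({"id": 3, "x": 30})  # a new record arrives between runs
second = run_incremental(table, "query1", lambda r: seen.append(r["id"]))
```

The second run only touches the record inserted after the first run, without
relying on timestamps. Note that the analytics-specific column lives inside
the data table itself.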

Actually, adding metadata columns to the same table violates the
independence of a Data Layer table from the analytics it is involved in.
So the other alternative is to add a separate Data Layer table for each
incremental query to keep the processed state (e.g. that table should have
the primary key columns of the real table plus a boolean field,
"processed"). But in this approach, when each record in the real table is
being processed, the processed flag of that record has to be checked in the
metadata table, which has n^2 complexity. WDYT?
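
A rough sketch of the separate-table alternative, under the assumption that
the state table can be looked up directly by primary key (modelled here with
a Python set). With a plain scan of the state table per record, the check
would instead cost time proportional to the table size for every record.
ProcessedState and run_incremental are hypothetical names, not DAS APIs:

```python
# Hypothetical sketch: per-query processed state kept outside the real table.
from typing import Callable, Dict, Hashable, List, Set

Row = Dict[str, object]


class ProcessedState:
    """Stand-in for the separate metadata table of one incremental query."""

    def __init__(self) -> None:
        self._done: Set[Hashable] = set()  # primary keys with processed=true

    def is_processed(self, key: Hashable) -> bool:
        return key in self._done  # keyed lookup, no scan of the state table

    def mark(self, key: Hashable) -> None:
        self._done.add(key)


def run_incremental(rows: List[Row], key_column: str, state: ProcessedState,
                    process: Callable[[Row], None]) -> int:
    """Process rows whose primary key is not yet recorded in the state."""
    handled = 0
    for row in rows:
        key = row[key_column]
        if state.is_processed(key):
            continue
        process(row)
        state.mark(key)
        handled += 1
    return handled


real_table = [{"id": 1, "x": 10}, {"id": 2, "x": 20}]
query1_state = ProcessedState()
out: List[object] = []
run_a = run_incremental(real_table, "id", query1_state,
                        lambda r: out.append(r["id"]))
real_table.append({"id": 3, "x": 30})
run_b = run_incremental(real_table, "id", query1_state,
                        lambda r: out.append(r["id"]))
```

Here the real table keeps only its own columns, and each incremental query
owns its own state object.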
Thanks.


*Maninda Edirisooriya*
Senior Software Engineer

*WSO2, Inc.* lean.enterprise.middleware.

*Blog* : http://maninda.blogspot.com/
*E-mail* : [email protected]
*Skype* : @manindae
*Twitter* : @maninda

On Wed, Jun 8, 2016 at 1:42 PM, Gihan Anuruddha <[email protected]> wrote:

> Hi Maninda,
>
> We have introduced some incremental data processing capabilities with the
> upcoming 3.1.0 release. Please note that this doesn't support fully
> functional data processing with data aggregation functionalities. Basically,
> what we have done is introduce a way to fetch data based on time windows,
> to avoid iterating over the same data set from the beginning again and
> again. To avoid data loss, we have introduced a buffer time period, and
> because of that some events may be returned by select queries more than
> once in consecutive analytics task executions. As a result, some aggregation
> operations like average can be wrong. We plan to introduce fully
> functional incremental data processing support in a future DAS release.
>
> Regards,
> Gihan
>
> On Wed, Jun 8, 2016 at 11:53 AM, Maninda Edirisooriya <[email protected]>
> wrote:
>
>> [Adding Architecture list]
>>
>> Hi all,
>>
>> The timestamp based approach for incremental processing is problematic: we
>> have gone through long discussions on it and could not come to an
>> acceptable solution. Instead, I think the following kind of approach would work.
>>
>> 1. For each incremental analytic script, a metadata column of type boolean,
>> named "processed" and with value "false", is added to the analytics table.
>> 2. When an incremental script is executed on a data row, that particular
>> row should get updated with *processed=true.*
>> 3. The next time the script is executed, it can skip all the rows with
>> field *processed=true.*
>>
>> This will avoid the timestamp restriction and buffer time issues and
>> allow parallel execution on records.
>> Thanks.
>>
>>
>> *Maninda Edirisooriya*
>> Senior Software Engineer
>>
>> *WSO2, Inc.* lean.enterprise.middleware.
>>
>> *Blog* : http://maninda.blogspot.com/
>> *E-mail* : [email protected]
>> *Skype* : @manindae
>> *Twitter* : @maninda
>>
>> On Wed, Jun 8, 2016 at 11:22 AM, Gihan Anuruddha <[email protected]> wrote:
>>
>>> Hi Guys,
>>>
>>> To fulfill the above requirement, we can add a query as below and make the
>>> necessary changes to the back-end.
>>>
>>> *create temporary table t5 using CarbonAnalytics options (tableName
>>> "t3", schema "x INT, y INT", incrementalParams "t5, -1");*
>>>
>>> Basically, we are passing -1 for the buffer time. In the backend, if the
>>> buffer is -1, we only take the last processed event timestamp and fetch the data.
>>>
>>> If we insert 3 records and do the commit when the buffer is -1, and then
>>> next time do the select without inserting any records, we do not get any
>>> result, since no new record was inserted after the saved timestamp.
>>>
>>> So what do you think about this implementation?
>>>
>>> Regards,
>>> Gihan
>>>
>>>
>>> --
>>> W.G. Gihan Anuruddha
>>> Senior Software Engineer | WSO2, Inc.
>>> M: +94772272595
>>>
>>
>>
>
>
> --
> W.G. Gihan Anuruddha
> Senior Software Engineer | WSO2, Inc.
> M: +94772272595
>
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
