[Adding Architecture list]

Hi all,

The timestamp-based approach to incremental processing is problematic: we
have had long discussions on it and could not arrive at an acceptable
solution. Instead, I think the following approach would work.

1. For each incremental analytics script, a metadata column named
"processed" of type boolean is added to the analytics table, with the
initial value "false".
2. When an incremental script is executed on a data row, that particular
row is updated with *processed=true*.
3. The next time the script is executed, it can skip all the rows with
*processed=true*.

This avoids the timestamp restriction and the buffer time issues, and
allows parallel execution on records.
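
As a rough illustration of the three steps above, here is a minimal Python
sketch. It models the table as an in-memory list and does not use the real
analytics API. Row, run_incremental and handle are hypothetical names, and a
single flag stands in for the per-script column, so please read it as an
outline of the idea under those assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Row:
    """One record in a hypothetical analytics table."""
    value: int
    processed: bool = False  # step 1: metadata column, initially false


def run_incremental(table: List[Row], handle: Callable[[Row], None]) -> int:
    """Handle every row not yet marked, then mark it. Returns rows handled."""
    handled = 0
    for row in table:
        if row.processed:
            continue  # step 3: skip rows handled by an earlier run
        handle(row)
        row.processed = True  # step 2: flag the row once it has been handled
        handled += 1
    return handled


table = [Row(1), Row(2), Row(3)]
seen: List[int] = []

first_run = run_incremental(table, lambda r: seen.append(r.value))
table.append(Row(4))  # a record that arrives after the first run
second_run = run_incremental(table, lambda r: seen.append(r.value))
third_run = run_incremental(table, lambda r: seen.append(r.value))
```

The second run only picks up the row added after the first run, and the
third run finds nothing to do, without comparing any timestamps.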
Thanks.


*Maninda Edirisooriya*
Senior Software Engineer

*WSO2, Inc.*
lean.enterprise.middleware.

*Blog* : http://maninda.blogspot.com/
*E-mail* : [email protected]
*Skype* : @manindae
*Twitter* : @maninda

On Wed, Jun 8, 2016 at 11:22 AM, Gihan Anuruddha <[email protected]> wrote:

> Hi Guys,
>
> To fulfill the above requirement, we can add a query as below and make
> the necessary changes to the back-end.
>
> *create temporary table t5 using CarbonAnalytics options (tableName "t3",
> schema "x INT, y INT", incrementalParams "t5, -1");*
>
> Basically, we are passing -1 for the buffer time. In the back-end, if the
> buffer is -1, we only take the last processed event timestamp and fetch
> the data after it.
>
> If we insert 3 records and do the commit when the buffer is -1, and then
> do the select the next time without inserting any records, we do not get
> any result, since no new record was inserted after the saved timestamp.
>
> So what do you think about this implementation?
>
> Regards,
> Gihan
>
>
> --
> W.G. Gihan Anuruddha
> Senior Software Engineer | WSO2, Inc.
> M: +94772272595
>
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
