Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-24 Thread Uvindra Dias Jayasinha
The flat file approach is also an option for us, thanks for suggesting it
Akila. At the moment we dont have a feature to make use of audit
information and it is not a priority for us to implement. So we will
revisit our options latter on if we do decide to implement this.



On 23 October 2016 at 09:51, Akila Ravihansa Perera 
wrote:

> Hi,
>
> What exactly is the purpose of Audit table or tables? Will those be used
> to query the history and display it to the user through the system? Or is
> it only for auditing purposes in which APIM will never directly query the
> data but a separate system or tool will use.
>
> If it is the latter case then why not simply use an auditing log file? We
> define a well structured format and keep appending audit events. It should
> be a flat structure. In production, logs are usually centralized (ELK,
> Splunk etc) so file-system errors won't matter.
>
> Thanks.
>
> On Sat, Oct 22, 2016 at 11:21 AM, Abimaran Kugathasan 
> wrote:
>
>>
>>
>> On Fri, Oct 21, 2016 at 12:10 PM, Bhathiya Jayasekara 
>> wrote:
>>
>>>
>>> On Wed, Oct 12, 2016 at 12:30 PM, Inosh Goonewardena 
>>> wrote:
>>>


 On Tue, Oct 11, 2016 at 2:40 PM, Uvindra Dias Jayasinha <
 uvin...@wso2.com> wrote:

> Thanks for the feedback, some interesting points were brought up
>
> @Abimaran, the problem with maintaining a rigid structure like old/new
> column is that if a user changes the value of 5 columns at a given time
> that would mean 5 different inserts to the table, when in actual fact it
> was a single transaction that took place when the user did the change and
> saved. So its better to use a implementation like
> google-diff-match-patch[1] to record the string diff between the values of
> the columns before the change took place and after the update. Though we
> dont need to worry about this implementation detail for now. The idea of
> using a single table to store the history of all tables that will require
> auditing sounds good.
>
> @Sanjeewa, yes this would improve performance when trying to retreive
> the LAST_UPDATED_TIME for a given entity.
>
> Let me elaborate a bit on Sanjeewa's point. So there can be only one
> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
> of the original entities schema. Having the LAST_UPDATED_TIME as part of
> the original entities schema gives a performance advantage on checking if 
> a
> given entity has been modified since it was last checked. This is vital 
> for
> features such as ETags support for the REST API. So CREATED_BY,
> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
> schema.
>
> We can still use the master audit table(building on Abimarans idea) to
> actually keep track of change history of a given entity, so that table
> could look like this,
>
> ENTRY_ID   PK
> TABLE_NAME VARCHAR
> ENTITY_ID  FK
> DIFF  BLOB
> ACTION*VARCHAR*
> ACTION_BY  *VARCHAR*
> ACTION_TIME   *TIMESTAMP*
>
>
 +1 for having single audit table and recording a diff value. Using a
 structure with old/new column could become unmanageable when the changes
 happen to multiple columns. Also we are planning to do audit table updates
 from our code right? Database level triggers can be used in such cases but
 IMO we should avoid using triggers since it could affect the performance.

>>>
>>> Another reason to avoid database triggers is that if we use triggers, we
>>> complemetely depend on trigger support in each database. So we may not be
>>> able to support all databases, and we'll have to spend a lot of time and
>>> effort to support auditing for different databases.
>>>
>>> Another important thing is that when we implement auditing from our
>>> code, we need to make sure to do it in async manner to avoid performance
>>> degradations.
>>>
>>
>> Normally auditing with trigger is used in financial system where auditing
>> is very important, but in our case we don't have much importance for
>> auditing, so we can avoid using triggers.
>>
>>
>>>
>>> Thanks,
>>> Bhathiya
>>>
>>>
>>>



>

> [1] https://code.google.com/p/google-diff-match-patch/
>
> On 11 October 2016 at 13:44, Sanjeewa Malalgoda 
> wrote:
>
>> I think we can manage audit table while still having CREATED_BY,
>> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
>> approach we may never need to do table scan of audit table while fetching
>> updates. So each updates will recorded in separate table while original
>> table having all relevant information. WDYT?
>>
>> Thanks,
>> sanjeewa.
>>
>> ___
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/ma

Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-22 Thread Akila Ravihansa Perera
Hi,

What exactly is the purpose of Audit table or tables? Will those be used to
query the history and display it to the user through the system? Or is it
only for auditing purposes in which APIM will never directly query the data
but a separate system or tool will use.

If it is the latter case then why not simply use an auditing log file? We
define a well structured format and keep appending audit events. It should
be a flat structure. In production, logs are usually centralized (ELK,
Splunk etc) so file-system errors won't matter.

Thanks.

On Sat, Oct 22, 2016 at 11:21 AM, Abimaran Kugathasan 
wrote:

>
>
> On Fri, Oct 21, 2016 at 12:10 PM, Bhathiya Jayasekara 
> wrote:
>
>>
>> On Wed, Oct 12, 2016 at 12:30 PM, Inosh Goonewardena 
>> wrote:
>>
>>>
>>>
>>> On Tue, Oct 11, 2016 at 2:40 PM, Uvindra Dias Jayasinha <
>>> uvin...@wso2.com> wrote:
>>>
 Thanks for the feedback, some interesting points were brought up

 @Abimaran, the problem with maintaining a rigid structure like old/new
 column is that if a user changes the value of 5 columns at a given time
 that would mean 5 different inserts to the table, when in actual fact it
 was a single transaction that took place when the user did the change and
 saved. So its better to use a implementation like
 google-diff-match-patch[1] to record the string diff between the values of
 the columns before the change took place and after the update. Though we
 dont need to worry about this implementation detail for now. The idea of
 using a single table to store the history of all tables that will require
 auditing sounds good.

 @Sanjeewa, yes this would improve performance when trying to retreive
 the LAST_UPDATED_TIME for a given entity.

 Let me elaborate a bit on Sanjeewa's point. So there can be only one
 CREATED_BY and CREATED_TIME for a given entity so that can remain as part
 of the original entities schema. Having the LAST_UPDATED_TIME as part of
 the original entities schema gives a performance advantage on checking if a
 given entity has been modified since it was last checked. This is vital for
 features such as ETags support for the REST API. So CREATED_BY,
 CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
 schema.

 We can still use the master audit table(building on Abimarans idea) to
 actually keep track of change history of a given entity, so that table
 could look like this,

 ENTRY_ID   PK
 TABLE_NAME VARCHAR
 ENTITY_ID  FK
 DIFF  BLOB
 ACTION*VARCHAR*
 ACTION_BY  *VARCHAR*
 ACTION_TIME   *TIMESTAMP*


>>> +1 for having single audit table and recording a diff value. Using a
>>> structure with old/new column could become unmanageable when the changes
>>> happen to multiple columns. Also we are planning to do audit table updates
>>> from our code right? Database level triggers can be used in such cases but
>>> IMO we should avoid using triggers since it could affect the performance.
>>>
>>
>> Another reason to avoid database triggers is that if we use triggers, we
>> complemetely depend on trigger support in each database. So we may not be
>> able to support all databases, and we'll have to spend a lot of time and
>> effort to support auditing for different databases.
>>
>> Another important thing is that when we implement auditing from our code,
>> we need to make sure to do it in async manner to avoid performance
>> degradations.
>>
>
> Normally auditing with trigger is used in financial system where auditing
> is very important, but in our case we don't have much importance for
> auditing, so we can avoid using triggers.
>
>
>>
>> Thanks,
>> Bhathiya
>>
>>
>>
>>>
>>>
>>>

>>>
 [1] https://code.google.com/p/google-diff-match-patch/

 On 11 October 2016 at 13:44, Sanjeewa Malalgoda 
 wrote:

> I think we can manage audit table while still having CREATED_BY,
> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
> approach we may never need to do table scan of audit table while fetching
> updates. So each updates will recorded in separate table while original
> table having all relevant information. WDYT?
>
> Thanks,
> sanjeewa.
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


 --
 Regards,
 Uvindra

 Mobile: 33962

 ___
 Architecture mailing list
 Architecture@wso2.org
 https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


>>>
>>>
>>> --
>>> Thanks & Regards,
>>>
>>> Inosh Goonewardena
>>> Associate Technical Lead- WSO2 Inc.
>>> Mobile: +94779966317
>>>
>>> __

Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-21 Thread Abimaran Kugathasan
On Fri, Oct 21, 2016 at 12:10 PM, Bhathiya Jayasekara 
wrote:

>
> On Wed, Oct 12, 2016 at 12:30 PM, Inosh Goonewardena 
> wrote:
>
>>
>>
>> On Tue, Oct 11, 2016 at 2:40 PM, Uvindra Dias Jayasinha > > wrote:
>>
>>> Thanks for the feedback, some interesting points were brought up
>>>
>>> @Abimaran, the problem with maintaining a rigid structure like old/new
>>> column is that if a user changes the value of 5 columns at a given time
>>> that would mean 5 different inserts to the table, when in actual fact it
>>> was a single transaction that took place when the user did the change and
>>> saved. So its better to use a implementation like
>>> google-diff-match-patch[1] to record the string diff between the values of
>>> the columns before the change took place and after the update. Though we
>>> dont need to worry about this implementation detail for now. The idea of
>>> using a single table to store the history of all tables that will require
>>> auditing sounds good.
>>>
>>> @Sanjeewa, yes this would improve performance when trying to retreive
>>> the LAST_UPDATED_TIME for a given entity.
>>>
>>> Let me elaborate a bit on Sanjeewa's point. So there can be only one
>>> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
>>> of the original entities schema. Having the LAST_UPDATED_TIME as part of
>>> the original entities schema gives a performance advantage on checking if a
>>> given entity has been modified since it was last checked. This is vital for
>>> features such as ETags support for the REST API. So CREATED_BY,
>>> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
>>> schema.
>>>
>>> We can still use the master audit table(building on Abimarans idea) to
>>> actually keep track of change history of a given entity, so that table
>>> could look like this,
>>>
>>> ENTRY_ID   PK
>>> TABLE_NAME VARCHAR
>>> ENTITY_ID  FK
>>> DIFF  BLOB
>>> ACTION*VARCHAR*
>>> ACTION_BY  *VARCHAR*
>>> ACTION_TIME   *TIMESTAMP*
>>>
>>>
>> +1 for having single audit table and recording a diff value. Using a
>> structure with old/new column could become unmanageable when the changes
>> happen to multiple columns. Also we are planning to do audit table updates
>> from our code right? Database level triggers can be used in such cases but
>> IMO we should avoid using triggers since it could affect the performance.
>>
>
> Another reason to avoid database triggers is that if we use triggers, we
> complemetely depend on trigger support in each database. So we may not be
> able to support all databases, and we'll have to spend a lot of time and
> effort to support auditing for different databases.
>
> Another important thing is that when we implement auditing from our code,
> we need to make sure to do it in async manner to avoid performance
> degradations.
>

Normally auditing with trigger is used in financial system where auditing
is very important, but in our case we don't have much importance for
auditing, so we can avoid using triggers.


>
> Thanks,
> Bhathiya
>
>
>
>>
>>
>>
>>>
>>
>>> [1] https://code.google.com/p/google-diff-match-patch/
>>>
>>> On 11 October 2016 at 13:44, Sanjeewa Malalgoda 
>>> wrote:
>>>
 I think we can manage audit table while still having CREATED_BY,
 CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
 approach we may never need to do table scan of audit table while fetching
 updates. So each updates will recorded in separate table while original
 table having all relevant information. WDYT?

 Thanks,
 sanjeewa.

 ___
 Architecture mailing list
 Architecture@wso2.org
 https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


>>>
>>>
>>> --
>>> Regards,
>>> Uvindra
>>>
>>> Mobile: 33962
>>>
>>> ___
>>> Architecture mailing list
>>> Architecture@wso2.org
>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>>
>> Inosh Goonewardena
>> Associate Technical Lead- WSO2 Inc.
>> Mobile: +94779966317
>>
>> ___
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> *Bhathiya Jayasekara*
> *Senior Software Engineer,*
> *WSO2 inc., http://wso2.com *
>
> *Phone: +94715478185 <%2B94715478185>*
> *LinkedIn: http://www.linkedin.com/in/bhathiyaj
> *
> *Twitter: https://twitter.com/bhathiyax *
> *Blog: http://movingaheadblog.blogspot.com
> *
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Thanks
Abimaran Kugathasan
Senior Software Engineer - 

Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-20 Thread Bhathiya Jayasekara
On Wed, Oct 12, 2016 at 12:30 PM, Inosh Goonewardena  wrote:

>
>
> On Tue, Oct 11, 2016 at 2:40 PM, Uvindra Dias Jayasinha 
> wrote:
>
>> Thanks for the feedback, some interesting points were brought up
>>
>> @Abimaran, the problem with maintaining a rigid structure like old/new
>> column is that if a user changes the value of 5 columns at a given time
>> that would mean 5 different inserts to the table, when in actual fact it
>> was a single transaction that took place when the user did the change and
>> saved. So its better to use a implementation like
>> google-diff-match-patch[1] to record the string diff between the values of
>> the columns before the change took place and after the update. Though we
>> dont need to worry about this implementation detail for now. The idea of
>> using a single table to store the history of all tables that will require
>> auditing sounds good.
>>
>> @Sanjeewa, yes this would improve performance when trying to retreive the
>> LAST_UPDATED_TIME for a given entity.
>>
>> Let me elaborate a bit on Sanjeewa's point. So there can be only one
>> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
>> of the original entities schema. Having the LAST_UPDATED_TIME as part of
>> the original entities schema gives a performance advantage on checking if a
>> given entity has been modified since it was last checked. This is vital for
>> features such as ETags support for the REST API. So CREATED_BY,
>> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
>> schema.
>>
>> We can still use the master audit table(building on Abimarans idea) to
>> actually keep track of change history of a given entity, so that table
>> could look like this,
>>
>> ENTRY_ID   PK
>> TABLE_NAME VARCHAR
>> ENTITY_ID  FK
>> DIFF  BLOB
>> ACTION*VARCHAR*
>> ACTION_BY  *VARCHAR*
>> ACTION_TIME   *TIMESTAMP*
>>
>>
> +1 for having single audit table and recording a diff value. Using a
> structure with old/new column could become unmanageable when the changes
> happen to multiple columns. Also we are planning to do audit table updates
> from our code right? Database level triggers can be used in such cases but
> IMO we should avoid using triggers since it could affect the performance.
>

Another reason to avoid database triggers is that if we use triggers, we
complemetely depend on trigger support in each database. So we may not be
able to support all databases, and we'll have to spend a lot of time and
effort to support auditing for different databases.

Another important thing is that when we implement auditing from our code,
we need to make sure to do it in async manner to avoid performance
degradations.

Thanks,
Bhathiya


>
>
>
>>
>
>> [1] https://code.google.com/p/google-diff-match-patch/
>>
>> On 11 October 2016 at 13:44, Sanjeewa Malalgoda 
>> wrote:
>>
>>> I think we can manage audit table while still having CREATED_BY,
>>> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
>>> approach we may never need to do table scan of audit table while fetching
>>> updates. So each updates will recorded in separate table while original
>>> table having all relevant information. WDYT?
>>>
>>> Thanks,
>>> sanjeewa.
>>>
>>> ___
>>> Architecture mailing list
>>> Architecture@wso2.org
>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>
>>>
>>
>>
>> --
>> Regards,
>> Uvindra
>>
>> Mobile: 33962
>>
>> ___
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> Thanks & Regards,
>
> Inosh Goonewardena
> Associate Technical Lead- WSO2 Inc.
> Mobile: +94779966317
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
*Bhathiya Jayasekara*
*Senior Software Engineer,*
*WSO2 inc., http://wso2.com *

*Phone: +94715478185*
*LinkedIn: http://www.linkedin.com/in/bhathiyaj
*
*Twitter: https://twitter.com/bhathiyax *
*Blog: http://movingaheadblog.blogspot.com
*
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-12 Thread Inosh Goonewardena
On Tue, Oct 11, 2016 at 2:40 PM, Uvindra Dias Jayasinha 
wrote:

> Thanks for the feedback, some interesting points were brought up
>
> @Abimaran, the problem with maintaining a rigid structure like old/new
> column is that if a user changes the value of 5 columns at a given time
> that would mean 5 different inserts to the table, when in actual fact it
> was a single transaction that took place when the user did the change and
> saved. So its better to use a implementation like
> google-diff-match-patch[1] to record the string diff between the values of
> the columns before the change took place and after the update. Though we
> dont need to worry about this implementation detail for now. The idea of
> using a single table to store the history of all tables that will require
> auditing sounds good.
>
> @Sanjeewa, yes this would improve performance when trying to retreive the
> LAST_UPDATED_TIME for a given entity.
>
> Let me elaborate a bit on Sanjeewa's point. So there can be only one
> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
> of the original entities schema. Having the LAST_UPDATED_TIME as part of
> the original entities schema gives a performance advantage on checking if a
> given entity has been modified since it was last checked. This is vital for
> features such as ETags support for the REST API. So CREATED_BY,
> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
> schema.
>
> We can still use the master audit table(building on Abimarans idea) to
> actually keep track of change history of a given entity, so that table
> could look like this,
>
> ENTRY_ID   PK
> TABLE_NAME VARCHAR
> ENTITY_ID  FK
> DIFF  BLOB
> ACTION*VARCHAR*
> ACTION_BY  *VARCHAR*
> ACTION_TIME   *TIMESTAMP*
>
>
+1 for having single audit table and recording a diff value. Using a
structure with old/new column could become unmanageable when the changes
happen to multiple columns. Also we are planning to do audit table updates
from our code right? Database level triggers can be used in such cases but
IMO we should avoid using triggers since it could affect the performance.


>

> [1] https://code.google.com/p/google-diff-match-patch/
>
> On 11 October 2016 at 13:44, Sanjeewa Malalgoda  wrote:
>
>> I think we can manage audit table while still having CREATED_BY,
>> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
>> approach we may never need to do table scan of audit table while fetching
>> updates. So each updates will recorded in separate table while original
>> table having all relevant information. WDYT?
>>
>> Thanks,
>> sanjeewa.
>>
>> ___
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> Regards,
> Uvindra
>
> Mobile: 33962
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Thanks & Regards,

Inosh Goonewardena
Associate Technical Lead- WSO2 Inc.
Mobile: +94779966317
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Uvindra Dias Jayasinha
On 12 October 2016 at 10:54, Lahiru Cooray  wrote:

>
>
> On Tue, Oct 11, 2016 at 1:44 PM, Sanjeewa Malalgoda 
> wrote:
>
>> I think we can manage audit table while still having CREATED_BY,
>> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
>> approach we may never need to do table scan of audit table while fetching
>> updates. So each updates will recorded in separate table while original
>> table having all relevant information. WDYT?
>>
>
> +1
> Also we may need the highlighted separate audit table to keep a track of
> deleted rows.
>

Hi Lahiru, Yes I highlighted the need for auditing entity deletions in my
initial mail.

>
>> Thanks,
>> sanjeewa.
>>
>> ___
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> *Lahiru Cooray*
> Software Engineer
> WSO2, Inc.;http://wso2.com/
> lean.enterprise.middleware
>
> Mobile: +94 715 654154
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Regards,
Uvindra

Mobile: 33962
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Uvindra Dias Jayasinha
Thanks for the feedback

On 12 October 2016 at 09:33, Abimaran Kugathasan  wrote:

>
>
> On Tue, Oct 11, 2016 at 10:34 PM, Lakmali Baminiwatta 
> wrote:
>
>>
>>
>> On 11 October 2016 at 14:40, Uvindra Dias Jayasinha 
>> wrote:
>>
>>> Thanks for the feedback, some interesting points were brought up
>>>
>>> @Abimaran, the problem with maintaining a rigid structure like old/new
>>> column is that if a user changes the value of 5 columns at a given time
>>> that would mean 5 different inserts to the table, when in actual fact it
>>> was a single transaction that took place when the user did the change and
>>> saved. So its better to use a implementation like
>>> google-diff-match-patch[1] to record the string diff between the values of
>>> the columns before the change took place and after the update. Though we
>>> dont need to worry about this implementation detail for now. The idea of
>>> using a single table to store the history of all tables that will require
>>> auditing sounds good.
>>>
>>
>> IMO we have to think bit more on whether we really need to store the diff
>> correspond to the changes since it may result in a considerable growth of
>> the storage mainly during the development phase.  Also the implementation
>> could be complicated in terms of what columns to be considered since the
>> changes may take place in more than a single table for an artifact update.
>>
>
> We dont need to worry too much about how we are going to implement
auditing at the moment since that is not a core requirement for the initial
DB design. We could provide a config for users to turn on auditing when the
system goes into production to avoid the growth during the development
stages(ideally an auditing feature is a requirement for a production
system). The actual implementation maybe complicated or simpler depending
on what we want to do. But any auditing implementation is better than no
auditing at all.

>
> Yes, also, one major problem with this approach is, getting diff between
> two db updates and later use the diff to check the history.
>
> And, if we are going to have seperate audit table for each Major tables,
> and how we are going to update that audit table when there is an update? If
> we copy the entire row when there is an update to a column, later we don't
> know for which column update, we have copy the old data to the audit table?
> If we copy only the old value of a column which is updated, will increase
> the audit table's number of entries.
>
> Having one audit table with table name and column name and the timestamps
> like below will solve these problems.
>
> ENTRY_ID   PK
> TABLE_NAME  VARCHAR
> FIELD_NAME   VARCHAR
> OLD_VALUEVARCHAR
> NEW_VALUE   VARCHAR
> ACTION_BY VARCHAR
> ACTION_TIME  VARCHAR
>

What is important is that we have the provision to implement this when we
actually want to, which is our initial goal. So there will be no actual
diff auditing to begin with. We will just record who did the update on a
given entity along with the time stamp like we do now. But to answer your
questions,

1. When we update an entity we replace the existing row with with the
updated values.

2. Most of the values wouldn't have changed, but since we have a single
update statement for APIs(We dont have separate update statements for
updating individual rows in the API table) we dont actually know which
columns have changed

3. So anyway we need to do a diff existing row and new data sent in the
update statement to figure out which columns have actually changed

We wont be having a separate audit table, just one. We dont need to copy
the entire row over either. If we get the diff we can store that only,
whcih should be smaller. We are not trying to replace DB backups :)

>
>
>> Thanks,
>> Lakmali
>>
>>>
>>> @Sanjeewa, yes this would improve performance when trying to retreive
>>> the LAST_UPDATED_TIME for a given entity.
>>>
>>> Let me elaborate a bit on Sanjeewa's point. So there can be only one
>>> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
>>> of the original entities schema. Having the LAST_UPDATED_TIME as part of
>>> the original entities schema gives a performance advantage on checking if a
>>> given entity has been modified since it was last checked. This is vital for
>>> features such as ETags support for the REST API. So CREATED_BY,
>>> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
>>> schema.
>>>
>>> We can still use the master audit table(building on Abimarans idea) to
>>> actually keep track of change history of a given entity, so that table
>>> could look like this,
>>>
>>> ENTRY_ID   PK
>>> TABLE_NAME VARCHAR
>>> ENTITY_ID  FK
>>> DIFF  BLOB
>>> ACTION*VARCHAR*
>>> ACTION_BY  *VARCHAR*
>>> ACTION_TIME   *TIMESTAMP*
>>>
>>>
>>>
>>> [1] https://code.google.com/p/google-diff-match-patch/
>>>
>>> On 11 October 2016 at 13:44, Sanjeewa Malalgoda 
>>> wrote:
>

Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Lahiru Cooray
On Tue, Oct 11, 2016 at 1:44 PM, Sanjeewa Malalgoda 
wrote:

> I think we can manage audit table while still having CREATED_BY,
> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
> approach we may never need to do table scan of audit table while fetching
> updates. So each updates will recorded in separate table while original
> table having all relevant information. WDYT?
>

+1
Also we may need the highlighted separate audit table to keep a track of
deleted rows.

>
> Thanks,
> sanjeewa.
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
*Lahiru Cooray*
Software Engineer
WSO2, Inc.;http://wso2.com/
lean.enterprise.middleware

Mobile: +94 715 654154
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Abimaran Kugathasan
On Tue, Oct 11, 2016 at 10:34 PM, Lakmali Baminiwatta 
wrote:

>
>
> On 11 October 2016 at 14:40, Uvindra Dias Jayasinha 
> wrote:
>
>> Thanks for the feedback, some interesting points were brought up
>>
>> @Abimaran, the problem with maintaining a rigid structure like old/new
>> column is that if a user changes the value of 5 columns at a given time
>> that would mean 5 different inserts to the table, when in actual fact it
>> was a single transaction that took place when the user did the change and
>> saved. So its better to use a implementation like
>> google-diff-match-patch[1] to record the string diff between the values of
>> the columns before the change took place and after the update. Though we
>> dont need to worry about this implementation detail for now. The idea of
>> using a single table to store the history of all tables that will require
>> auditing sounds good.
>>
>
> IMO we have to think bit more on whether we really need to store the diff
> correspond to the changes since it may result in a considerable growth of
> the storage mainly during the development phase.  Also the implementation
> could be complicated in terms of what columns to be considered since the
> changes may take place in more than a single table for an artifact update.
>


Yes, also, one major problem with this approach is, getting diff between
two db updates and later use the diff to check the history.

And, if we are going to have seperate audit table for each Major tables,
and how we are going to update that audit table when there is an update? If
we copy the entire row when there is an update to a column, later we don't
know for which column update, we have copy the old data to the audit table?
If we copy only the old value of a column which is updated, will increase
the audit table's number of entries.

Having one audit table with table name and column name and the timestamps
like below will solve these problems.

ENTRY_ID   PK
TABLE_NAME  VARCHAR
FIELD_NAME   VARCHAR
OLD_VALUEVARCHAR
NEW_VALUE   VARCHAR
ACTION_BY VARCHAR
ACTION_TIME  VARCHAR


> Thanks,
> Lakmali
>
>>
>> @Sanjeewa, yes this would improve performance when trying to retreive the
>> LAST_UPDATED_TIME for a given entity.
>>
>> Let me elaborate a bit on Sanjeewa's point. So there can be only one
>> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
>> of the original entities schema. Having the LAST_UPDATED_TIME as part of
>> the original entities schema gives a performance advantage on checking if a
>> given entity has been modified since it was last checked. This is vital for
>> features such as ETags support for the REST API. So CREATED_BY,
>> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
>> schema.
>>
>> We can still use the master audit table(building on Abimarans idea) to
>> actually keep track of change history of a given entity, so that table
>> could look like this,
>>
>> ENTRY_ID   PK
>> TABLE_NAME VARCHAR
>> ENTITY_ID  FK
>> DIFF  BLOB
>> ACTION*VARCHAR*
>> ACTION_BY  *VARCHAR*
>> ACTION_TIME   *TIMESTAMP*
>>
>>
>>
>> [1] https://code.google.com/p/google-diff-match-patch/
>>
>> On 11 October 2016 at 13:44, Sanjeewa Malalgoda 
>> wrote:
>>
>>> I think we can manage audit table while still having CREATED_BY,
>>> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
>>> approach we may never need to do table scan of audit table while fetching
>>> updates. So each updates will recorded in separate table while original
>>> table having all relevant information. WDYT?
>>>
>>> Thanks,
>>> sanjeewa.
>>>
>>> ___
>>> Architecture mailing list
>>> Architecture@wso2.org
>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>
>>>
>>
>>
>> --
>> Regards,
>> Uvindra
>>
>> Mobile: 33962
>>
>> ___
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> Lakmali Baminiwatta
> Associate Technical Lead
> WSO2, Inc.: http://wso2.com
> lean.enterprise.middleware
> mobile:  +94 71 2335936
> blog : lakmali.com
>
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Thanks
Abimaran Kugathasan
Senior Software Engineer - API Technologies

Email : abima...@wso2.com
Mobile : +94 773922820


  
  
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Lakmali Baminiwatta
On 11 October 2016 at 14:40, Uvindra Dias Jayasinha 
wrote:

> Thanks for the feedback, some interesting points were brought up
>
> @Abimaran, the problem with maintaining a rigid structure like old/new
> column is that if a user changes the value of 5 columns at a given time
> that would mean 5 different inserts to the table, when in actual fact it
> was a single transaction that took place when the user did the change and
> saved. So its better to use a implementation like
> google-diff-match-patch[1] to record the string diff between the values of
> the columns before the change took place and after the update. Though we
> dont need to worry about this implementation detail for now. The idea of
> using a single table to store the history of all tables that will require
> auditing sounds good.
>

IMO we have to think bit more on whether we really need to store the diff
correspond to the changes since it may result in a considerable growth of
the storage mainly during the development phase.  Also the implementation
could be complicated in terms of what columns to be considered since the
changes may take place in more than a single table for an artifact update.

Thanks,
Lakmali

>
> @Sanjeewa, yes this would improve performance when trying to retreive the
> LAST_UPDATED_TIME for a given entity.
>
> Let me elaborate a bit on Sanjeewa's point. So there can be only one
> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
> of the original entities schema. Having the LAST_UPDATED_TIME as part of
> the original entities schema gives a performance advantage on checking if a
> given entity has been modified since it was last checked. This is vital for
> features such as ETags support for the REST API. So CREATED_BY,
> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
> schema.
>
> We can still use the master audit table(building on Abimarans idea) to
> actually keep track of change history of a given entity, so that table
> could look like this,
>
> ENTRY_ID   PK
> TABLE_NAME VARCHAR
> ENTITY_ID  FK
> DIFF  BLOB
> ACTION*VARCHAR*
> ACTION_BY  *VARCHAR*
> ACTION_TIME   *TIMESTAMP*
>
>
>
> [1] https://code.google.com/p/google-diff-match-patch/
>
> On 11 October 2016 at 13:44, Sanjeewa Malalgoda  wrote:
>
>> I think we can manage audit table while still having CREATED_BY,
>> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
>> approach we may never need to do table scan of audit table while fetching
>> updates. So each updates will recorded in separate table while original
>> table having all relevant information. WDYT?
>>
>> Thanks,
>> sanjeewa.
>>
>> ___
>> Architecture mailing list
>> Architecture@wso2.org
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> Regards,
> Uvindra
>
> Mobile: 33962
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Lakmali Baminiwatta
Associate Technical Lead
WSO2, Inc.: http://wso2.com
lean.enterprise.middleware
mobile:  +94 71 2335936
blog : lakmali.com
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Uvindra Dias Jayasinha
Thanks for the feedback, some interesting points were brought up

@Abimaran, the problem with maintaining a rigid structure like old/new
column is that if a user changes the value of 5 columns at a given time
that would mean 5 different inserts to the table, when in actual fact it
was a single transaction that took place when the user did the change and
saved. So its better to use a implementation like
google-diff-match-patch[1] to record the string diff between the values of
the columns before the change took place and after the update. Though we
dont need to worry about this implementation detail for now. The idea of
using a single table to store the history of all tables that will require
auditing sounds good.

@Sanjeewa, yes this would improve performance when trying to retreive the
LAST_UPDATED_TIME for a given entity.

Let me elaborate a bit on Sanjeewa's point. So there can be only one
CREATED_BY and CREATED_TIME for a given entity so that can remain as part
of the original entities schema. Having the LAST_UPDATED_TIME as part of
the original entities schema gives a performance advantage on checking if a
given entity has been modified since it was last checked. This is vital for
features such as ETags support for the REST API. So CREATED_BY,
CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
schema.

We can still use the master audit table(building on Abimarans idea) to
actually keep track of change history of a given entity, so that table
could look like this,

ENTRY_ID   PK
TABLE_NAME VARCHAR
ENTITY_ID  FK
DIFF  BLOB
ACTION*VARCHAR*
ACTION_BY  *VARCHAR*
ACTION_TIME   *TIMESTAMP*



[1] https://code.google.com/p/google-diff-match-patch/

On 11 October 2016 at 13:44, Sanjeewa Malalgoda  wrote:

> I think we can manage audit table while still having CREATED_BY,
> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
> approach we may never need to do table scan of audit table while fetching
> updates. So each updates will recorded in separate table while original
> table having all relevant information. WDYT?
>
> Thanks,
> sanjeewa.
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Regards,
Uvindra

Mobile: 33962
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Sanjeewa Malalgoda
I think we can manage audit table while still having CREATED_BY,
CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
approach we may never need to do table scan of audit table while fetching
updates. So each updates will recorded in separate table while original
table having all relevant information. WDYT?

Thanks,
sanjeewa.
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture


Re: [Architecture] Review DB table audit fields for API Manager C5

2016-10-11 Thread Abimaran Kugathasan
Hi,

What about having one AUDIT table with below schema structure?

ENTRY_ID   PK
TABLE_NAME  VARCHAR
FIELD_NAME   VARCHAR
OLD_VALUEVARCHAR
NEW_VALUE   VARCHAR
ACTION_BY VARCHAR
ACTION_TIME  VARCHAR

It's rare, that, we have to update all the Columns in a single update.
Also, this will reduce the number of AUDIT tables in the system.

With this approach, we can easily find things like

   1. Which tables were updated recently
   2. Which tables have not been updated in the past year
   3. Which tables have never been updated
   4. All changes made by a specific user in a time period
   5. Most active tables in a time period



On Tue, Oct 11, 2016 at 11:50 AM, Uvindra Dias Jayasinha 
wrote:

> *Context*
> We have started to look into API Manager's DB design for C5 and want to
> evaluate what was done in the past and see if there is room for improvement.
>
> This is specifically to talk about the below audit columns,
>
> CREATED_BY*VARCHAR*
> CREATED_TIME*TIMESTAMP*
> UPDATED_BY*VARCHAR*
> UPDATED_TIME*TIMESTAMP*
>
>
> that are currently present in many of the tables such as AM_API,
> AM_APPLICATION, AM_SUBSCRIBER, AM_SUBSCRIPTION, etc.
>
> *Cons of current approach*
>
> 1. Since the columns are part of the entities definition, we can only
> store a single value for UPDATED_BY and UPDATED_TIME for a given entity.
> Currently we store only the latest values so we can actually keep track of
> history.
>
> 2. Cant keep track of when an entity is deleted(Maybe we need this, maybe
> we dont)
>
> 3. In order to truly be bale to audit we need to keep track of the changes
> that were done to a given entity along with who changed it and when, but
> this is not possible at the moment with our current design.
>
> *Pros of current approach*
>
> 1. Easy to manage data since audit columns are part of the entity
> itself(No need for a separate table to store history and no need to deal
> with growing historical audit data)
>
>
> *Possible alternative*
>
> So things are much more simple now and easier to manage but does it really
> achieve what it sets out to do? To keep track of auditing information we
> ideally need to keep track of every single action that was performed on a
> given entity(n number of actions for a given entity). So this requires a
> separate table definition. I'm *NOT* proposing that we define a separate
> audit table for every single entity in the DB, just for a few important
> ones such as AM_API. So for example for
> APIs we could have an audit table with columns like below,
>
> ENTRY_ID *INTEGER PRIMARY KEY*
> API_ID  *INTEGER*
> ACTION*VARCHAR*
> ACTION_BY  *VARCHAR*
> ACTION_TIME   *TIMESTAMP*
>
> *Cons of new approach*
>
> 1. Data could grow for a given entity and may need to be managed separately
>
> *Pros of new approach*
>
> 1.  Above design does not cater for keeping track of the actual change
> that took place. I left that out on purpose because we dont support this
> currently also, but having separate table means we have provision to add a
> diff column and implement this feature in the future.
>
> 2. We can keep track of all the changes that the entity went through, who
> did it and when(including deletions if we want to).
>
> I would like to here your thoughts on this. Is this worth exploring or
> should we just do things the way we have always done?
>
>
> --
> Regards,
> Uvindra
>
> Mobile: 33962
>
> ___
> Architecture mailing list
> Architecture@wso2.org
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Thanks
Abimaran Kugathasan
Senior Software Engineer - API Technologies

Email : abima...@wso2.com
Mobile : +94 773922820


  
  
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture