Re: [Architecture] Review DB table audit fields for API Manager C5

Uvindra Dias Jayasinha Tue, 11 Oct 2016 22:35:51 -0700

Thanks for the feedback

On 12 October 2016 at 09:33, Abimaran Kugathasan <[email protected]> wrote:


>
>
> On Tue, Oct 11, 2016 at 10:34 PM, Lakmali Baminiwatta <[email protected]>
> wrote:
>
>>
>>
>> On 11 October 2016 at 14:40, Uvindra Dias Jayasinha <[email protected]>
>> wrote:
>>
>>> Thanks for the feedback, some interesting points were brought up
>>>
>>> @Abimaran, the problem with maintaining a rigid structure like old/new
>>> column is that if a user changes the value of 5 columns at a given time
>>> that would mean 5 different inserts to the table, when in actual fact it
>>> was a single transaction that took place when the user did the change and
>>> saved. So its better to use a implementation like
>>> google-diff-match-patch[1] to record the string diff between the values of
>>> the columns before the change took place and after the update. Though we
>>> dont need to worry about this implementation detail for now. The idea of
>>> using a single table to store the history of all tables that will require
>>> auditing sounds good.
>>>
>>
>> IMO we have to think bit more on whether we really need to store the diff
>> correspond to the changes since it may result in a considerable growth of
>> the storage mainly during the development phase.  Also the implementation
>> could be complicated in terms of what columns to be considered since the
>> changes may take place in more than a single table for an artifact update.
>>
>
> We dont need to worry too much about how we are going to implement
auditing at the moment since that is not a core requirement for the initial
DB design. We could provide a config for users to turn on auditing when the
system goes into production to avoid the growth during the development
stages(ideally an auditing feature is a requirement for a production
system). The actual implementation maybe complicated or simpler depending
on what we want to do. But any auditing implementation is better than no
auditing at all.

>
> Yes, also, one major problem with this approach is, getting diff between
> two db updates and later use the diff to check the history.
>
> And, if we are going to have seperate audit table for each Major tables,
> and how we are going to update that audit table when there is an update? If
> we copy the entire row when there is an update to a column, later we don't
> know for which column update, we have copy the old data to the audit table?
> If we copy only the old value of a column which is updated, will increase
> the audit table's number of entries.
>
> Having one audit table with table name and column name and the timestamps
> like below will solve these problems.
>
> ENTRY_ID           PK
> TABLE_NAME      VARCHAR
> FIELD_NAME       VARCHAR
> OLD_VALUE        VARCHAR
> NEW_VALUE       VARCHAR
> ACTION_BY         VARCHAR
> ACTION_TIME      VARCHAR
>

What is important is that we have the provision to implement this when we
actually want to, which is our initial goal. So there will be no actual
diff auditing to begin with. We will just record who did the update on a
given entity along with the time stamp like we do now. But to answer your
questions,

1. When we update an entity we replace the existing row with with the
updated values.

2. Most of the values wouldn't have changed, but since we have a single
update statement for APIs(We dont have separate update statements for
updating individual rows in the API table) we dont actually know which
columns have changed

3. So anyway we need to do a diff existing row and new data sent in the
update statement to figure out which columns have actually changed

We wont be having a separate audit table, just one. We dont need to copy
the entire row over either. If we get the diff we can store that only,
whcih should be smaller. We are not trying to replace DB backups :)

>
>
>> Thanks,
>> Lakmali
>>
>>>
>>> @Sanjeewa, yes this would improve performance when trying to retreive
>>> the LAST_UPDATED_TIME for a given entity.
>>>
>>> Let me elaborate a bit on Sanjeewa's point. So there can be only one
>>> CREATED_BY and CREATED_TIME for a given entity so that can remain as part
>>> of the original entities schema. Having the LAST_UPDATED_TIME as part of
>>> the original entities schema gives a performance advantage on checking if a
>>> given entity has been modified since it was last checked. This is vital for
>>> features such as ETags support for the REST API. So CREATED_BY,
>>> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities
>>> schema.
>>>
>>> We can still use the master audit table(building on Abimarans idea) to
>>> actually keep track of change history of a given entity, so that table
>>> could look like this,
>>>
>>> ENTRY_ID           PK
>>> TABLE_NAME     VARCHAR
>>> ENTITY_ID          FK
>>> DIFF                  BLOB
>>> ACTION            *VARCHAR*
>>> ACTION_BY      *VARCHAR*
>>> ACTION_TIME   *TIMESTAMP*
>>>
>>>
>>>
>>> [1] https://code.google.com/p/google-diff-match-patch/
>>>
>>> On 11 October 2016 at 13:44, Sanjeewa Malalgoda <[email protected]>
>>> wrote:
>>>
>>>> I think we can manage audit table while still having CREATED_BY,
>>>> CREATED_TIME,UPDATED_BY, UPDATED_TIME  in same tables. So with that
>>>> approach we may never need to do table scan of audit table while fetching
>>>> updates. So each updates will recorded in separate table while original
>>>> table having all relevant information. WDYT?
>>>>
>>>> Thanks,
>>>> sanjeewa.
>>>>
>>>> _______________________________________________
>>>> Architecture mailing list
>>>> [email protected]
>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Uvindra
>>>
>>> Mobile: 777733962
>>>
>>> _______________________________________________
>>> Architecture mailing list
>>> [email protected]
>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>
>>>
>>
>>
>> --
>> Lakmali Baminiwatta
>> Associate Technical Lead
>> WSO2, Inc.: http://wso2.com
>> lean.enterprise.middleware
>> mobile:  +94 71 2335936
>> blog : lakmali.com
>>
>>
>> _______________________________________________
>> Architecture mailing list
>> [email protected]
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
> Thanks
> Abimaran Kugathasan
> Senior Software Engineer - API Technologies
>
> Email : [email protected]
> Mobile : +94 773922820
>
> <http://stackoverflow.com/users/515034>
> <http://lk.linkedin.com/in/abimaran>
> <http://www.lkabimaran.blogspot.com/>  <https://github.com/abimarank>
> <https://twitter.com/abimaran>
>
>
> _______________________________________________
> Architecture mailing list
> [email protected]
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
>


-- 
Regards,
Uvindra

Mobile: 777733962

_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture

Re: [Architecture] Review DB table audit fields for API Manager C5

Reply via email to