Thanks for the feedback On 12 October 2016 at 09:33, Abimaran Kugathasan <[email protected]> wrote:
> > > On Tue, Oct 11, 2016 at 10:34 PM, Lakmali Baminiwatta <[email protected]> > wrote: > >> >> >> On 11 October 2016 at 14:40, Uvindra Dias Jayasinha <[email protected]> >> wrote: >> >>> Thanks for the feedback, some interesting points were brought up >>> >>> @Abimaran, the problem with maintaining a rigid structure like old/new >>> column is that if a user changes the value of 5 columns at a given time >>> that would mean 5 different inserts to the table, when in actual fact it >>> was a single transaction that took place when the user did the change and >>> saved. So its better to use a implementation like >>> google-diff-match-patch[1] to record the string diff between the values of >>> the columns before the change took place and after the update. Though we >>> dont need to worry about this implementation detail for now. The idea of >>> using a single table to store the history of all tables that will require >>> auditing sounds good. >>> >> >> IMO we have to think bit more on whether we really need to store the diff >> correspond to the changes since it may result in a considerable growth of >> the storage mainly during the development phase. Also the implementation >> could be complicated in terms of what columns to be considered since the >> changes may take place in more than a single table for an artifact update. >> > > We dont need to worry too much about how we are going to implement auditing at the moment since that is not a core requirement for the initial DB design. We could provide a config for users to turn on auditing when the system goes into production to avoid the growth during the development stages(ideally an auditing feature is a requirement for a production system). The actual implementation maybe complicated or simpler depending on what we want to do. But any auditing implementation is better than no auditing at all. > > Yes, also, one major problem with this approach is, getting diff between > two db updates and later use the diff to check the history. > > And, if we are going to have seperate audit table for each Major tables, > and how we are going to update that audit table when there is an update? If > we copy the entire row when there is an update to a column, later we don't > know for which column update, we have copy the old data to the audit table? > If we copy only the old value of a column which is updated, will increase > the audit table's number of entries. > > Having one audit table with table name and column name and the timestamps > like below will solve these problems. > > ENTRY_ID PK > TABLE_NAME VARCHAR > FIELD_NAME VARCHAR > OLD_VALUE VARCHAR > NEW_VALUE VARCHAR > ACTION_BY VARCHAR > ACTION_TIME VARCHAR > What is important is that we have the provision to implement this when we actually want to, which is our initial goal. So there will be no actual diff auditing to begin with. We will just record who did the update on a given entity along with the time stamp like we do now. But to answer your questions, 1. When we update an entity we replace the existing row with with the updated values. 2. Most of the values wouldn't have changed, but since we have a single update statement for APIs(We dont have separate update statements for updating individual rows in the API table) we dont actually know which columns have changed 3. So anyway we need to do a diff existing row and new data sent in the update statement to figure out which columns have actually changed We wont be having a separate audit table, just one. We dont need to copy the entire row over either. If we get the diff we can store that only, whcih should be smaller. We are not trying to replace DB backups :) > > >> Thanks, >> Lakmali >> >>> >>> @Sanjeewa, yes this would improve performance when trying to retreive >>> the LAST_UPDATED_TIME for a given entity. >>> >>> Let me elaborate a bit on Sanjeewa's point. So there can be only one >>> CREATED_BY and CREATED_TIME for a given entity so that can remain as part >>> of the original entities schema. Having the LAST_UPDATED_TIME as part of >>> the original entities schema gives a performance advantage on checking if a >>> given entity has been modified since it was last checked. This is vital for >>> features such as ETags support for the REST API. So CREATED_BY, >>> CREATED_TIME, LAST_UPDATED_TIME can remain with the original entities >>> schema. >>> >>> We can still use the master audit table(building on Abimarans idea) to >>> actually keep track of change history of a given entity, so that table >>> could look like this, >>> >>> ENTRY_ID PK >>> TABLE_NAME VARCHAR >>> ENTITY_ID FK >>> DIFF BLOB >>> ACTION *VARCHAR* >>> ACTION_BY *VARCHAR* >>> ACTION_TIME *TIMESTAMP* >>> >>> >>> >>> [1] https://code.google.com/p/google-diff-match-patch/ >>> >>> On 11 October 2016 at 13:44, Sanjeewa Malalgoda <[email protected]> >>> wrote: >>> >>>> I think we can manage audit table while still having CREATED_BY, >>>> CREATED_TIME,UPDATED_BY, UPDATED_TIME in same tables. So with that >>>> approach we may never need to do table scan of audit table while fetching >>>> updates. So each updates will recorded in separate table while original >>>> table having all relevant information. WDYT? >>>> >>>> Thanks, >>>> sanjeewa. >>>> >>>> _______________________________________________ >>>> Architecture mailing list >>>> [email protected] >>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >>>> >>>> >>> >>> >>> -- >>> Regards, >>> Uvindra >>> >>> Mobile: 777733962 >>> >>> _______________________________________________ >>> Architecture mailing list >>> [email protected] >>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >>> >>> >> >> >> -- >> Lakmali Baminiwatta >> Associate Technical Lead >> WSO2, Inc.: http://wso2.com >> lean.enterprise.middleware >> mobile: +94 71 2335936 >> blog : lakmali.com >> >> >> _______________________________________________ >> Architecture mailing list >> [email protected] >> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >> >> > > > -- > Thanks > Abimaran Kugathasan > Senior Software Engineer - API Technologies > > Email : [email protected] > Mobile : +94 773922820 > > <http://stackoverflow.com/users/515034> > <http://lk.linkedin.com/in/abimaran> > <http://www.lkabimaran.blogspot.com/> <https://github.com/abimarank> > <https://twitter.com/abimaran> > > > _______________________________________________ > Architecture mailing list > [email protected] > https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture > > -- Regards, Uvindra Mobile: 777733962
_______________________________________________ Architecture mailing list [email protected] https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
