[Ledger-smb-devel] Managing database change: improvements needed

Erik Huelsmann Wed, 26 Aug 2015 05:23:22 -0700

Hi all,


There are a few issues that I have with the way we currently handle
database schema changes.

Current situation
============
We have a main schema file for the table definitions and a number of
modules with stored procedures grouped by "subject".
Next to that, there's a "Fixes" file, which contains all incremental
updates to the schema since some version.
When I want to change the schema as part of our development, I need to
change the schema definition files *and* I need to add the schema changes
to the Fixes.sql file.

Problems with the current approach
==========================
The current approach has (IMO) several problems:
* When loading the "Fixes.sql" file, even in a production environment,
produces a truckload of ERRORs, because the schema may contain some or all
of the fixes in the file, leading to aborted transactions
* Making adjustments in multiple places can lead to incompleteness on
either side
* Fixes.sql is loaded twice; before and after loading the stored
procedures, making it unsuitable for some kinds of fixes
* In order to change types or parameter lists of stored procedures, they
need to be dropped before being (re)created, resulting in DROP TYPE and
DROP FUNCTION all over the place.

Especially the high number of ERRORs in the log file when creating a new
database is disconcerting to new users. Myself, I'm running into the DROP
TYPE and DROP FUNCTION more and more often as my refactorings for 1.5 and
1.4-mc reach deeper into the system.

Additional requirements
=================
* Next to solving the problems above, some users have expressed the desire
to extend LedgerSMB locally. However, the way we currently upgrade the
system removes all customizations (at least from 1.2->1.3->1.4; maybe not
1.4->1.5). If our changed approach could help solve this requirement, all
the better.
* The solution chosen preferably works with a strategy of branching and
merging as we do in the development cycle.

Possible solutions
=============
One thing that I think we should do is separate development from production
when loading the database. That is to say: when creating a new database,
that is a different step than upgrading a production database which is in
turn again very different from working on a development database. Our
processes should work to provide the best for each.

Proposed solution to the duplicate change problem
=====================================
In the past we have been talking about adding a "build" step to LedgerSMB.
This step could be used to produce a single schema file for quickly setting
up a new database (company). This schema file would be a build artifact and
no longer be version controlled itself. It'd be the result of a full schema
creation step, including running the Fixes.sql file.
Additionally, this step could be used to deliver an up-to-date
doc/database/ directory with current database documentation.
While this step may feel inhibiting for development, one thing I'm thinking
is that we may not need to require this step to be executed on a
development system, except for when testing the production deployment.

Proposed solution to the slew of errors from Fixes.sql
=======================================
There are actually a number of solutions here, as I see it, all of them
revolving around the idea that every schema change should be applied once.
Basically, I see 2 categories of solutions:
1. Do it ourselves
2. Re-use the work of others

The benefit of (1) is that we get full control and no further dependencies
for development or production. However, the downside is that we get to do
all the thinking and problem solving as well.
In both categories I see similar solutions available:
a. Numbered changes
b. Named changes
and in (1) I see a solution that's not available in category (2):
c. Use PostgreSQL EXTENSIONs

As for category (2), I haven't looked too far around yet, but I did find
sqitch (http://sqitch.org/); Sqitch offers a command line tool for the
development workflow. Additionally, it provides facilities for deploying
only the necessary changes with releases *and* it provides an API which we
can use to upgrade the database from one (patch) release to another.

As for sqitch, I have no experience with it yet. It's supposed to work well
with branching and merging. One thing it *does* do is integrate with
version control tools and it means to integrate with the project's
repository. Ideally when merging and branching, no additional processing is
required to integrate the changes from branches. However, I'm not exactly
sure that's what happens (given
https://groups.google.com/forum/#!searchin/sqitch-users/merge/sqitch-users/GXqgt7nJ_1k/Vvg-r1HOEqMJ)
but I think the referenced link is about reordering changes which already
have been partially deployed. What I like about Sqitch is that it
integrates with the VC system, but is VC system agnostic, some of us can
use it with Git while others can keep using Hg-git as they currently do.

What I *am* sure about is that (1) will be completely change-oriented.
*Maybe* we can use sqitch with a the current module-based source code
organization.


If we're going to go with Sqitch, I don't think it's a good idea to switch
just then and there, but test it to build up some experience and then
choose a well defined point in time to start using it.

So, there are 2 proposals here. What do you say?


-- 
Bye,

Erik.

http://efficito.com -- Hosted accounting and ERP.
Robust and Flexible. No vendor lock-in.

------------------------------------------------------------------------------

_______________________________________________
Ledger-smb-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ledger-smb-devel

[Ledger-smb-devel] Managing database change: improvements needed

Reply via email to