[midgard-user] Case study: Midgard framework in action

martin langhoff Thu, 21 Nov 2002 15:15:07 -0800

A client of CWA New Media has recently launched a site that we
developed using Midgard as the underlying framework. The project
involved three companies, responsible for the back end, the front end
and hosting. These three teams, plus a sizable team put together by the
client, worked for over a year from prototype to launch.

The site is a high profile portal with an extensive search system that
is driven by a variant of Dublin Core metadata called NZGLS. An
additional search system is provided by Autonomy.

Our team was in charge of developing the front end, which included some
CMS functionality. We decided to use Midgard as a framework, and we also
provided a simplified CMS interface based on 'OldAdmin'. The team
involved a project manager, one art director/designer, one designer, one
html developer, one architect/programmer and four programmers. Not
everybody was working on it all the time, but for a good 5 months we had
a core team of 6 working full time on it.

Infrastructure
==============

After the initial pilot stage, which did not use Midgard at all, a lot of
time was spent in setting up infrastructure to support the project. The
core issues solved were:

- Install. The hosting is handled by a different company. This meant
we had to provide install scripts and instructions that were perfect to
avoid conflict with the hosting company. A makefile was put together
that would compile apache/php/midgard from source, configure them, load
our application, and configure it as well. It was later enhanced to
handle upgrades while preserving user data. We also identified the
package dependencies in detail for RedHat 7.2 and Debian 2.2 (Potato).

- Midgard version. While this project was being developed, Midgard was
in flux, as a new PHP version had been released, but the released
Midgard did not compile against it. The same makefile was used to
experiment with different Apache/PHP versions and compile flags against
Midgard nightly snapshots. Luckily (and thanks to Torben), an official
Midgard release happened before the project was over.

- Private sandboxes. With all the versioning issues, developers were
using different Apache/PHP/Midgard combinations. They had to be able to
compile their own set and run it, without having root privileges, and
inside their home directories. As part of the install process, we had to
patch some of the configure and shell scripts so that they (a) link to
the private libraries and not the system-wide libraries and (b) agree
to run as non-root. We could not fix (a) completely: if there was a
system-wide libmidgard.so, we could not prevent midgard-php4 and others
from linking to it. The solution was to remove the system-wide PHP and
Midgard installs from the development server.

- CVS integration. We developed a series of scripts to help integrate
repligard's exported XML files with CVS. In particular, on export we
order objects by their GUIDs and set their dates to 0; on import we
restore the dates to the current timestamp. Host objects also have their
name and port changed on export, and changed back before import.
Basically, things that are volatile or particular to an installation or
working copy are not allowed to go into CVS. As developers were working
on different platforms (Windows, Mac OS 9, Mac OS X, Linux) we also had
to fix newlines on export using mac2unix and dos2unix. This is
apparently addressed by YAMP as well.
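
The export clean-up described in the CVS integration item can be
sketched roughly as follows. This is an illustration in Python, not our
actual scripts: the attribute names ("guid", "created", "revised") and
the flat XML layout are assumptions, and repligard's real export format
may well differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute names; repligard's real export schema may differ.
GUID_ATTR = "guid"
DATE_ATTRS = ("created", "revised")


def normalize_newlines(text):
    """Convert DOS (CRLF) and classic Mac (CR) line endings to Unix (LF)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def normalize_export(xml_text):
    """Return a CVS-friendly export: LF newlines, dates zeroed, objects sorted by GUID."""
    root = ET.fromstring(normalize_newlines(xml_text))
    for obj in root:
        for attr in DATE_ATTRS:
            if attr in obj.attrib:
                # Dates are volatile, so they must not show up as diffs in CVS.
                obj.set(attr, "0")
    # A stable object order keeps diffs between two exports small.
    root[:] = sorted(root, key=lambda obj: obj.get(GUID_ATTR, ""))
    return ET.tostring(root, encoding="unicode")
```

Running every export through a step like this before commit means two
developers exporting the same content get the same file, whatever
platform they work on.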

Once this infrastructure was in place, we used it to support other
aspects of the project. A nightly release was exported from our CVS
repository, and a script on a public server would run the whole install
process. The client could either have the nightly release set up by the
hosting company or use it on the daily build machine.

The daily build flushed the previous day's database and removed the code
completely. We could have kept the builds running in parallel instead,
had we feared regressions.

As the project progressed, we added regression tests using Perl's
HTTP::WebTest and added the regression tests to the daily build process.

The nightly release/daily build added a Changelog that was easily
accessible, so all parties involved could monitor commits. We were also
running a shared Bugzilla, and commits would usually indicate bug
numbers, and provide links to the corresponding Bugzilla page.
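
Turning bug numbers in commit messages into links is easy to automate.
A minimal sketch in Python, assuming commit messages mention bugs as
"bug 123" or "bug #123"; the base URL is a placeholder, although
show_bug.cgi?id=N is the usual Bugzilla page for a single bug.

```python
import re

# Placeholder host; show_bug.cgi?id=N is Bugzilla's page for one bug.
BUGZILLA_URL = "http://bugzilla.example.com/show_bug.cgi?id="

# Matches "bug 123", "Bug #123" and similar, but not words like "debug 5".
BUG_REF = re.compile(r"\bbug\s*#?(\d+)", re.IGNORECASE)


def bug_links(commit_message):
    """Return a Bugzilla URL for every bug number mentioned in a commit message."""
    return [BUGZILLA_URL + number for number in BUG_REF.findall(commit_message)]
```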

Approach
========

We used Midgard as a framework to provide certain services, but a
significant part of the application was not based on Midgard, except for
its use of the templating system. Keeping track of critical code inside
Midgard Snippets and trying to edit it through a web-based interface was
considered risky and impractical.

Instead, we decided we would develop the code as regular files, taking
into account Midgard's approach of separating business logic
('code-init') from display logic ('content').

For each case where we decided to develop outside Midgard, we built a
set of three files, accessible directly through Apache with no Midgard
intervention. The files were called code-init.php, content.php and
wrapper.php. The wrapper file would (a) do anything we were doing in
code-global, like include some libraries, (b) include code-init.php (c)
include content.php.

This way, we separated the team working on non-Midgard application
logic from the team working on Midgard/CMS tasks. The ability to work in
their editors and debug a complex application without having to deal
with Midgard (a complex application on its own) was a major benefit.
Once both sides were mature, the integration was simple.

The team working on the application logic still drops back to their thin
wrappers to develop and fix. The team working on Midgard took
longer than expected to reach the maturity point where we decided to
integrate, but this had almost no negative impact on the project, as it
did not hold the other team back.

The team working on Midgard did so mostly using the web-based interfaces
"Old Admin" and Asgard. These tools forced advanced users, who are
efficient in their preferred editors, to work in impractical and
time-consuming <textarea> fields. We are now exploring using PHP Mole.

Using Old Admin, it is easy to understand why the coding style it is
written in is so terrible. Unfortunately, this coding style makes
patching it (and maintaining patches over time) a nightmare. I would
personally like to cost-justify applying PEAR standards to SpiderAdmin.

Other changes that would be welcome on that front have to do with not
requiring register_globals, and being clean of warnings.

Working with Midgard
====================

The setup for this project consists of two databases, one for the
'staging' environment and the other for the 'live' environment. The live
environment is 'read-only', as no data is tracked in the database at all.

Changes happen on the staging database and are replicated to live. We
only replicate to live articles that have been approved after they have
been edited (approved>edited). This is accomplished by running a Perl
script that fudges the repligard table before running the repligard
export. The repligard export is written to a temp file, and that file is
then imported into the live database. This is all run from cron, and
output goes to a log.
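
The approved>edited rule itself is simple. A sketch in Python, with
hypothetical field names and plain Unix timestamps; the real script
works on the repligard table rather than on objects like these.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical stand-in for an article row; field names are assumptions."""
    guid: str
    edited: int    # Unix timestamp of the last edit
    approved: int  # Unix timestamp of the last approval, 0 if never approved


def ready_for_live(articles):
    """Keep only articles whose latest approval is newer than their latest edit."""
    return [a for a in articles if a.approved > a.edited]
```

An article edited again after approval drops out of the selection until
it is re-approved, which is why a failed import is hard to replay later.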

We found that if the import fails for any reason, it is hard to re-run
it properly to restore the live DB. Object deletions can be replayed,
but objects that have been edited after the failed replication, and not
yet approved, won't be replicated, as that 'correct' version is no
longer in the database.

We also found that, due to a minor bug, object deletions were not being
replicated because their entry in the repligard table was wrong. A patch
was available on the mailing list and the bug tracker, and was applied
as part of the install process. This made it apparent that not many
Midgard sites are using staging/live replication.

For certain restricted groups of objects, namely articles belonging to
certain topics, we have set up Version Control using NemeinRCS. Editors
can see the article's history, see old versions and roll back.
Granularity of the VC is a key issue. We have implemented commits on
every approval, as opposed to commits on every form submission.

It has to be noted, though, that using RCS means your data storage is
now the MySQL database plus the RCS files. Replicating the database
(either with MySQL utilities, repligard or plain old cp) is no longer
enough. This has implications for backup/restore as well. Plan to
rethink and change your cronjobs to match the new situation.

In the end, the client has found that the customized interface we
provided for content management does not completely meet their needs,
and they have moved to importing data from an RSS source they manage.

One interesting detail is that we decided to avoid building URLs using
article IDs or names, as they were too volatile. Using GUIDs for linking
articles was initially regarded as unnecessarily paranoid. However,
halfway through the project the client indicated they would run several
installations transparently, switching between them (and/or
load-balancing) as required, and cross-replicating data as they saw fit.

Linking by GUID has saved the day more than once since then, even though
we use it only for articles. Topics and pages seem to be far less
volatile, and at any rate the client is aware that they will break links
by changing them.
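
The idea behind GUID linking can be sketched as a lookup that resolves a
stable GUID to whatever path the article currently has on a given
installation. This is Python for illustration only; the dictionary
layout, paths and function names are hypothetical and not how Midgard
stores or resolves articles.

```python
def build_guid_index(articles):
    """Map each article GUID to its current path on this installation."""
    return {article["guid"]: article["path"] for article in articles}


def article_url(guid, index, base_url=""):
    """Resolve a GUID to a URL, or return None if the article is unknown here."""
    path = index.get(guid)
    if path is None:
        return None
    return base_url + path
```

Because the GUID survives replication while IDs and names do not, a link
stored as a GUID keeps working on every installation that holds the
article.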

The website hosts content in English and Maori. Maori uses vowels with
macrons, only available in Unicode, so part of the challenge was
ensuring proper storage, retrieval, version control and display of data
containing non-ASCII Unicode characters.

One of Midgard's secret weapons seems to be its pan-European user base:
setting Midgard to 'Russian' mode actually meant UTF-8. After doing
that, we only had to ensure that _our_ calls to PHP's htmlentities() had
the correct parameters for UTF-8. Midgard's internal calls to
htmlentities() or its C equivalent were Unicode-clean. Repligard and
related tools worked perfectly, and CVS and RCS kept track of code and
data consistently.
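
What "Unicode-clean" escaping means can be shown with a small Python
analogue (not the PHP code we used): only the HTML-special characters
are escaped, and macron vowels pass through untouched and survive a
UTF-8 round trip. The function name is made up for the example.

```python
import html


def escape_for_html(text):
    """Escape only HTML-special characters, leaving non-ASCII characters as they are."""
    return html.escape(text, quote=True)


# A macron vowel survives both escaping and a UTF-8 encode/decode round trip.
sample = "M\u0101ori & English"
escaped = escape_for_html(sample)
round_tripped = escaped.encode("utf-8").decode("utf-8")
```

An escaping call that assumes a single-byte charset would instead treat
each byte of the two-byte UTF-8 sequence as a separate character and
corrupt the macron.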

Performance & Reliability
=========================

We have found performance and reliability to be excellent. Midgard is
based on a fast and reliable infrastructure, and does not disappoint. We
unfortunately do not have performance measurements, although we know
the client has commissioned aggressive stress tests on the setup.

Informal feedback indicates performance was good. We would certainly
know if it was not.

Conclusion
==========

Through the project, we have found Midgard to be a reliable platform for
development, and are now exploring new tools to make it more efficient
and practical. PHPMole, YAMP and MidCOM are on our list.

That we needed to do some infrastructure work up front points to a
framework that needs its tools to mature a bit. I am tempted to think
that it is a matter of a better, more comprehensive install approach,
as the tools themselves work without hitches.

As a development framework, it should be simple to install the whole
toolbox, or at least to identify the core tools. The Midgard site could
do a lot to help new people discover not only how to install Midgard,
but also "what is the toolbox that efficient Midgard users/teams use?"
and "how do the tools fit together?"

Having every single tool on a separate site doesn't help, as there are
relatively few cross-site links and almost no 'big picture' documents
showing how it all fits together.

And last, but not least, the community around Midgard is its greatest
asset. The likes of Emiliano, Torben, Henri, Piotras, Sergei and many
others make anything possible. Thanks for all the help, patience and code.

regards,

martin langhoff
