Hi!

I have been experimenting with my Midgard configuration and decided to
take on the "technology challenge" of upgrading my local development
copy of http://siriux.net/ to the newest packages. Now that everything
works, I would like to share my experience with the Midgard community.

Current configuration: 
- Apache 1.3.26
- PHP 4.1.2 (both Debian Woody)
- Midgard 1.5.0
- IPv4 connectivity
- ISO-8859-1 charset

The goal:
- Apache 2.0.48
- PHP 4.3.4
- Midgard 1.6.0alpha2 (without Multilang because of compatibility)
- IPv6
- UTF-8 charset


Upgrading Apache
----------------

The first major problem was that Debian Woody has no Apache 2 package,
so I moved to Debian Sarge, which is also quite stable but has newer
packages. I also tried to backport the necessary packages to Woody, but
this failed miserably (see below).

Since Sarge does not have PHP 4.3.4 (I guess it's 4.2.x), I backported
this one from Debian Unstable without any problems.

After this, I had Apache 2 (and thus IPv6) and PHP 4.3.4 running quite
nicely.


Compiling midgard-core
----------------------

The first Midgard package to build was "midgard-core" which provides
libmidgard and repligard.
After adjusting its "debian" control files a little bit (fixing
dependencies and adding "--without-multilang" to the configuration
flags) I was able to build the three packages "libmidgard",
"libmidgard-dev" and "repligard" out of it.


Compiling midgard-apache2
-------------------------

The second package was midgard-apache2. I did not try to build Debian
packages for it because the control files are really outdated.
After some fruitless attempts, I found out that the "--with-apxs" switch
of "configure" has no effect, so apxs could not be found (on Debian it
is called apxs2 instead of apxs).
Once compiled and installed, the module made Apache2 segfault every time
I started it.
After some investigation I found out three things:

- midgard-apache2 seems to need gcc > 3.2 to run. 
  With gcc-2.95 or 3.0 it compiles but segfaults on Apache2 start.
  (this one kicked Debian Woody out of the race after some frustrating
  hours)

- The whole Midgard configuration must not be defined in a virtual host.
  In my former configuration, everything from "MidgardEngine On" to
  "MidgardBlobDirectory" was between the <VirtualHost> tags.
  This makes midgard-apache2 segfault as well.
  It took me another few hours to find this out; moving everything into
  the global Apache2 config finally helped.

- Some directory and file locations are wrong.
  In particular, the midgard-root.php file is installed into the wrong
  directory and later looked for in yet another wrong directory.
  This should get fixed soon in the Makefile...
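
To check an existing setup against the second point, a small shell
sketch like the following can list Midgard directives that still sit
inside a <VirtualHost> block. The function name is made up for
illustration, and it assumes that all relevant directives start with
"Midgard" (as "MidgardEngine" and "MidgardBlobDirectory" do).

```shell
# Hypothetical helper: print every line starting with "Midgard" that sits
# between <VirtualHost ...> and </VirtualHost> in the given config files.
# Prints nothing if all such directives are in the global scope.
# Assumes the capitalization used above; Include'd files are not followed.
find_midgard_in_vhost() {
  awk '
    /^[ \t]*<VirtualHost/   { in_vhost = 1; next }
    /^[ \t]*<\/VirtualHost/ { in_vhost = 0; next }
    in_vhost && /^[ \t]*Midgard/ { print FILENAME ":" FNR ": " $0 }
  ' "$@"
}
```

It could be run as, for example,
"find_midgard_in_vhost /etc/apache2/apache2.conf /etc/apache2/sites-enabled/*"
(the usual Debian paths, adjust as needed). Any line it prints is a
candidate for moving into the global config.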


Compiling midgard-php4
----------------------

For this package, the "configure" file seems to be really broken. It did
not accept my "--with-apxs" switch, so I had to manually change "apxs"
to "apxs2" in the configure script. This may also be the reason for the
missing defines in the Makefile: I had to add "-DMIDGARD_APXS2" to the
DEFS line and "-I/usr/include/apr-0" to INCLUDES.
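
The two Makefile changes can also be scripted instead of edited by hand.
This is only a sketch: the helper name is made up, and it assumes the
generated Makefile has plain "DEFS = ..." and "INCLUDES = ..." lines.

```shell
# Hypothetical helper: append a flag to a "NAME = ..." line of a Makefile,
# unless the line already contains that flag.
#   $1 = Makefile path, $2 = variable name, $3 = flag to append
append_make_flag() {
  file=$1
  awk -v name="$2" -v flag="$3" '
    # Only touch the assignment line of the requested variable.
    $0 ~ ("^" name "[ \t]*=") && index($0, flag) == 0 { $0 = $0 " " flag }
    { print }
  ' "$file" > "$file.new" && mv "$file.new" "$file"
}
```

With this in place, the edits described above would be:

        append_make_flag Makefile DEFS -DMIDGARD_APXS2
        append_make_flag Makefile INCLUDES -I/usr/include/apr-0

Running it twice does no harm, since a flag that is already present is
not added again.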


Now I had a working Midgard 1.6.0 environment which worked quite nicely
with my old Midgard database (copy of siriux.net).


UTF-8 Encoding
--------------

The second most painful task (after midgard-apache2 and downloading and
installing Debian three times) was switching the database to UTF-8.

The Midgard site says that it's only a matter of using repligard to
export the database, switching the "MidgardParser" Apache configuration
value to "utf-8", adjusting the repligard config and importing the
database again.

For sites that are already PHP+UTF-8 ready, that may be true, but the
default Apache2 / PHP4 configuration isn't. Somewhere I found a quote
from one of the PHP authors saying "PHP is designed for latin1, not
UTF-8". How promising.
I had to add some more configuration options to my vhost config:

        AddDefaultCharset UTF-8 
        php_value default_charset UTF-8
        php_value mbstring.func_overload 7
        php_value mbstring.internal_encoding UTF-8
        php_value mbstring.detect_order UTF-8

The first line announces UTF-8 in the HTTP header, which lets browsers
automatically detect that they are receiving UTF-8. The second one
informs PHP about this. The "mbstring.func_overload" setting replaces
some PHP functions like "substr", "strlen", etc. with "mb_substr",
"mb_strlen", etc., which work with multibyte character sets (such as
UTF-8). The remaining two "mbstring" settings tell these functions to
assume UTF-8.
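
The value 7 for "mbstring.func_overload" is a bitmask: 1 overloads
mail(), 2 overloads the string functions and 4 overloads the regular
expression functions, so 7 enables all three groups. A small shell
sketch (the function name is made up) that decodes such a value:

```shell
# Hypothetical helper: list the function groups that a given
# mbstring.func_overload value switches to their mb_* counterparts.
describe_func_overload() {
  mask=$1
  [ $((mask & 1)) -ne 0 ] && echo "mail: mail() -> mb_send_mail()"
  [ $((mask & 2)) -ne 0 ] && echo "string: strlen(), substr(), ... -> mb_strlen(), mb_substr(), ..."
  [ $((mask & 4)) -ne 0 ] && echo "regex: ereg(), split(), ... -> mb_ereg(), mb_split(), ..."
  return 0
}

# The value used in the config above enables all three groups.
describe_func_overload 7
```

A site that only needs the string functions overloaded could use 2
instead of 7.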

With these changes, everything works without any problems. I'll use this
system as my local staging site and see if there are any problems (also
with converting, because the "live" site is still latin1/Apache1...).
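
One quick way to check that the charset settings took effect is to look
at the Content-Type header the server sends. The helper below is only a
sketch with a made-up name; it reads response headers on stdin and
prints the announced charset, if any.

```shell
# Hypothetical helper: print the charset announced in the Content-Type
# header of an HTTP response read from stdin (prints nothing if absent).
header_charset() {
  tr -d '\r' | sed -n 's/^[Cc]ontent-[Tt]ype:.*charset=\([^; ]*\).*/\1/p'
}
```

For example, "curl -sI http://localhost/ | header_charset" (the URL is a
placeholder for the staging site) should print UTF-8 once the settings
above are active.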

You can see the result at http://sirius.ipv6.siriux.net/ (however,
you'll need IPv6 ;-) and my server must be running, which I can't
guarantee).

So as you can see, Midgard *is* UTF-8 ready. I consider this a big
leap forward in terms of standardization and internationalization.


TODO
----

* See how MidCOM performs on this platform (I discovered some problem
with htmlentities, which is heavily used by MidCOM's HTML formatter...)
* See if foreign languages work with MidCOM (e.g. Russian / Finnish
special characters)

        Nico.

-- 
Nico Kaiser     :: 
[EMAIL PROTECTED] :: http://siriux.net/
