I have been using Bacula for over two years quite happily on an old  
Red Hat 9 server. The last version of Bacula that I used was a hand- 
compiled 2.0.0 with PostgreSQL 7.3.9.

This server is the data storage for my Mac OS X and Windows clients,
which it serves with Netatalk and Samba. So any given file can be  
accessed via Netatalk, Samba or directly as a Linux file. Filenames  
can be English or German (so may contain umlauts).

I decided to migrate the system to Fedora 8 and go with the RPMs in  
that distro (Bacula 2.0.3 and PostgreSQL 8.2.5). I had previously done  
upgrades from Bacula 1.3.x to 2.0.0, and I've migrated other  
PostgreSQL databases across release versions, so I thought I knew what  
I was doing. I dumped the Bacula database from PG 7.3.9 using the  
PostgreSQL facility in Webmin and restored it to a new UTF-8 database  
in PG 8.2.5.
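
In hindsight, a quick sanity check of the dump for bytes that will not
load into a UTF-8 database would have saved me the surprise described
below. Something along these lines would do it (a rough sketch in
Python, assuming a plain-text SQL dump; "bacula.dump" is just a
placeholder name):

# Report lines in a plain-text dump that are not valid UTF-8.
with open('bacula.dump', 'rb') as dump:        # placeholder filename
    for lineno, line in enumerate(dump, 1):
        try:
            line.decode('utf-8')
        except UnicodeDecodeError as err:
            print(lineno, err)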

I had to make a few minor alterations to my bacula-*.conf files (I  
built stuff under "/opt/bacula" and the Fedora distro has it under  
"/") and I was ready to go.

Given the time of year, the first task to throw at the new system was  
the annual full backup. This helped me to shake out the usual silly  
errors that one has, but it's also thrown me up against something  
which I don't know how to handle in Bacula.

I have some filenames which contain lowercase-a-umlaut or lowercase-u- 
umlaut which Netatalk has encoded in MacRoman (<8A> and <9F>  
respectively). PostgreSQL takes exception to these and Bacula  
generates  messages like the following:

04-Jan 15:31 hub-dir: Annual_Backup.2008-01-04_15.26.38 Fatal error:  
sql_create.c:870 sql_create.c:870 insert INSERT INTO Filename (Name)  
VALUES ('2004-04-29 Z<9F>rich 0002.jpeg') failed: ERROR:  invalid byte  
sequence for encoding "UTF8": 0x9f
HINT:  This error can also happen if the byte sequence does not match  
the encoding expected by the server, which is controlled by  
"client_encoding".

because those bytes fall into a range (<82>-<8C> and <90>-<9F>) that
ISO-8859-1 and Unicode reserve for control characters, and that can
never begin a valid UTF-8 sequence.
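
If I understand the encodings correctly, a couple of lines of Python
show exactly what PostgreSQL is objecting to (illustration only; the
byte values are the ones from my filenames):

name = b'2004-04-29 Z\x9frich 0002.jpeg'    # the raw bytes Netatalk stored

print(name.decode('mac_roman'))             # -> 2004-04-29 Zürich 0002.jpeg
try:
    name.decode('utf-8')
except UnicodeDecodeError as err:
    print(err)                              # invalid start byte 0x9f -- PostgreSQL's complaint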

These files are not new and I didn't have this problem with my old  
configuration, so something has presumably changed in Bacula or  
PostgreSQL.

The major issue for me is that these are fatal errors which cause the  
backup to fail. I understand from other postings around the net that  
Bacula can have no knowledge about the encoding of file names. But  
then, isn't using UTF-8 also wrong? Shouldn't it be plain binary bytes  
as far as Bacula is concerned? Is there a way of switching this  
feature off? Or marking the file somehow and only generating a warning?

Describing the problem for this posting has helped me to understand  
how to solve my problem, but I'll post it anyway because
        a) it seems to me that there is a real issue here and
        b) this may help other people.

The simple solution to the problem is to replace the a-umlaut and u- 
umlaut in the filenames with "ae" and "ue" respectively. That's the  
ancient 1970s-style solution to the problem, but it is offensive to the
German reader's eye. Clearly I am going to have to look at how  
Netatalk and Samba handle character encodings to get a good long-term  
fix. Dare I mention that I'm also introducing NFS into the LAN?
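
For the record, the brute-force rename can be scripted along these
lines (an untested sketch: "/path/to/share" is a placeholder, and it
assumes the only offending bytes on disk are the MacRoman a-umlaut
<8A> and u-umlaut <9F>):

import os

# Walk the share with bytes paths so we see the raw filename bytes.
REPLACEMENTS = {b'\x8a': b'ae',    # MacRoman a-umlaut
                b'\x9f': b'ue'}    # MacRoman u-umlaut

for dirpath, dirnames, filenames in os.walk(b'/path/to/share'):
    for old in filenames:
        new = old
        for byte, repl in REPLACEMENTS.items():
            new = new.replace(byte, repl)
        if new != old:
            os.rename(os.path.join(dirpath, old), os.path.join(dirpath, new))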

Happy New Year!

Steve
