Hi Craig,

On Thursday, 3 December 2009 06:35:23, Craig Ringer wrote:
> The director is responsible for managing all the metadata, and it's the
> component that connects to Pg.
> 
> If the fd sent the system charset along with the bundle of filenames etc
> that it sends to the director, then I don't see why the director
> couldn't `SET client_encoding' appropriately before inserting data from
> that fd, then `RESET client_encoding' once the batch insert was done.
> 
> The only downside is that if even one file has invalidly encoded data,
> the whole batch insert fails and is rolled back. For that reason, I'd
> personally prefer that the fd handle conversion so that it can exclude
> such files (with a loud complaint in the error log) or munge the file
> name into something that _can_ be stored.
> 
> Come to think of it, if the fd and database are both on a utf-8
> encoding, the fd should *still* validate the utf-8 filenames it reads.
> There's no guarantee that just because the system thinks the filename
> should be utf-8, it's actually valid utf-8, and it'd be good to catch
> this at the fd rather than messing up the batch insert by the director,
> thus making it much safer than it presently is to use Bacula with a
> utf-8 database.

+1!

Normally the fd's encoding matches the db encoding; conversion should be
the exceptional case.

If the fd converts such names and complains about the problem, that is far
less painful than a totally broken backup (a rolled-back batch insert) or
losing the file. Bacula could additionally mark the file in the db as
'converted name', so that a restore can also report the issue in its log,
including whether converting the name back, or converting it to a new
destination encoding, succeeded or was impossible.
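
To make the idea concrete, here is a minimal sketch in Python (not Bacula
code) of how an fd could validate a name and fall back to a storable,
back-convertible form. The function names and the %XX escaping scheme are
my own assumptions for illustration; any reversible scheme would do.

```python
import logging
from typing import Tuple


def storable_name(raw: bytes, fs_encoding: str = "utf-8") -> Tuple[str, bool]:
    """Return (name, converted) for a raw filename read from the filesystem.

    Hypothetical helper: valid names pass through unchanged; invalid ones
    are escaped so they can be stored and later converted back.
    """
    try:
        # Strict decoding doubles as the validity check.
        return raw.decode(fs_encoding, errors="strict"), False
    except UnicodeDecodeError:
        logging.warning("filename %r is not valid %s; storing converted name",
                        raw, fs_encoding)

    # surrogateescape maps each undecodable byte 0xNN to U+DCNN.
    text = raw.decode(fs_encoding, errors="surrogateescape")
    parts = []
    for ch in text:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            parts.append("%%%02X" % (code - 0xDC00))  # invalid byte -> %XX
        elif ch == "%":
            parts.append("%25")  # escape '%' itself so the result is reversible
        else:
            parts.append(ch)
    return "".join(parts), True


def original_bytes(name: str, converted: bool,
                   fs_encoding: str = "utf-8") -> bytes:
    """Reverse storable_name(), e.g. at restore time."""
    if not converted:
        return name.encode(fs_encoding)
    out = bytearray()
    i = 0
    while i < len(name):
        if name[i] == "%":
            out.append(int(name[i + 1:i + 3], 16))
            i += 3
        else:
            out += name[i].encode(fs_encoding)
            i += 1
    return bytes(out)
```

The 'converted' flag is what the db would have to remember, because an
escaped name cannot be told apart from a valid name that merely looks the
same.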

1. fact: the dir is able to know how the file/path table is encoded
2. fact: the fd is able to know how the filesystem should be encoded

So where is the concrete problem* in checking that a filename/path is
validly encoded and, if it is not, converting it into a storable and
hopefully back-convertible format? Such a converted filename could also be
stored in an additional table, with the original name in a blob field (or
anything else) and a foreign key to the file table; a simple outer join can
then return this additional information.
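
A rough sketch of the extra table and the outer join, using Python's
sqlite3 module only so that the example is self-contained. The table and
column names (file, converted_name, original_name) are made up for
illustration and are not Bacula's real catalog schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (
    file_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL              -- always valid in the db encoding
);
CREATE TABLE converted_name (
    file_id       INTEGER PRIMARY KEY REFERENCES file(file_id),
    original_name BLOB NOT NULL        -- raw bytes as found on the client
);
""")

# One ordinary file, and one whose name had to be converted before storing.
conn.execute("INSERT INTO file VALUES (1, 'notes.txt')")
conn.execute("INSERT INTO file VALUES (2, 'caf%E9.txt')")
conn.execute("INSERT INTO converted_name VALUES (2, ?)", (b"caf\xe9.txt",))

# The outer join returns NULL (None) for files that were never converted.
rows = conn.execute("""
    SELECT f.file_id, f.name, c.original_name
    FROM file AS f
    LEFT OUTER JOIN converted_name AS c ON c.file_id = f.file_id
    ORDER BY f.file_id
""").fetchall()

for file_id, name, original in rows:
    note = "converted from %r" % original if original is not None else "ok"
    print(file_id, name, note)
```

Since only the exceptional files get a row in the second table, the normal
case costs nothing extra in the file table itself.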

The encoding of the db and that of the fd's OS and/or filesystem should
match or at least be compatible. If a client's operating system or locally
found filesystems** use an encoding that generally does not match, the fd
can complain about it to the dir at startup. This can be a lot of work ...
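
The startup check itself could be as small as the following Python sketch.
It assumes the dir tells the fd the catalog encoding as a plain name; the
function names are hypothetical, and what Python reports as the filesystem
encoding is only the OS-level default, not a per-filesystem property.

```python
import codecs
import sys
from typing import Optional


def encodings_match(fs_encoding: str, db_encoding: str) -> bool:
    """Compare two encoding names after normalising aliases (UTF8, utf-8...)."""
    try:
        return codecs.lookup(fs_encoding).name == codecs.lookup(db_encoding).name
    except LookupError:
        # A name Python does not know is treated as a mismatch here.
        return False


def startup_warning(db_encoding: str,
                    fs_encoding: Optional[str] = None) -> Optional[str]:
    """Return a message for the dir if the encodings differ, else None."""
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    if encodings_match(fs_encoding, db_encoding):
        return None
    return ("fd filesystem encoding %s does not match catalog encoding %s; "
            "filenames may need conversion" % (fs_encoding, db_encoding))
```

For example, startup_warning("UTF8", fs_encoding="latin1") returns a
message, while startup_warning("UTF8", fs_encoding="utf-8") returns None.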

*) CPU cost may be a problem, so this option should be configurable

**) if the fd is able to check this at all; imho that is OS dependent.

just my 2 € cents 

regards 
    falk 


_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
