2005/12/18, John Siracusa wrote:
[..snip..]
> > An alternative approach is to define new layer of classes on top of
> > for e.g. R:D:O::Metadata::Column::Varchar/Char/Text where to perform
> > the UTF decoding and then to change the type-to-class mapping in
> > R:D:O::Metadata. All my derived classes will use the new meta data
> > object and they will have Unicode support.
>
> That approach will have the best performance, but it's also the most work.

You are right about the performance penalty of the trigger approach, but
after playing with the code I saw a benefit of triggers: the trigger is
executed only when I access the property for the very first time. This is
beneficial when I work with many object instances and don't access all the
UTF8 properties, because the UTF decoding then happens 'on-demand' and only
where it is needed. Another cool thing is that R:D:O doesn't execute the
trigger again on the next access.

> > So, my questions are:
> > 1. Am I doing the Right Thing by adding trigger to a column to perform
> > UTF8 conversion/inflation
>
> It's a valid approach, yes. One way to automate it would be to simply
> define your own custom metadata class and then override
> make_column_methods() or initialize() to apply your UTF-8 trigger to all
> your columns automatically. Example (untested code):
>
>     sub make_column_methods
>     {
>       my($self) = shift;
>
>       $self->SUPER::make_column_methods(@_);
>
>       foreach my $column ($self->columns)
>       {
>         next unless($column->type =~ /^(?:text|varchar|character)$/);
>
>         $column->add_trigger(inflate => sub
>         {
>           my $self = shift;
>           my $value = shift;
>           if(!Encode::is_utf8($value))
>           {
>             $value = Encode::decode_utf8($value);
>           }
>           return $value;
>         });
>       }
>
>       return; # return value is not significant
>     }
>

Actually I did something similar to your example -- I redefined
'add_columns' in my own meta class to add the trigger after defining each
column. I checked your example -- it also works. However, it is a mystery to
me why neither automation works when I use auto_initialize in my R:D:O
object. Both work fine when I manually specify the columns/keys/relations
(or copy and paste the output of perl_class_definition into the module).
Any ideas?

> > 3. Do think of some other approach to convert octets coming from
> > database into Unicode scalars
>
> If you just want to convert data as it comes from the database, you might
> want to use an on_load trigger instead of an inflate trigger.
> The trigger code would look slightly different:
>
>     sub make_column_methods
>     {
>       my($self) = shift;
>
>       $self->SUPER::make_column_methods(@_);
>
>       foreach my $column ($self->columns)
>       {
>         next unless($column->type =~ /^(?:text|varchar|character)$/);
>
>         my $get_method = $column->accessor_method_name;
>         my $set_method = $column->mutator_method_name;
>
>         $column->add_trigger(on_load => sub
>         {
>           my $self = shift;
>
>           # Triggers disabled within a trigger, so no infinite recursion here
>           my $value = $self->$get_method();
>
>           if(!Encode::is_utf8($value))
>           {
>             $self->$set_method(Encode::decode_utf8($value));
>           }
>           return; # return value is not significant
>         });
>       }
>
>       return; # return value is not significant
>     }

Sure, this works fine too (checked). I haven't decided yet which trigger
I'll use (on_load or inflate), because inflate can also fix octets coming
from other subsystems that don't understand UTF8 and just pass octets
through.

> > 2. Could someone show an example code how to extend
> > R:D:O::Metadata::Column::Varchar column type so it can inflate the
> > values loaded from DB.
>
> I'd actually like this functionality to be in the core distribution.
>
> Were you to do it on your own, the best approach would be to make your own
> trivial subclasses of the column classes, then point those column classes at
> your own custom method maker classes. Finally, make your own trivial
> metadata object subclass and map the appropriate column types to your new
> column classes.

I'm not sure I get this right. I understand that I need a custom meta class
in order to add my own type->class mapping. I'll create subclasses of the
text/varchar classes. The mapping will reuse the existing non-text classes
in RDBO and use my own text classes. What I don't understand is where the
method maker enters the game.
Sample code would be appreciated (or just ignore me :)

> There are "shorter" ways to do this, but the approach described above
> ensures that the default behavior remains unchanged for any RDBO-based
> classes that do not want your modified behavior.
>
> Anyway, like I said, I'd like to make this stuff built-in since it's a
> reasonably common task. I'm thinking of perhaps making some column classes
> like this:
>
>     Rose::DB::Object::Metadata::Column::Varchar::WithEncoding
>                                         Character
>                                         Text
>
> and then adding attributes for "check encoding" and "set encoding." Example
> usage:
>
>     __PACKAGE__->meta->columns
>     (
>       name =>
>       {
>         type   => 'varchar',
>         length => 255,
>         check_encoding => \&Encode::is_utf8,
>         set_encoding   => \&Encode::decode_utf8,
>       },
>       ...
>     );
>
> where the set_encoding function is called on the column value if the
> check_encoding function returns false when passed the current value. Then
> it'd be trivial to add ::UTF8 column variants that simply hard-code the
> check/set_encoding functions.
>
> Perhaps I could even "intelligently" substitute these classes if a
> text-based field has its "utf8" attribute set. Hmmm.
>
>     __PACKAGE__->meta->columns
>     (
>       name => { type => 'varchar', length => 255, utf8 => 1 },
>       ...
>     );

The new features described above fit my needs perfectly! However, I'm not
sure they belong in the core RDBO distro, since the whole UTF8 adventure is
rooted in the lack of UTF8 support in DBD::mysql. I think they should live
in a separate CPAN distro that deals with UTF8 incompatibilities. BTW, does
anyone have similar problems with Postgres/SQLite?

> Anyone have any suggestions for better approaches to this problem? Is there
> an even more generic way to handle encoding/decoding? Should these checks
> and operations be done on load only or on set as well?

I lean toward the trigger approach done on inflate because (as stated above):
(a) it is executed on demand
(b) it converts any broken octets which aren't supposed to be in a
    full-blown UTF8 app.

>
> -John

Thank you for the comprehensive and thorough answer! Keep up the good work!

- Svilen
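
P.S. The check-then-decode step that both quoted triggers share can be tried
on its own, outside of R:D:O. This is only a minimal sketch using the core
Encode module; the helper name decode_if_octets is my own invention for
illustration and is not part of R:D:O or Encode.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of R:D:O): decode UTF-8 octets into a
# Perl character string unless the value is already flagged as characters.
sub decode_if_octets
{
  my($value) = @_;

  return $value unless(defined $value);        # leave NULL columns alone
  return $value if(Encode::is_utf8($value));   # already decoded
  return Encode::decode_utf8($value);          # octets -> characters
}

my $octets = "\xC3\xA9";                 # UTF-8 encoding of U+00E9
my $chars  = decode_if_octets($octets);  # decoded into characters
my $again  = decode_if_octets($chars);   # second call returns it unchanged

printf "%d octets -> %d character(s)\n", length($octets), length($chars);
```

Because the helper skips values that are already character strings, calling
it twice is harmless, which is why the same check fits both an inflate
trigger and an on_load trigger.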