On 12/18/05 4:16 AM, Svilen Ivanov wrote:
> After reading the extensive documentation of R:D:O classes I added
> trigger on the columns that store chars/varchars/texts like that:
> 
> # the column 'name' is varchar
> __PACKAGE__->meta->column('name')->add_trigger(
>    inflate => sub {
>       my $self = shift;
>       my $value = shift;
>       if (!Encode::is_utf8($value))  {
>          $value = Encode::decode_utf8($value);
>       };
>       return $value;
>    });
> 
> This approach works just fine but the main disadvantage is that I have
> to put such code for each column and for each class I have or ever
> create.

It's also slightly slower than the other approach you mention, if that
matters to you.  Adding a trigger will always slightly slow down a column
accessor method, since it requires an extra check to see if triggers exist,
and at least one extra method call to run the original accessor code after
the trigger runs.  Removing all triggers removes all of this overhead,
including the trigger check.  RDBO has no triggers defined by default.

The performance difference is likely negligible when compared to the rest of
the things in your app (e.g., the actual database access! :) but I just
thought I'd mention it.

> An alternative approach is to define new layer of classes on top of
> for e.g. R:D:O::Metadata::Column::Varchar/Char/Text where to perform
> the UTF decoding and then to change the type-to-class mapping in
> R:D:O::Metadata. All my derived classes will use the new meta data
> object and they will have Unicode support.

That approach will have the best performance, but it's also the most work.

> So, my questions are:
> 1. Am I doing the Right Thing by adding trigger to a column to perform
> UTF8 conversion/inflation

It's a valid approach, yes.  One way to automate it would be to simply
define your own custom metadata class and then override
make_column_methods() or initialize() to apply your UTF-8 trigger to all
your columns automatically.  Example (untested code):

use Encode ();

sub make_column_methods
{
  my($self) = shift;

  $self->SUPER::make_column_methods(@_);

  foreach my $column ($self->columns)
  {
    # Only text-like columns need decoding
    next  unless($column->type =~ /^(?:text|varchar|character)$/);

    $column->add_trigger(inflate => sub
    {
      my $self = shift;
      my $value = shift;
      if(!Encode::is_utf8($value))
      {
        $value = Encode::decode_utf8($value);
      }
      return $value;
    });
  }

  return; # return value is not significant
}

> 3. Can you think of some other approach to convert octets coming from
> the database into Unicode scalars?

If you just want to convert data as it comes from the database, you might
want to use an on_load trigger instead of an inflate trigger.  The trigger
code would look slightly different:

sub make_column_methods
{
  my($self) = shift;

  $self->SUPER::make_column_methods(@_);

  foreach my $column ($self->columns)
  {
    # Only text-like columns need decoding
    next  unless($column->type =~ /^(?:text|varchar|character)$/);

    my $get_method = $column->accessor_method_name;
    my $set_method = $column->mutator_method_name;

    $column->add_trigger(on_load => sub
    {
      my $self = shift;

      # Triggers are disabled within a trigger, so no infinite recursion here
      my $value = $self->$get_method();

      if(!Encode::is_utf8($value))
      {
        $self->$set_method(Encode::decode_utf8($value));
      }
      return; # return value is not significant
    });
  }

  return; # return value is not significant
}

> 2. Could someone show an example code how to extend
> R:D:O::Metadata::Column::Varchar column type so it can inflate the
> values loaded from DB.

I'd actually like this functionality to be in the core distribution.

Were you to do it on your own, the best approach would be to make your own
trivial subclasses of the column classes, then point those column classes at
your own custom method maker classes.  Finally, make your own trivial
metadata object subclass and map the appropriate column types to your new
column classes.

There are "shorter" ways to do this, but the approach described above
ensures that the default behavior remains unchanged for any RDBO-based
classes that do not want your modified behavior.

Anyway, like I said, I'd like to make this stuff built-in since it's a
reasonably common task.  I'm thinking of perhaps making some column classes
like this:

    Rose::DB::Object::Metadata::Column::Varchar::WithEncoding
                                        Character
                                        Text

and then adding attributes for "check encoding" and "set encoding."  Example
usage:

    __PACKAGE__->meta->columns
    (
      name => 
      {
        type   => 'varchar',
        length => 255,
        check_encoding => \&Encode::is_utf8,
        set_encoding   => \&Encode::decode_utf8,
      },
      ...
    );

where the set_encoding function is called on the column value if the
check_encoding function returns false when passed the current value.  Then
it'd be trivial to add ::UTF8 column variants that simply hard-code the
check/set_encoding functions.
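
To make the proposed check/set semantics concrete, here is a minimal standalone sketch that uses only the core Encode module.  The ensure_encoding() helper is hypothetical and is not part of RDBO; it only illustrates "call the set function when the check function returns false."

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of RDBO): apply $set to $value only when
# $check reports that the value is not yet in the desired form.
sub ensure_encoding
{
  my($value, $check, $set) = @_;

  return $value  unless(defined $value);
  return $check->($value) ? $value : $set->($value);
}

# "caf" followed by the two-octet UTF-8 sequence for e-acute: 5 octets
my $octets = "caf\xc3\xa9";

my $chars = ensure_encoding($octets, \&Encode::is_utf8, \&Encode::decode_utf8);

# A second pass leaves the already-decoded string untouched
my $again = ensure_encoding($chars, \&Encode::is_utf8, \&Encode::decode_utf8);

printf "%d octets in, %d characters out\n", length($octets), length($chars);
```

Note that Encode::is_utf8() only tests Perl's internal UTF-8 flag on the scalar; it does not validate the octets themselves, so this check only prevents decoding the same value twice.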

Perhaps I could even "intelligently" substitute these classes if a
text-based field has its "utf8" attribute set.  Hmmm.

    __PACKAGE__->meta->columns
    (
      name => { type   => 'varchar', length => 255, utf8 => 1 },
      ...
    );

Anyone have any suggestions for better approaches to this problem?  Is there
an even more generic way to handle encoding/decoding?  Should these checks
and operations be done on load only or on set as well?

-John




_______________________________________________
Rose-db-object mailing list
Rose-db-object@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rose-db-object
