I'm not clear from perlguts and perlapi how to make sure all my strings are
utf8.

And I'm more confused about sv_utf8_upgrade vs bytes_to_utf8.

Say I want to pass a list of strings into my xsub as an arrayref.

Is there a problem with blindly calling sv_utf8_upgrade on every element of
my AV?


For example, this code does not handle utf8:

void
foo( names )
    AV * names

    INIT:
        char ** name_list;

    CODE:
        Newx( name_list, av_len( names ) + 1, char * );

        for( int i=0; i <= av_len( names ); i++ )
            name_list[i] = SvPV_nolen( *av_fetch( names, i, 0) );

        RETVAL = foo( name_list, av_len( names) );
        Safefree( name_list );

     OUTPUT:
         RETVAL



If my C function foo() expects all character data to be UTF-8 encoded, is
this the correct approach?

        for( int i=0; i <= av_len( names ); i++ ) {
            SV * name_sv = *av_fetch( names, i, 0);
            sv_utf8_upgrade( name_sv );
            name_list[i] = SvPV_nolen( name_sv );
        }

And is sv_utf8_upgrade a NOOP if the utf8 flag is already set?

Or is a better approach to use bytes_to_utf8()? But if I did that, I
would need to Safefree() each string in my name_list[] array after calling
my C function, correct?
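
To make the ownership question concrete, here is a plain-C sketch of what I
understand a bytes-to-UTF-8 conversion to involve when the input is Latin-1.
It is not the Perl implementation: latin1_to_utf8_sketch() is a hypothetical
name I made up, and it uses malloc()/free() where the Perl API would use its
own allocator. The point is only that the result is a new buffer the caller
owns, unlike a pointer returned by SvPV_nolen(), which points into the SV.

```c
#include <stdlib.h>

/* Hypothetical helper, NOT part of the Perl API.
 * Converts `len` Latin-1 bytes into a newly allocated, NUL-terminated
 * UTF-8 string. The caller owns the result and must free() it. */
char *latin1_to_utf8_sketch(const unsigned char *src, size_t len)
{
    /* Worst case: every input byte becomes two output bytes. */
    char *dst = malloc(2 * len + 1);
    size_t i, j = 0;

    if (dst == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            dst[j++] = (char)c;                    /* ASCII: copied as-is */
        } else {
            dst[j++] = (char)(0xC0 | (c >> 6));    /* two-byte lead byte */
            dst[j++] = (char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    dst[j] = '\0';
    return dst;
}
```

If bytes_to_utf8() behaves like this, then every name_list[i] filled that way
would be a separate allocation that I have to release myself once foo()
returns, in addition to releasing name_list itself.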


-- 
Bill Moseley
mose...@hank.org
