This looks like a UTF-8 problem. The following patch seems to fix the editing problem and the conversion problems on PDF export.
Tesseract texts still fail on editing and on changing the page, same as before; the cause of that problem has still not been found. However, the next bug report will bring a real improvement.

--- /usr/share/perl5/Gscan2pdf/Page.pm	2011-08-27 07:00:41.000000000 +0200
+++ /usr/share/perl5/Gscan2pdf/Page.pm	2011-10-22 23:57:19.492261844 +0200
@@ -11,6 +11,7 @@
 use HTML::TokeParser;
 use HTML::Entities;
 use Image::Magick;
+use Encode;
 use utf8;

 BEGIN {
@@ -135,7 +136,7 @@
    }
   }
   if ( $token->[0] eq 'T' and $token->[1] !~ /^\s*$/ ) {
-   $text = HTML::Entities::decode_entities( $token->[1] );
+   $text = HTML::Entities::decode_entities(decode_utf8( $token->[1] ));
    chomp($text);
   }
   if ( $token->[0] eq 'E' ) {
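
For anyone wondering why decode_utf8 has to run before decode_entities, here is a minimal standalone sketch. The sample string is made up for illustration (it is not taken from gscan2pdf or from real hOCR output), and it assumes the token text arrives as raw UTF-8 bytes, which is what the patch implies.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use HTML::Entities ();

# Hypothetical sample: the raw UTF-8 bytes for "Größe &amp; Maß",
# as they might look when a file is read without an encoding layer.
my $bytes = "Gr\xc3\xb6\xc3\x9fe &amp; Ma\xc3\x9f";

# Without decoding, Perl treats every byte as its own character, so
# each two-byte sequence for "ö" or "ß" counts as two characters.
my $undecoded = HTML::Entities::decode_entities($bytes);

# Decoding first turns each multi-byte sequence into one character;
# the entity is then expanded on a proper character string.
my $decoded = HTML::Entities::decode_entities( decode_utf8($bytes) );

printf "length without decode_utf8: %d\n", length $undecoded;
printf "length with decode_utf8:    %d\n", length $decoded;
```

The two lengths differ because the undecoded string still carries one character per byte. That kind of mismatch would explain broken text on editing and on PDF export, although this sketch does not reproduce the gscan2pdf code path itself.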