Re: not using OUTPUT_PERL_ENCODING to determine if _stream_encode should encode

2024-02-19 Thread Patrice Dumas
On Mon, Feb 19, 2024 at 09:01:07PM +, Gavin Smith wrote:
> On Sun, Feb 18, 2024 at 10:57:22PM +0100, Patrice Dumas wrote:
> > On Sun, Feb 18, 2024 at 06:09:23PM +, Gavin Smith wrote:
> > > If this is ok, then "convert" could set $self->{'encoding_disabled'}.
> > 
> > Should I do that part?
> 
> I tried making the change myself.  Do you think the following is correct?
> It changes the conversion for "info" output in the test suite to use 'convert'
> instead of 'output' ("file_info" still uses 'output').  It leads to quite
> a few test suite result changes, which I haven't sent.

I do not have a precise idea why for info convert is not tested.  My
feeling is that it is because it is less interesting than testing
output() only.  I think that $self->{'encoding_disabled'} should also be
set in output() if returning the result as a string and not outputting
in a file.  So I think that it is probably better to keep on calling
output only for Info but if $fh is not set, set
$self->{'encoding_disabled'}.  Something like, near line 108 of Info.pm,
add an else:

  my $fh;
  if (! $output_file eq '') {
if ($self->get_conf('VERBOSE')) {
  print STDERR "Output file $output_file\n";
}
$fh = _open_info_file($self, $output_file);
if (!$fh) {
  $self->conversion_finalization();
  return undef;
}
  } else {
$self->{'encoding_disabled'} = 1;
  }

Another issue, maybe it would be better to reset encoding_disabled to
leave a clean state in conversion_finalization(), do something like

  if (...) {
$self->{'encoding_disabled'}++;
  }

in conversion_finalization()
  if (...) {
$self->{'encoding_disabled'}--;
  }

But it requires having a condition that is a sure marker that a string
is returned.

> 
> diff --git a/ChangeLog b/ChangeLog
> index 5a0166b732..d23c77279c 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,14 @@
> +2024-02-19  Gavin Smith 
> +
> + * tp/Texinfo/Convert/Plaintext.pm (convert):
> + Set $self->{'encoding_disabled'} to 1 so that 'convert' will
> + always return an unencoded character string.
> + * tp/t/test_utils.pl (convert_to_plaintext, convert_to_info):
> + Do not set OUTPUT_PERL_ENCODING to prevent encoding.
> + (convert_to_info): Check if there is an output file the same
> + way as in 'convert_to_plaintext', so that 'convert' is called
> + instead of 'output'.
> +
>  2024-02-18  Patrice Dumas  
>  
>   Only allow highlighting commands to be redefined with @definfoenclose
> diff --git a/tp/Texinfo/Convert/Plaintext.pm b/tp/Texinfo/Convert/Plaintext.pm
> index 19df4a8912..98a4e514d1 100644
> --- a/tp/Texinfo/Convert/Plaintext.pm
> +++ b/tp/Texinfo/Convert/Plaintext.pm
> @@ -619,6 +619,7 @@ sub convert($$)
>my ($self, $document) = @_;
>  
>$self->conversion_initialization($document);
> +  $self->{'encoding_disabled'} = 1;
>  
>my $root = $document->tree();
>  
> diff --git a/tp/t/test_utils.pl b/tp/t/test_utils.pl
> index 681ae6154e..b3daefed74 100644
> --- a/tp/t/test_utils.pl
> +++ b/tp/t/test_utils.pl
> @@ -532,14 +532,6 @@ sub convert_to_plaintext($)
>  }
>}
>  
> -  # If not outputing to a file, do not encode.  Return value from
> -  # 'output' is a character string.  It will be encoded to
> -  # UTF-8 in the results file.
> -  if (defined($converter_options->{'OUTFILE'})
> -  and $converter_options->{'OUTFILE'} eq '') {
> -$converter_options->{'OUTPUT_PERL_ENCODING'} = '';
> -  }
> -
>my $converter = Texinfo::Convert::Plaintext->converter($converter_options);
>  
>my $result;
> @@ -574,19 +566,18 @@ sub convert_to_info($)
>  = set_converter_option_defaults($converter_options, 'info',
>  $self->{'DEBUG'});
>  
> -  # If not outputing to a file, do not encode.  Return value from
> -  # 'output' is a character string.  This will be encoded to
> -  # UTF-8 in the results file.  This may make byte offsets in the tag table
> -  # incorrect, so if those needed to be tested, an separate output file
> -  # would have to be used instead.
> +  my $converter = Texinfo::Convert::Info->converter($converter_options);
> +
> +  my $result;
>if (defined($converter_options->{'OUTFILE'})
>and $converter_options->{'OUTFILE'} eq '') {
> -$converter_options->{'OUTPUT_PERL_ENCODING'} = '';
> +$result = $converter->convert($document);
> +  } else {
> +$result = $converter->output($document);
> +close_files($converter);
> +$result = undef if (defined($result) and ($result eq ''));
>}
>  
> -  my $converter = Texinfo::Convert::Info->converter($converter_options);
> -  my $result = $converter->output($document);
> -  close_files($converter);
>die if (!defined($converter_options->{'SUBDIR'}) and !defined($result));
>  
>my $converter_errors = $converter->get_converter_errors();
> 



Re: not using OUTPUT_PERL_ENCODING to determine if _stream_encode should encode

2024-02-19 Thread Gavin Smith
On Sun, Feb 18, 2024 at 10:57:22PM +0100, Patrice Dumas wrote:
> On Sun, Feb 18, 2024 at 06:09:23PM +, Gavin Smith wrote:
> > If this is ok, then "convert" could set $self->{'encoding_disabled'}.
> 
> Should I do that part?

I tried making the change myself.  Do you think the following is correct?
It changes the conversion for "info" output in the test suite to use 'convert'
instead of 'output' ("file_info" still uses 'output').  It leads to quite
a few test suite result changes, which I haven't sent.

diff --git a/ChangeLog b/ChangeLog
index 5a0166b732..d23c77279c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2024-02-19  Gavin Smith 
+
+   * tp/Texinfo/Convert/Plaintext.pm (convert):
+   Set $self->{'encoding_disabled'} to 1 so that 'convert' will
+   always return an unencoded character string.
+   * tp/t/test_utils.pl (convert_to_plaintext, convert_to_info):
+   Do not set OUTPUT_PERL_ENCODING to prevent encoding.
+   (convert_to_info): Check if there is an output file the same
+   way as in 'convert_to_plaintext', so that 'convert' is called
+   instead of 'output'.
+
 2024-02-18  Patrice Dumas  
 
Only allow highlighting commands to be redefined with @definfoenclose
diff --git a/tp/Texinfo/Convert/Plaintext.pm b/tp/Texinfo/Convert/Plaintext.pm
index 19df4a8912..98a4e514d1 100644
--- a/tp/Texinfo/Convert/Plaintext.pm
+++ b/tp/Texinfo/Convert/Plaintext.pm
@@ -619,6 +619,7 @@ sub convert($$)
   my ($self, $document) = @_;
 
   $self->conversion_initialization($document);
+  $self->{'encoding_disabled'} = 1;
 
   my $root = $document->tree();
 
diff --git a/tp/t/test_utils.pl b/tp/t/test_utils.pl
index 681ae6154e..b3daefed74 100644
--- a/tp/t/test_utils.pl
+++ b/tp/t/test_utils.pl
@@ -532,14 +532,6 @@ sub convert_to_plaintext($)
 }
   }
 
-  # If not outputing to a file, do not encode.  Return value from
-  # 'output' is a character string.  It will be encoded to
-  # UTF-8 in the results file.
-  if (defined($converter_options->{'OUTFILE'})
-  and $converter_options->{'OUTFILE'} eq '') {
-$converter_options->{'OUTPUT_PERL_ENCODING'} = '';
-  }
-
   my $converter = Texinfo::Convert::Plaintext->converter($converter_options);
 
   my $result;
@@ -574,19 +566,18 @@ sub convert_to_info($)
 = set_converter_option_defaults($converter_options, 'info',
 $self->{'DEBUG'});
 
-  # If not outputing to a file, do not encode.  Return value from
-  # 'output' is a character string.  This will be encoded to
-  # UTF-8 in the results file.  This may make byte offsets in the tag table
-  # incorrect, so if those needed to be tested, an separate output file
-  # would have to be used instead.
+  my $converter = Texinfo::Convert::Info->converter($converter_options);
+
+  my $result;
   if (defined($converter_options->{'OUTFILE'})
   and $converter_options->{'OUTFILE'} eq '') {
-$converter_options->{'OUTPUT_PERL_ENCODING'} = '';
+$result = $converter->convert($document);
+  } else {
+$result = $converter->output($document);
+close_files($converter);
+$result = undef if (defined($result) and ($result eq ''));
   }
 
-  my $converter = Texinfo::Convert::Info->converter($converter_options);
-  my $result = $converter->output($document);
-  close_files($converter);
   die if (!defined($converter_options->{'SUBDIR'}) and !defined($result));
 
   my $converter_errors = $converter->get_converter_errors();