Re: Improve error reporting in 027_stream_regress test

Nazir Bilal Yavuz Mon, 28 Jul 2025 05:34:36 -0700

Hi,

On Thu, 24 Jul 2025 at 13:34, Nazir Bilal Yavuz <[email protected]> wrote:
>
> Hi,
>
> On Tue, 22 Jul 2025 at 03:56, Michael Paquier <[email protected]> wrote:
> >
> > On Mon, Jul 21, 2025 at 11:53:00AM +0300, Nazir Bilal Yavuz wrote:
> > > I realized that we actually don't trim the file, we do the opposite;
> > > read the file from both ends. Sorry for not realizing earlier. I will
> > > update the remaining patches according to that but I think trim_file()
> > > is helpful, too. What do you think about adding both
> >
> > I did not review the contents of the patch yet, as I was rather unsure
> > about the right semantics here.
> >
> > > ```
> > > trim_file() -> trims $n lines from head or tail of the file and
> > > returns the remaining lines.
> > > read_file_ends() -> returns $n lines from head or tail of the file.
> > > ```
> > >
> > > although trim_file() will not be used in these particular patches?
> >
> > And this made me wonder over the weekend if only showing the head
> > and/or tail of a file is always the best set of properties.
> > Then I came up that  this could be something close to what git-grep
> > -A/-B is able to do.  For example, for a crash, it would be much
> > cheaper to target a log entry that matches with what we see in the
> > usual crash stacks, then print the surroundings.  The "trim" behavior
> > you have proposed is a subset of that, where the matching patterns are
> > the end and/or the beginning of the file to show.
> >
> > So the API could com down to a pattern matching, with "head" and
> > "tail" being the optional number of lines we'd want to print around
> > the pattern we are looking for.
>
> That makes sense and could be really useful in more general cases but
> I think that discussion might be better suited for a separate thread.
>
> For this specific case, we are dealing with regression.diffs; which
> already contains the relevant diff output. We don't need to search for
> patterns here; we just want to include a small portion of the file in
> the error message, to spare users from having to open the file
> manually.


I wanted to show what is in my mind, v4 is attached. Summary is:

- 0001 introduces the read_file_ends() function, which reads lines
from either the beginning or end of a given file. It includes a
force_line_count argument that, when set to true, ensures that the
specified number of lines is read from the file regardless of whether
the PG_TEST_FILE_READ_LINES environment variable is set.

- 0002 is the actual patch that improves error reporting in the
027_stream_regress test by using the read_file_ends() function. It
adds a regression_log_helper() function, which reads the
PG_TEST_FILE_READ_LINES environment variable and then calls
read_file_ends() with force_line_count set to true. This approach
avoids any potential race condition where the environment variable
might be modified after being read in the regression_log_helper() and
before used in the read_file_ends().

--
Regards,
Nazir Bilal Yavuz
Microsoft

From 4960cc4c194575ef1cc87d65406b4ee6f7bfc06c Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz <[email protected]>
Date: Mon, 21 Jul 2025 11:56:54 +0300
Subject: [PATCH v4 1/2] Add read_file_ends() helper to Utils.pm

The read_file_ends() function reads lines from a file, either from the
head or the tail, based on the provided direction. It takes how many
lines to read as a 'line_count' argument. By default, if the
'PG_TEST_FILE_READ_LINES' environment variable is set, it overrides the
'line_count'. However, the 'force_line_count' argument can be used to
bypass this behavior and force the use of the provided 'line_count'
regardless of the environment variable.

Suggested-by: Michael Paquier <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE%3DbA%40mail.gmail.com
---
 src/test/perl/PostgreSQL/Test/Utils.pm | 60 ++++++++++++++++++++++++++
 doc/src/sgml/regress.sgml              |  8 ++++
 2 files changed, 68 insertions(+)

diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 85d36a3171e..ac00390f9df 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -68,6 +68,7 @@ our @EXPORT = qw(
   slurp_file
   append_to_file
   string_replace_file
+  read_file_ends
   check_mode_recursive
   chmod_recursive
   check_pg_config
@@ -590,6 +591,65 @@ sub string_replace_file
 
 =pod
 
+=item read_file_ends(filename, direction, line_count, force_line_count)
+
+Return lines from the head or tail of a given file.
+
+=cut
+
+sub read_file_ends
+{
+	my ($filename, $direction, $line_count, $force_line_count) = @_;
+	my @lines;
+
+	if ($direction ne 'tail' && $direction ne 'head')
+	{
+		die
+		  "read_file_ends(_, direction, _, _) direction should be either 'tail' or 'head'.";
+	}
+
+
+	# If the force_line_count is true, then use the provided line_count.
+	# Otherwise, use the PG_TEST_FILE_READ_LINES environment variable if set.
+	if (!$force_line_count)
+	{
+		my $temp_env_var = $ENV{PG_TEST_FILE_READ_LINES};
+		if (defined $temp_env_var)
+		{
+			$line_count = $temp_env_var;
+		}
+	}
+
+	if ($line_count <= 0)
+	{
+		return;
+	}
+
+	open my $fh, '<', $filename or die "couldn't open file: $filename\n";
+
+	if ($direction eq 'head')
+	{
+		while (my $line = <$fh>)
+		{
+			push @lines, $line;
+			last if $. == $line_count;
+		}
+	}
+	elsif ($direction eq 'tail')
+	{
+		while (my $line = <$fh>)
+		{
+			push @lines, $line;
+			shift @lines if @lines > $line_count;
+		}
+	}
+
+	close $fh;
+	return @lines;
+}
+
+=pod
+
 =item check_mode_recursive(dir, expected_dir_mode, expected_file_mode, ignore_list)
 
 Check that all file/dir modes in a directory match the expected values,
diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index bf4ffb30576..5e3ea98c767 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -521,6 +521,14 @@ make check EXTRA_TESTS=numeric_big
     can run <command>diff</command> yourself, if you prefer.)
    </para>
 
+   <para>
+    For certain tests, the environment variable
+    <envar>PG_TEST_FILE_READ_LINES</envar> can be set to limit the number of
+    lines read from the output files. This is useful when test output files
+    are large or contain unnecessary content, allowing the test framework to
+    read only a specified number of lines for comparison.
+   </para>
+
    <para>
     If for some reason a particular platform generates a <quote>failure</quote>
     for a given test, but inspection of the output convinces you that
-- 
2.50.0

From bd13c7dfec0e373eefa7f001fbe9de3206956b06 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz <[email protected]>
Date: Fri, 18 Jul 2025 10:34:41 +0300
Subject: [PATCH v4 2/2] Improve error reporting in 027_stream_regress test

Previously, the 027_stream_regress test only reported that the regression
test had failed, without showing the actual error. The detailed failure
information was hidden in the regression.diffs file.

This commit improves the situation by including the head and tail of
regression.diffs directly in the failure message. This helps quickly
identify the root cause without needing to open extra files.

Suggested-by: Andres Freund <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE%3DbA%40mail.gmail.com
---
 src/test/recovery/t/027_stream_regress.pl | 54 +++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/src/test/recovery/t/027_stream_regress.pl b/src/test/recovery/t/027_stream_regress.pl
index 5d2c06ba06e..d1198943ab9 100644
--- a/src/test/recovery/t/027_stream_regress.pl
+++ b/src/test/recovery/t/027_stream_regress.pl
@@ -97,6 +97,10 @@ if ($rc != 0 && $primary_alive && $standby_alive)
 		print "=== dumping $diffs ===\n";
 		print slurp_file($diffs);
 		print "=== EOF ===\n";
+
+		# Dump out 50 lines from head and tail of regression diffs to the
+		# failure message.
+		regression_log_helper($diffs, 50);
 	}
 }
 is($rc, 0, 'regression tests pass');
@@ -190,4 +194,54 @@ UPDATE|t), 'check contents of pg_stat_statements on regression database');
 $node_standby_1->stop;
 $node_primary->stop;
 
+sub regression_log_helper
+{
+	my ($diff_file, $lines_to_dump) = @_;
+	my @lines;
+
+	my $temp_env_var = $ENV{PG_TEST_FILE_READ_LINES};
+	if (defined $temp_env_var)
+	{
+		$lines_to_dump = $temp_env_var;
+	}
+
+	if ($lines_to_dump <= 0)
+	{
+		return;
+	}
+
+	open my $fh, '<', $diff_file or die "couldn't open file: $diff_file\n";
+
+	while (my $line = <$fh>)
+	{
+		push @lines, $line;
+	}
+	close $fh;
+
+	my $line_count = scalar @lines;
+
+	# If the diff_file has fewer lines than (2 * $lines_to_dump), dump out the
+	# entire file.
+	if ($line_count <= (2 * $lines_to_dump))
+	{
+		diag("\n=== dumping $diff_file ===\n");
+		foreach my $line (@lines)
+		{
+			diag($line);
+		}
+	}
+	else
+	{
+		diag(
+			"\n=== dumping $lines_to_dump lines from head of $diff_file ===\n"
+		);
+		diag(read_file_ends($diff_file, 'head', $lines_to_dump, 1));
+		diag(
+			"\n=== dumping $lines_to_dump lines from tail of $diff_file ===\n"
+		);
+		diag(read_file_ends($diff_file, 'tail', $lines_to_dump, 1));
+	}
+	diag("=== EOF ===\n\n");
+}
+
 done_testing();
-- 
2.50.0

Re: Improve error reporting in 027_stream_regress test

Reply via email to