Test::Set

Fergal Daly Thu, 27 Feb 2003 11:32:52 -0800

I've been discussing this with Mr Schwern recently but he's a little 
indisposed at the moment so this seemed like a good place for feedback.


Test::More's eq_set() is not a set comparison or a bag comparison but a 
strange array comparison where the order of scalars doesn't matter but the 
order of refs does, except for deeply equalrefs, which can be interchanged. 

So

perl -MTest::More -e 'print eq_set([ [], {} ], [ {}, [] ])."\n"'

prints 0

Attached is Test::Set which provides true deep set and bag comparisons and 
bagset.t some tests.

Any comments welcome,

F

-- 
Do you need someone with lots of Unix sysadmin and/or lots of OO software 
development experience? Go on, giz a job.
My CV - http://www.fergaldaly.com/cv.html

# -*-perl-*-
use strict;
use Test::More (tests => 13);
use Test::Set;

my $refa_1 = ["a"];
my $refa_2 = ["a"];

my $set = [qw(a b a c d), $refa_1];
is_math_set($set, [qw(a b a c d), $refa_1], "math_set equal");
is_math_set($set, [qw(b c), $refa_1, qw(d a)], "math_set reorder");
is_math_set($set, [qw(b c), $refa_1, qw(d a b c ), $refa_1], "math_set reorder dupes");
is_math_set($set, [qw(b c), $refa_1, qw(d a b c ), $refa_2], "math_set reorder deep-dupes");
ok(! eq_math_set($set, [qw(c), $refa_1, qw(d a c ), $refa_1]), "math_set missing");
ok(! eq_math_set($set, [qw(c b e), $refa_1, qw(d a c ), $refa_1]), "math_set extra");

is_bag([[], $refa_2], [$refa_2, []], "eq_bag deep");
ok(! eq_bag([$refa_2, []], [$refa_2, [], []]), "eq_bag extra");
ok(! eq_bag([$refa_2, [], []], [$refa_2, []]), "eq_bag missing");

check_diff(
	[qw(a a b b c)], [qw(e d c d e)],
	[qw(d e)], [qw(a b)],
	"set", 1
);

check_diff(
	[qw(a a b b c)], [qw(e d c d e)],
	[qw(d d e e)], [qw(a a b b)],
	"bag", 0
);

sub check_diff
{
	my ($a1, $a2, $missing, $extra, $name, $is_set) = @_;

	my ($m, $e) = Test::Set::_diff_bagset($a1, $a2, $is_set);

	is_deeply($missing, [sort @$m], "diff $name m");
	is_deeply($extra, [sort @$e], "diff $name e");
}

package Test::Set;

use 5.004;

use strict;
use Test::More;

use vars qw($VERSION @ISA @EXPORT );
$VERSION = '0.01';
@ISA    = qw( Exporter );
@EXPORT = qw(
	is_bag is_math_set
	eq_bag eq_math_set
);

my $Test = Test::Builder->new;

sub eq_deeply
{
	# this can be removed when Test::More exports an eq_deeply() function
	local @Test::More::Data_Stack = ();
	return Test::More::_deep_check(@_);
}

=head1 NAME

Test::Set - some tests for set and bag comparisons

=head1 SYNOPSIS

  use Test::More tests => $Num_Tests;
  use Test::Set;

  is_bag([EMAIL PROTECTED], [EMAIL PROTECTED], $test_name);
  is_math_set([EMAIL PROTECTED], [EMAIL PROTECTED], $test_name);

=head1 DESCRIPTION

These are just some extra comaprison functions and tests to add to the ones
already in Test::More. It depends on Test::More for some comparison
functions. You should read the documentation for Test::More before using
these tests.

Test::Set provides a true set comparison and a true bag comparison (a bag is
like a set, the order of elements doesn't matter, however the number of
occurrences of an element does).

It does not provide any way to set up a plan.

=head2 AVAILABLE TESTS

=over 4

=item B<is_math_set>

  is_math_set( $this, $that, $test_name );

Similar to Test::More's is_deeply(), except that $this and $that must be
array references. Because it is a set comparison, it ignores the order of
elements in $this and $that and it also ignores duplicate elements. This is
a deep check so duplicates are found by deep comparison. If the two sets are
different, it will list the scalar elements in $this that are missing and
that are extra. It will also count the missing and extra reference elements
but will not display their structure as that could get very messy.

=cut

sub _converge_arrays
{
	# this takes a master array and some slave arrays.

	# it returns one array for each slave array with all references replaced
	# by the first deeply equal reference from the master array.
	
	my $all = shift;

	# because we'll be do lots of deep_checks against this list, it's best if
	# we throw out the refs and the duplicates before we start.

	my @all = grep {ref($_)} @$all;
	my %all;
	@[EMAIL PROTECTED] = @all;
	@all = values %all;

	my @converged;

	foreach my $array (@_)
	{
		my @array = @$array;

		foreach my $elem (@array)
		{
			next unless ref $elem;
			if (my ($master) = grep { eq_deeply($elem, $_) } @all)
			{
				$elem = $master;
			}
		}

		push(@converged, [EMAIL PROTECTED]);
	}

	return @converged;
}

sub _diff_bagset
{
	# this takes 2 arrays and finds out what is in $a1 but not in $a2 and vice
	# versa

	# the arrays must contains all refs or all scalars to guarantee the
	# correct answer

	# a third (boolean) parameter specifies whether it's a bag or set
	# comparison. True for set, false for bag

	my ($a1, $a2, $is_set) = @_;

	if (ref($a1->[0]))
	{
		# we are dealing with refs so we need to converge them to cope with
		# deepness

		($a1, $a2) = _converge_arrays([EMAIL PROTECTED], @$a2], $a1, $a2);
	}

	my (%a1, %a2, %values);

	foreach my $elem (@$a1)
	{
		$a1{$elem}++;
		$values{$elem} = $elem;
	}
	foreach my $elem (@$a2)
	{
		$a2{$elem}++;
		$values{$elem} = $elem;
	}

	foreach my $elem (@$a1)
	{
		if ($is_set)
		{
			$a2{$elem} = 0;
		}
		else
		{
			$a2{$elem}--;
		}
	}
	foreach my $elem (@$a2)
	{
		if ($is_set)
		{
			$a1{$elem} = 0;
		}
		else
		{
			$a1{$elem}--;
		}
	}

	my @missing;
	my @extra;

	while (my ($value, $count) = each %a2)
	{
		if ($count > 0)
		{
			push(@missing, ($values{$value}) x ($is_set ? 1 : $count));
		}
	}
	while (my ($value, $count) = each %a1)
	{
		if ($count > 0)
		{
			push(@extra, ($values{$value}) x ($is_set ? 1 : $count));
		}
	}

	return ([EMAIL PROTECTED], [EMAIL PROTECTED]);
}

sub _nice_list
{
	my ($scalars, $refs) = @_;

	my $ref_string = @$refs." reference" if @$refs;
	$refs .= "s" if @$refs > 1;

	return join(", ",
		(map {"'$_'"} @$scalars),
		$ref_string
	);
}

sub is_math_bagset
{
	my($this, $that, $name, $is_set) = @_;

	my $diag = '';

	if (ref $this ne 'ARRAY')
	{
		$diag = 'First argument to is_math_set is not an array reference';
	}

	if(ref $that ne 'ARRAY')
	{
		$diag = 'First argument to is_math_set is not an array reference';
	}

	if (not $diag)
	{
		my ($missing_s, $extra_s) = _diff_bagset(
			[grep ! ref($_), @$this],
			[grep ! ref($_), @$that],
			$is_set
		);

		my ($missing_r, $extra_r) = _diff_bagset(
			[grep ref($_), @$this],
			[grep ref($_), @$that],
			$is_set
		);

		my @diags;
		if (@$missing_s or @$missing_r)
		{
			push(@diags, "Missing: "._nice_list($missing_s, $missing_r));
		}

		if (@$extra_s or @$extra_r)
		{
			push(@diags, "Extra: "._nice_list($extra_s, $extra_r));
		}

		$diag = join("\n", @diags);
	}

	my $ok;
	if ($diag)
	{
		$ok = $Test->ok(0, $name);
		$Test->diag($diag);
	}
	else
	{
		$ok = $Test->ok(1, $name);
	}

	return $ok;
}

sub is_math_set
{
	my($this, $that, $name) = @_;

	local $Test::Builder::Level = 2;
	return is_math_bagset($this, $that, $name, 1);
}

=item B<is_bag>

  is_bag( $this, $that, $test_name );

Similar to is_math_set, exccept that it doesn't ignore duplicates, so if
$this contains 3 copies of an element then $that must also contain 3 exactly
copies. This is a deep check so duplicates are found by deep comparison. If
the two bags are different, it will list the scalar elements in $this that
are missing and that are extra. It will also count the missing and extra
reference elements but will not display their structure as that could get
very messy.  Elements that are missing or extra multiple times will appear
multiple times, so is_math_set([qw( a a b b)], qw[ (b b c c)], "test") will
complain that "a", "a" is missing and "c", "c" is extra.

=cut

sub is_bag
{
	my($this, $that, $name) = @_;

	local $Test::Builder::Level = 2;
	return is_math_bagset($this, $that, $name, 0);
}

=item B<eq_math_set>

  eq_math_set([EMAIL PROTECTED], [EMAIL PROTECTED]);

This performs the same test as is_math_set but produces no output. It just
returns true if $this and $that are equal as sets and false otherwise.

=cut

sub eq_math_set  { 
	my ($this, $that) = @_;
	my ($extra, $missing) = _diff_bagset($this, $that, 1);

	(@$extra or @$missing) ? 0 : 1;
}

=item B<eq_bag>

  eq_bag([EMAIL PROTECTED], [EMAIL PROTECTED]);

This performs the same test as is_bag but produces no output. It just
returns true if $this and $that are equal as sets and false otherwise.

=cut

sub eq_bag  { 
	my ($this, $that) = @_;
	my ($extra, $missing) = _diff_bagset($this, $that, 0);

	(@$extra or @$missing) ? 0 : 1;
}

=back

=head1 SEE ALSO

L<Test::More>Test::Set is kinda useless without this module.

=head1 AUTHOR

Fergal Daly E<lt>[EMAIL PROTECTED]<gt>, with thanks to Michael G Schwern
for Test::More.

=head1 COPYRIGHT

Copyright 2003 by Fergal Daly E<lt>[EMAIL PROTECTED]<gt>.

This program is free software; you can redistribute it and/or 
modify it under the same terms as Perl itself.

See F<http://www.perl.com/perl/misc/Artistic.html>

=cut

1;

Test::Set

Reply via email to