I figured out the problem which I posted earlier, so I'm responding for
completeness. Regarding the subroutine "gen_substrs" if you do the
following:
@strings = ("green", "blue");
foreach $string (@strings) {
@subs = gen_substrings($string);
pring "@subs\n";
}
The output is:
line 1 g gr gre gree green r re ... and so on ...
line 2
Nothing on line 2! If in the subroutine definition, you change
my @s;
to
@s = ();
All is well. I'm not sure if this satisfies the careful programmer, but
it works as far as I can tell.
Thanks for the help.
Chris.
--------------------------------------------------------------------
Chris Staskewicz
http://www.ZyGob.com/cjs
http://www.math.utah.edu/~cjs
--------------------------------------------------------------------
---------- Forwarded message ----------
Date: Thu, 10 Oct 2002 12:34:56 -0600 (MDT)
From: Chris Staskewicz <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: Paul Makepeace <[EMAIL PROTECTED]>
Subject: Re: [Boston.pm] "cross product" searches
Thanks, the subroutine is very helpful for generating these substrings.
However, if I do:
@strings = ("green", "blue");
foreach $str (@strings) {
@subs = gen_substrs($str);
}
@subs contains all the desired "green" substrings, but is NULL on the next
iteration of the loop.
Thanks again,
Chris.
On Wed, 9 Oct 2002, Spider Boardman wrote:
> I make no claims as to being optimal, but in the theory of "ops are bad,
> m'kay" (with a nod to Nick Clark), here's what I did:
>
> perl -le 'my @s; $_ = "greensleeves"; /(.+?)(??{push @s,$1})(?!)/; print "@s"'
>
> It's roughly the same algorithm, since it's inherently O(n^2) to generate
> substrings. [For a string of length n, there are n starting positions, and
> from each position i, running from 0 to n-1, there are n-i substrings to be
> had.] This solution does minimise the real ops in perl by letting the
> regexp engine handle the actual finding of the substrings. On the other
> hand, the (??{code}) construct has close to eval/sub-entry overhead. I
> don't recommend this for readability. Its real value was in minimising my
> debugging time.
>
> Done as a subroutine (should anyone care):
>
> sub gen_substrs ($) {
> my $str = shift;
> my @s;
> $str =~ /(.+?)(??{push @s,$1})(?!)/;
> @s;
> }
>
> For actual enumeration, for a string of length n, the number of non-null
> contiguous substrings is:
>
> (n^2 + n) / 2
>
> In particular, for "greensleeves", which is a 12-character string, there are
> 78 such substrings to be found. Because of the repeated letters in the
> starting string, of course, those 78 values are not unique.
>
> Hmm, I had a point when I started. Oh, yeah. As long as your algorithm
> isn't of worse complexity than required by the nature of the problem, don't
> worry about it too much. Just be sure it gives you the right answers. If
> finding the substring lists turns out to be a bottleneck, consider using
> Memoize or something similar to avoid re-building the lists for the same
> word each time.
>
> Hth,
> --s.
>
> --
> Spider Boardman (at home) [EMAIL PROTECTED]
> The management (my cats) made me say this. http://www.ultranet.com/~spiderb
> PGP public key fingerprint: 96 72 D2 C6 E0 92 32 89 F6 B2 C2 A0 1C AB 1F DC
>
--------------------------------------------------------------------
Chris Staskewicz
http://www.ZyGob.com/cjs
http://www.math.utah.edu/~cjs
--------------------------------------------------------------------
_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm