[algogeeks] Re: check string A and B isPermutation ?

Gene Sat, 15 Apr 2006 08:16:09 -0700


Timak wrote:
> Hey, I think it's cool too :-) BTW, how exactly do you generalize it?


Well, I'm going to let that as a puzzle for Algorithm Geeks!  It's
equally cool.  And I've never seen it published anywhere.

>
> > There are a few other things to think about however.
> >
> > First, "simplest" depends on environment.  In perl you can compare two
> > strings $a and $b with
> >
> > if (join('', sort(split '', $a)) eq join('', sort(split '', $b)) {
> >   # a is a permutation of b!
> > }
> >
> > Histogram code would be more complex and would probably run slower.
>
> Why, can be pretty simple too.  Something like:
>
> bool is_perm(char* a, char* b)
> {
>   while (*a) hist[*(a++)]++;
>   while (*b) if (--hist[*(b++)] < 0) return true;
>   return false;
> }

Ah, but in the first place I was talking about Perl.  Histogramming
with arrays in Perl is really awkward. My whole point is that different
environments often lead to different "best" solutions.  You also have
omitted zeroing the histogram, which isn't fair because your function
won't work without it.

Initializing arrays often crops up as the most expensive operation in
interesting algorithms.
So I'll point out that there is a cool data structure that you can
"tack on" to an array.  This data structure needs only constant time
per array access to maintain.  Yet it allows you to treat the array as
though it's initialized to zero even though it really is not.  The down
side is that this data structure needs O(V) additional storage, where V
is the number of array entries eventually set to values different from
zero.  Again I'll let the details as a puzzle for the news group.

Unfortunately, your code above doesn't work.  For example it will
return false on is_perm("a", "a") .  Did you mean to have the true and
false in the other order?  Even if you reverse them it doesn't work.
Try is_perm("aa", "a") .  Your code will miss the fact that hist['a']
is 1 at the end.  So your algorithm should really be called
"is_superset(a, b)"

> However, please note that you don't need to compare the
> histograms (although those were my words).  You just need to
> accumulate the histogram of the first string, and then just decrement
> the counts when scanning the second one; since you would know that the
> lengths are the same, a trial to decrement a zero would mean the
> strings are not permutations.  All this is O(N).

Sorry this is not correct. To make your algorithm work, you'd need to
verify that the incremented/decremented histogram is all zeros at the
end (see my example above).  This final verification pass is O(2^w).

Another approach that doesn't need the verification pass is to use
"is_superset(a, b) && is_superset(b, a)."  But now you need to zero the
histogram twice!

IMHO if you have big characters, the histogram is best kept in a hash
table where the keys are letters and the values are counts.   But this
again is more complex.

Great discussion!  Thanks.


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~----------~----~----~----~------~----~------~--~---

[algogeeks] Re: check string A and B isPermutation ?

Reply via email to