#!/usr/bin/perl
#######################################################
I am trying to find all of the recurring sequences, excluding the subsequences.
Maybe I am missing the obvious, but I have only a little Perl exposure and am not an expert Perl programmer. I have hacked together some code that does some of what I would like to do, but I know that there must be a much better way of doing this. I just don't have any ideas right now, having had only a couple of hours of sleep in the last couple of days. :+(

Am I looking at this all wrong? There should be some regular expression(s) that would make this more maintainable and elegant. :-)

I have used an array of items called @datalist and a hash called %frequency that holds a count of how often each item occurs in the data list. I used tr to clean the data of any special characters and split on white space into the @datalist array.

I would appreciate some help with this.

Thanks
JH
#######################################################
# find frequency of all sequences of the given size
my %current;
my $first = 0;

# size of sequence to look for
my $sizeof = 10;

# the last full window starts at $#datalist - $sizeof + 1
while ($first + $sizeof - 1 <= $#datalist) {
    my @window = @datalist[ $first .. $first + $sizeof - 1 ];

    # only count windows in which every item has a frequency entry
    # (replaces the ugly ten-way && test)
    if ( @window == grep { $frequency{$_} } @window ) {
        # put a sequence together with a space separating items;
        # a fresh variable per window, so nothing carries over
        my $currentseq = join ' ', @window;

        # increment count of sequence for the current one
        ++$current{ $currentseq };
    }

    # next position in the data list
    ++$first;
}

foreach my $currentsequence ( keys %current ) {
    # if no multiples remove sequence
    if ( $current{ $currentsequence } < 2 ) {
        delete $current{ $currentsequence };
        next;
    }
    my $numberof = $current{ $currentsequence };

    # %lastseq holds the counts for the smaller sequences
    foreach my $shorter ( keys %lastseq ) {
        # if the number of times the smaller sequence occurs is
        # the same, then the shorter sequence is not needed
        # (index on space-padded strings matches whole items only)
        if ( index( " $currentsequence ", " $shorter " ) >= 0
            && $lastseq{ $shorter } == $numberof ) {
            delete $lastseq{ $shorter };
        }
    }
}
#######################################################
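For comparison, here is a minimal sketch of the same idea with the window size as a parameter instead of being hard-coded. It is only one possible approach, not a definitive answer. The sub names count_sequences and repeated_sequences and the sample data are made up for illustration, and it assumes the items contain no spaces (they were split on white space), so a space-joined string can serve as the hash key. Working from the longest size down means a shorter sequence can be dropped as soon as a kept longer one contains it with the same count.

```perl
use strict;
use warnings;

# Count every run of $size consecutive items in @$items.
# Returns a hash ref: "item item ..." => number of occurrences.
sub count_sequences {
    my ($items, $size) = @_;
    my %count;
    for my $start (0 .. @$items - $size) {
        $count{ join ' ', @{$items}[ $start .. $start + $size - 1 ] }++;
    }
    return \%count;
}

# Keep sequences of $min .. $max items that occur at least twice,
# skipping a shorter sequence when an already kept longer sequence
# contains it and occurs exactly as often.
sub repeated_sequences {
    my ($items, $min, $max) = @_;
    my %kept;
    for my $size (reverse $min .. $max) {
        my $count = count_sequences($items, $size);
        for my $seq (keys %$count) {
            my $n = $count->{$seq};
            next if $n < 2;
            # space padding makes index() match whole items only
            next if grep { $kept{$_} == $n && index(" $_ ", " $seq ") >= 0 }
                    keys %kept;
            $kept{$seq} = $n;
        }
    }
    return \%kept;
}

# hypothetical sample data standing in for @datalist
my @datalist = qw(a b c x a b c y a b);
my $found = repeated_sequences(\@datalist, 2, 4);
print "$found->{$_} x $_\n" for sort keys %$found;
```

With this sample data, "a b c" is kept with a count of 2 and "a b" with a count of 3, while "b c" is dropped because it occurs only inside "a b c". An array slice plus join replaces the per-position tests, so changing the size needs no code changes.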