"Charles K. Clarkson" <[EMAIL PROTECTED]> writes:
> : > > while(<FILE>){
> : > >   chomp;
> : > >   my $line = $_;
> : > >
> : > Why here. Since you are doing this with each line,
> : > you could write in the loop control:
> : > while (my $line = <FILE>) {
> :
> : Not sure I understand the advantage. In my
> : formulation, `$line' is minus the trailing newline...
> : which I've found to be nearly always a plus.
>
> I think Joseph was implying the 'chomp'. This is
> still shorter and IMO clearer than using $_.
>
> while ( my $line = <FILE> ) {
>     chomp $line;

I hope it doesn't sound like I'm being a hard head... because at my
stage of skill I'm not likely to stand on my practices as better than
some other... but I'm having trouble seeing what is shorter or better
about this. Both have 7 entries to type. And as for clarity, is it
because you say `chomp $line' so it is apparent what is being chomped?

> : > > ## @hdregs is an array of several regex for the
> : > > ## headers
> : > > for($ii=0;$ii<=$#hdregs;$ii++){
[...]
> : > Why a C-style for loop? Are you using the index somewhere?
> :
> : Well yes, sort of.
>
> Assuming a non-C maintainer comes along, I would
> recommend the following. The C-style loop is confusing
> to those of us who don't have a background in C. This
> is very clear (to me).
>
> foreach my $ii ( 0 .. $#hdregs ) {

My usage was called `C-style', but in fact I used it because of
familiarity with awk. Probably the awk style was borrowed from C
anyway. But aside from yours being shorter, a formulation like yours
leaves me wondering what it's doing. Probably due to my lack of
familiarity with Perl, I guess.

> : I wanted a way to ensure that each reg has hit at
> : least once. Otherwise we don't print.
> : So I used a
> : formulation like this (Not posted previously for
> : clarity):
> :
> :   if ($data{$hdregs[$ii]}++ == 0) {
> :     ## it will only be 0 once
> :     $hdelem_hit_cnt++;
> :   }
> : Then before printing we compare $hdelem_hit_cnt to
> : ($#hdregs + 1):
> :
> : sub test_hdr_good {
> :   if ($hdelem_hit_cnt == ($#hdregs + 1)) {
> :     $test_hdr_good = "TRUE";
> :     $hdelem_hit_cnt = 0;
>
> Generally, global variables should raise a giant,
> blinking, annoying sign telling us we are no
> longer in Kansas.

I didn't post it, but in fact I have a `my' declaration like this at
the beginning of my `sub wanted {':

  my($line,@hdhits,$hdelem_hit_cnt,%data);

Another one at the beginning of the script tries to catch everything
that didn't need to be local to a loop of some kind.

> : They should be the same if all regs have hit at least
> : once. If not the same... we don't print.
>
> Actually, they should be the same if all regs were
> hit /only/ once.

No, you'd have to try it to see that is not true. That was the beauty
of it to me. The count only increments when a UNIQUE hit happens (the
first hit for a given regex), not when repeated hits happen. So a
repeat hit does not get mistaken for a UNIQUE hit and throw off the
count.

> Depending on where the 'if' block is located, this
> is a roundabout way to test that @hdregs is an array of
> unique values. It would be similar to this outside the
> 'for' loop.

@hdregs is not intended to be an array of unique values. There are
cases where I want to print repeated headers like `Received:' lines.

> But as Randy mentioned, some mail headers are allowed
> to appear more than once. Thus making this test invalid.

Yes, that was what started this thread: my desire to include those in
the hits. For that reason I chose the `awkward' unique-hit technique,
so that it would still work to verify that all regex had hit at least
once, but not prevent printing of repeated hits. I'm sure there are
better ways to do that, but I haven't thought of any yet.
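The unique-hit counting discussed above can be sketched as a small
standalone script. This is only an illustration under assumptions, not
the actual script: the helper name `count_unique_hits', the sample
patterns, and the sample header lines are all made up here, and the
patterns in @hdregs are assumed to be distinct from one another.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns how many
# distinct patterns in @$hdregs matched at least one line in @$lines.
sub count_unique_hits {
    my ($hdregs, $lines) = @_;
    my %data;
    my $hdelem_hit_cnt = 0;

    for my $line (@$lines) {
        for my $ii (0 .. $#{$hdregs}) {
            next unless $line =~ $hdregs->[$ii];
            # Post-increment returns the old value, so this test is
            # true only the first time a given pattern hits.
            if ($data{ $hdregs->[$ii] }++ == 0) {
                $hdelem_hit_cnt++;
            }
        }
    }
    return $hdelem_hit_cnt;
}

# Sample data, assumed for illustration only.
my @hdregs = (qr/^Received:/, qr/^Subject:/);
my @lines  = (
    'Received: from a.example',
    'Received: from b.example',
    'Subject: test',
);

my $hits = count_unique_hits(\@hdregs, \@lines);
print "unique hits: $hits of ", scalar(@hdregs), "\n";
```

With two `Received:' lines in the sample, the count still comes out as
2 of 2, because the repeated hit does not pass the `== 0' test a second
time.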
Even after commentary here, I'm not thinking of a slicker or nicer way
to do that.

An important ingredient in this script is that it returns nothing but
a report of no hits if not ALL regex have found a hit. It is intended
to be restrictive. For a more inclusive return, one lessens the number
or precision of the regex.

This script will have both header and body regex to find in most
cases, although it can be run for just one or the other. The idea is
to purposely use a restrictive number of regex to have very good
precision over what messages get turned up.
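A minimal sketch of that all-or-nothing behaviour, under stated
assumptions: the helper name `restrictive_hits' and the sample data are
hypothetical, and the check compares the number of hash keys to the
number of patterns instead of keeping a separate counter, which is a
swapped-in variation rather than what the script does. Repeated hits
such as several `Received:' lines are kept.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every matching line (repeats included)
# if ALL patterns hit at least once, otherwise an empty list.
sub restrictive_hits {
    my ($regs, $lines) = @_;
    my (%seen, @hits);

    for my $line (@$lines) {
        my $matched = 0;
        for my $re (@$regs) {
            next unless $line =~ $re;
            $seen{$re}++;       # key is the stringified pattern
            $matched = 1;
        }
        push @hits, $line if $matched;   # keep each matching line once
    }

    # One key per pattern that hit; all patterns must be present.
    return keys(%seen) == @$regs ? @hits : ();
}

# Sample data, assumed for illustration only.
my @regs  = (qr/^Received:/, qr/^Subject:/);
my @lines = (
    'Received: from a.example',
    'Received: from b.example',
    'Subject: test',
    'To: someone',
);

my @hits = restrictive_hits(\@regs, \@lines);
print @hits ? join("\n", @hits) . "\n" : "no hits\n";
```

Dropping the `Subject:' line from the sample makes the helper return an
empty list, so only the "no hits" report is printed.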