On 3/16/06, Jeff Pang <[EMAIL PROTECTED]> wrote:
>
> > I'm having problems getting information out of a file and having it
> > write each of the dates to an array. The problem is, I don't want
> > duplicates.
>
> After reading your program carefully, I think you just want to increase
> a count for each unique $date.
> In Perl programming, a hash is very useful for this purpose, because a
> hash's keys are always unique.
> I would modify your code as follows, and I hope it helps you.
>
>     use strict;
>     use warnings;
>
>     my %uniq_data;
>
>     while ( <> ) {
>         chomp;
>         my ($date, $time, $ip, $ssl, $cipher, $get, $pkg, $http, $pid,
>             $name1, $name2, $name3 ) = split;
>         $date =~ s/\[//;
>         $date =~ s/\// /g;
>         $date =~ s/\:(\d+):(\d+):(\d+)//;
>
>         $uniq_data{$date}++;
>     }
>
> Then you could loop over the hash %uniq_data to get each $date's count.
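Jeff's last step, looping over %uniq_data, isn't spelled out in his
message. A minimal sketch of what it might look like follows. The sample
dates in @dates and the @report array are made up for illustration, and
sorting the keys is just one way to get a predictable output order (it is
a plain string sort, not a calendar sort).

```perl
use strict;
use warnings;

# Hypothetical sample data: dates as they would look after the clean-up
# substitutions, duplicates included.
my @dates = ('07 Feb 2005', '07 Feb 2005', '08 Feb 2005', '07 Feb 2005');

# Count each date. A hash key can exist only once, so duplicates collapse
# into a single key whose value is the number of times it was seen.
my %uniq_data;
$uniq_data{$_}++ for @dates;

# Loop over the hash to report each unique date with its count.
my @report;
for my $date (sort keys %uniq_data) {
    push @report, "$date: $uniq_data{$date}";
}
print "$_\n" for @report;
```

If all you need is the list of unique dates, `keys %uniq_data` already
gives you that without any counting logic.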
As Jeff says, you want a hash here. But let's look at your regex for a
second:

>         $date =~ s/\[//;                   # get rid of the opening '['
>         $date =~ s/\// /g;                 # replace '/' with ' '
>         $date =~ s/\:(\d+):(\d+):(\d+)//;  # get rid of everything after
>                                            # the first ':'

That should leave you with a string like '07 Feb 2005'. There are a
couple of things to note here, especially in your final substitution.

First, don't use capturing parentheses unless you intend to do something
with the captured values ($1, $2, $3, etc.). Capturing adds overhead,
which can become noticeable on large data sets. If you're just using
parens for grouping, use non-capturing parens:

    $date =~ s/\:(?:\d+):(?:\d+):(?:\d+)//;

Here, though, you don't need to group at all. A quantifier like '+'
applies to the single element before it, and Perl treats the escape '\d'
as a single element, so the following is fine (the same goes for other
class escapes and any escaped character; '\w', '\$', '\/', '\n', etc.
are all single characters, as far as Perl is concerned):

    $date =~ s/\:\d+:\d+:\d+//;

Next, ':' is not a metacharacter, so you don't need to escape it:

    $date =~ s/:\d+:\d+:\d+//;

Finally, what you want here is to get rid of everything from the first
colon on. Just do that:

    $date =~ s/:.*//;    # (a purist might want s/:.*$//)

You might also want to think about looking for what you do want out of
your regex, instead of spending so much effort getting rid of what you
don't want. Something along the lines of

    $date =~ s#^\[?(\d{2})/(\w{3})/(\d{4}).*$#$1 $2 $3#;

(the optional '\[' at the start takes care of the opening bracket, so
this one substitution replaces all three).

HTH,

-- jay
daggerquill [at] gmail [dot] com
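P.S. If you want to convince yourself that the simplified substitutions
and the one-pass version give the same result, a small sketch like this
should do. The sample value in $raw is an assumption about what split
hands you for the date field, and the variable names are only for
illustration. The one-pass pattern here allows an optional leading '[' so
that it removes the bracket as well.

```perl
use strict;
use warnings;

# Hypothetical raw field, as split might return it from a log line.
my $raw = '[07/Feb/2005:12:34:56';

# Variant 1: three substitutions, with the last one simplified.
my $stepwise = $raw;
$stepwise =~ s/\[//;      # drop the opening '['
$stepwise =~ s/\// /g;    # replace each '/' with ' '
$stepwise =~ s/:.*//;     # drop everything from the first ':' on

# Variant 2: match what we want and rebuild the string in one pass.
# Captures are used here on purpose, because $1, $2 and $3 are needed
# in the replacement.
my $one_pass = $raw;
$one_pass =~ s#^\[?(\d{2})/(\w{3})/(\d{4}).*$#$1 $2 $3#;

print "$stepwise\n";
print "$one_pass\n";
```

Both lines should print as 07 Feb 2005. Note that the one-pass version
leaves the string untouched if the field doesn't look like a date at all,
which may or may not be what you want.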