Bill, I'm amused you call this efficiency when you said you didn't care about efficiency :-) Of course, it depends on one's definitions...

Here are several I came up with, with times for sample data (30000 data
points):

  ($pref,$suf)=/(.{4})(.{4})/o;               # Time for 30000  .294
  ($pref,$suf)=/(.{4})(.{4})/;                # Time for 30000  .3
  $pref=substr($_,0,4); $suf=substr($_,4);    # Time for 30000  .08
  ($pref,$suf)= unpack "A4 A4",$_;            # Time for 30000  .25

Using a regex or unpack will reduce the line count by one, but it will
actually run much slower than substr (which surprised me). What you came
up with is actually close to being the fastest (most efficient). (Doing
the sort avoids a comparison inside the for loop, and turns out to be
slightly faster. Of course, data variations affect the outcomes.) But
you can avoid one substr to make it shorter, so:

  @list = sort @list;
  foreach $item (@list) {                     # Time for 30000  .05
      $itemPref = substr($item, 0, 4);
      $h{$itemPref} = $item;
  }
  foreach $pref (keys %h) {
      push (@finalList, $h{$pref});
  }

And, though it's inefficient (.2 sec):

  @list = sort @list;
  map {$h{substr($_,0,4)}=$_} @list;
  etc...

An interesting problem.

On Wed, 5 Apr 2006, Ng, Bill wrote:

> Okay,
>
>     Here's what I've come up with:
> ----------
> @list = ("aacs1110", "brbt4332", "rtxa4320", "aacs2000", "brig5621",
> "brbt5220", "nbvc1111");
> @list = sort @list;
>
> foreach $item (@list) {
>     $itemPref = substr($item, 0, 4);
>     $itemVers = substr($item, 4);
>     $h{$itemPref} = $itemVers; }
>
> foreach $pref (keys %h) {
>     push (@finalList, $pref . $h{$pref}); }
> ----------
>
> Okay ... 8 lines of code (not counting empty lines) ... anyone got
> anything else?
>
> Bill
>
>
> -----Original Message-----
> From: Ng, Bill
> Sent: Wednesday, April 05, 2006 3:19 PM
> To: perl-win32-users@listserv.ActiveState.com
> Subject: Efficiency
>
> Since Efficiency seems to be the topic of the day,
>
>     My situation, I have an array filled with 8-character strings, a
> few thousand of them. First 4 chars are letters, last 4 are numbers.
> Examples - abcd1234, zyxw9876, etcc2222. The letters portion is a
> prefix, the numbers is a version. In my list, there are instances of
> the same prefix with different versions.
>
>     What's the easiest way for me to either clean up the existing
> array to remove all but the latest versions of each prefix, or create a
> new array with the same info? Focus is on least amount of code, I'm not
> going for outright speed. No modules. I've got it working now but I'm
> running nested foreach loops and an if statement ... not efficient at
> all.
>
> Example:
> ------------
> @myArray = ("aacs1110", "brbt4332", "rtxa4320", "aacs2000", "brig5621",
> "brbt5220", "nbvc1111")
> ------------
>     The code would give me an array that contained everything except
> aacs1110 & brbt4332 since both have been superseded.
>
> Bill
>
> _______________________________________________
> Perl-Win32-Users mailing list
> Perl-Win32-Users@listserv.ActiveState.com
> To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
>

--Nelson R. Pardee, Support Analyst, Information Technology & Services--
--Syracuse University, 211 Machinery Hall, Syracuse, NY 13244-1260    --
--(315) 443-1079                    [EMAIL PROTECTED]                 --
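
P.S. A minimal sketch of how timings like the ones above could be
reproduced with the core Benchmark module. The randomly generated sample
data and the names @letters and %candidates are assumptions made for
illustration; this is not the script that produced the numbers above, and
absolute figures will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data: 30000 strings of 4 letters followed by 4 digits.
my @letters = ('a' .. 'z');
my @list;
for (1 .. 30000) {
    my $prefix  = join '', map { $letters[ int rand @letters ] } 1 .. 4;
    my $version = sprintf '%04d', int rand 10000;
    push @list, $prefix . $version;
}

# Each candidate splits every string into prefix and suffix one way and
# returns the last pair it produced, so the results can be cross-checked.
my %candidates = (
    regex => sub {
        my ($pref, $suf);
        ($pref, $suf) = /(.{4})(.{4})/ for @list;
        return "$pref|$suf";
    },
    substr => sub {
        my ($pref, $suf);
        for (@list) {
            $pref = substr($_, 0, 4);
            $suf  = substr($_, 4);
        }
        return "$pref|$suf";
    },
    unpack => sub {
        my ($pref, $suf);
        ($pref, $suf) = unpack 'A4 A4', $_ for @list;
        return "$pref|$suf";
    },
);

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

Benchmark ships with Perl, so nothing has to be installed, and it is used
here only for measuring, not in the solution itself. The relative ranking
is the interesting part; as noted above, data variations affect the
outcomes.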