Zentara wrote:
>
> On Fri, 21 Jun 2002 13:37:57 -0700, [EMAIL PROTECTED] (John W. Krahn)
> wrote:
>
> >Sorry, it was late and I didn't test it. :-( The correct code should
> >be
> >
> >my $bin = pack 'H*', $hextest;
>
> Thanks John, I thought I was losing my mind. :-)
>
> >Sorry, my understanding is that the hex string is just an ASCII
> >representation of the binary data to search for. Virus files don't have
> >actual "hex strings" in them but are compiled executables.
>
> Yeah I see the misunderstanding now. You were looking at doing
> a regex testing a binary value on the binary file.
> Can perl do "binary regexes"?

Yes, that is why I suggested quotemeta.

perldoc -q "binary data"
Found in /usr/lib/perl5/5.6.0/pod/perlfaq4.pod
       How do I handle binary data correctly?

               Perl is binary clean, so this shouldn't be a problem.  For
               example, this works fine (assuming the files are found):

                   if (`cat /vmunix` =~ /gzip/) {
                       print "Your kernel is GNU-zip enabled!\n";
                   }

               On less elegant (read: Byzantine) systems, however, you
               have to play tedious games with "text" versus "binary"
               files.  See the binmode entry in the perlfunc manpage or
               the perlopentut manpage.  Most of these ancient-thinking
               systems are curses out of Microsoft, who seem to be
               committed to putting the backward into backward
               compatibility.

               If you're concerned about 8-bit ASCII data, then see the
               perllocale manpage.

               If you want to deal with multibyte characters, however,
               there are some gotchas.  See the section on Regular
               Expressions.


> I was looking at it the other way. I had the hex signature of the virus,
> so I converted the binary file into a long hexstring. Then regexed the
> hex values.
> My first attempt is below. It works, but is incredibly slow. I tested
> it against some commercial virus scanners like Trendmicro's vscan,
> and the H+BEDV scanner for linux. I took some executables, hexedited
> them to put in some test signatures, and scanned them.
> The commercial scanners found the patterns in a micro-second.
> My scanner took about 1 second per megabyte of filedata. Too
> slow for anything but the smallest files.
>
> It's such a simple process, that I'm now toying with trying to do it
> with assembly.
>
> Anyways here is what my slow kludge looks like.
> You get the virussignatures.txt file from
> http://www.openantivirus.org/VirusSignatures-latest.zip
>
> This is what the signature file looks like:
> ....
> ....
> 10 past 3 (B)=ec020e1ff3a4b82125061fbab300cd21
> 10 past 3 (C)=b840008ed8a11300b106d3e02d00088e
> 100-Years=fe3a558bec50817e0400c0730c2ea147
> 1024-PrScr #1=8cc0488ec026a103002d800026a30300
> 1024-PrScr #2=a172041f3df0f07505a10301cd0526a1
> 1024-PrScr #3=00012ea30300b4400e1fba0004b90004e8e8007230
> 1024-PrScr #4=babf00b82125cd2133c08ec0b8f0f026
> 1210-Prudent=2f040175d00e0e1f07bed3042bc92e8a0446410ac0
> 1210=c474f02e803e2f040175
> 1241=8a4600a200018b4601a30101b8cc4bcd
> 1244=cd217252b91e00ba7d04b43fcd217246
> ....
> ....
>
> This file has nearly 2000 entries, and I suspect that is why
> it is so slow to check all those values thru the regex.

You should either use index() or pre-compile the regex.


> #########################################################
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> my (@vs,@virname,@virsig,$numsigs,$i);
> open (VS,"< virussignatures.strings")
>   or die "Cant open signature file",$!;
> @vs = <VS>;
> $numsigs = $#vs;
> close VS;
>
> for ($i=0; $i <= $numsigs; $i++) {
>   chomp $vs[$i];
>   ($virname[$i],$virsig[$i])= split(/=/,$vs[$i]);
> }
>
> $/ = undef;
> my $file = <>; #slurp binary file into 1 long string
> if (length $file eq 0){print "Empty File\n";exit}
> my $hexfilestring = unpack "H*", $file; #convert binary file to hex
>
> for (my $i =0; $i <= $numsigs; $i++){
>   if ($hexfilestring =~ m/$virsig[$i]/i){print "$virname[$i] found\n";exit;}
> }
>
> print "file clean\n";
> exit;
> ###############################################################

This is about ten times faster than your version.
:-)

#!/usr/bin/perl
use strict;
use warnings;

open VS, 'virussignatures.strings' or die "Cant open virussignatures.strings: $!";
my @sigs = map {
    chomp;
    ( $a, $b ) = split /=/;
    $b = pack 'H*', $b;   # convert to binary
    $b = qr/\Q$b\E/;      # pre-compile regex
    [ $a, $b ]
    } <VS>;

FILE: for my $file ( @ARGV ) {
    unless ( open FILE, $file ) {
        warn "Cannot open $file: $!";
        next FILE;
        }
    my $buffer;
    unless ( read FILE, $buffer, -s $file ) {
        warn "$file has no data.\n";
        next FILE;
        }
    for my $regex ( @sigs ) {
        if ( $buffer =~ /$regex->[1]/ ) {
            print "$regex->[0] found in $file\n";
            next FILE;
            }
        }
    }

__END__



John
-- 
use Perl;
program
fulfillment
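
The reply above names index() as an alternative to a pre-compiled regex but only shows the regex version. What follows is a minimal sketch of the index() variant, not code from the thread. It assumes the same name=hexsignature line format as the quoted sample entries. The helper names load_signatures and find_signature are hypothetical, and the demo uses an in-memory signature list and a made-up data string so that it runs without any files. Whether it is actually faster than the qr// version was not measured.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read "name=hexsignature" lines from a filehandle
# and return a list of [ name, binary signature ] pairs.
sub load_signatures {
    my ($fh) = @_;
    my @sigs;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;                       # strip LF or CRLF
        my ( $name, $hex ) = split /=/, $line, 2;   # split on the first '=' only
        # skip lines without a well-formed, even-length hex string
        next unless defined $hex && $hex =~ /\A(?:[0-9a-fA-F]{2})+\z/;
        push @sigs, [ $name, pack( 'H*', $hex ) ];  # hex text -> raw bytes
    }
    return @sigs;
}

# Hypothetical helper: return the name of the first signature that occurs
# in $data, or an empty list/undef when none matches. index() does a plain
# substring search, so no quotemeta is needed.
sub find_signature {
    my ( $data, @sigs ) = @_;
    for my $sig (@sigs) {
        return $sig->[0] if index( $data, $sig->[1] ) >= 0;
    }
    return;
}

# Demo with one entry taken from the sample list quoted in the thread.
my $sig_text = "1210=c474f02e803e2f040175\n";
open my $fh, '<', \$sig_text or die "Cannot open in-memory file: $!";
my @sigs = load_signatures($fh);
close $fh;

# Made-up test data: two leading bytes, the signature bytes, two NULs.
my $sample = "MZ" . pack( 'H*', 'c474f02e803e2f040175' ) . "\0\0";
my $hit = find_signature( $sample, @sigs );
print defined $hit ? "$hit found\n" : "clean\n";
```

To scan real files with this sketch, open each one, call binmode on the handle, slurp it into a string and pass that string to find_signature.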