I haven't had time to test my theory, so I didn't respond. Since he had
said text and binary, my thought was that the regex would not match
past the first linefeed and would need to be updated accordingly.
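A minimal sketch for checking that theory, using a made-up sample string (the headers and trailing bytes here are assumptions, not the real capture). In Perl a literal byte sequence can match anywhere in the string, line feeds included; it is the "." metacharacter that stops at "\n" unless the /s modifier is given.

```perl
use strict;
use warnings;

# Hypothetical sample: two text lines, a blank line, then the gzip signature.
my $data = "Header: one\r\nHeader: two\r\n\r\n\x1F\x8B\x08\x00rest";

# A literal byte pattern needs no modifier to match after a line feed.
my $literal = $data =~ /\x1F\x8B\x08/ ? 1 : 0;

# "." does not match "\n", so this cannot reach the signature...
my $dot_plain = $data =~ /Header: one.*\x1F\x8B/ ? 1 : 0;

# ...unless /s lets "." match line feeds as well.
my $dot_s = $data =~ /Header: one.*\x1F\x8B/s ? 1 : 0;

print "literal=$literal dot_plain=$dot_plain dot_s=$dot_s\n";
```

This prints literal=1 dot_plain=0 dot_s=1, so a pattern made only of literal bytes should not need updating for line feeds.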
On 2/15/2014 6:33 AM, sisyph...@optusnet.com.au wrote:
Hi Greg,
This list is all but dead -- it may be that you and I are the only
people receiving mail from it.
Much better, IMO, to post these types of questions to perlmonks.
Anyway ... this might help:
#################################
use strict;
use warnings;
my $str = "\x1F\x8B\x08";
print "String contains: $str\n";
open WR, '>', 'file.bin' or die $!;
binmode WR;
print WR $str;
close WR or die $!;
undef $/;
open RD, '<', 'file.bin' or die $!;
binmode RD;
my $contents = <RD>;
close RD or die $!;
if($contents =~ /$str/){print "ok 1\n"}
# To safeguard against presence of
# metacharacters in $str:
if($contents =~ /\Q$str\E/){print "ok 2\n"}
######################################
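A small sketch of why the \Q...\E safeguard matters for binary search strings. The needle and haystack below are made up for illustration; the point is that some byte values are regex metacharacters. index() is shown as an alternative that never interprets its argument as a pattern.

```perl
use strict;
use warnings;

# Made-up needle: the bytes 0x2E 0x2B, which are "." and "+" in ASCII.
my $needle = "\x2E\x2B";

# Made-up haystack that does NOT contain those two bytes.
my $haystack = "\x00\x41\x41\x42";

# Interpolated as-is, the needle becomes the pattern /.+/ and matches anything.
my $unquoted = $haystack =~ /$needle/ ? 1 : 0;

# \Q...\E escapes the metacharacters, so only the literal bytes can match.
my $quoted = $haystack =~ /\Q$needle\E/ ? 1 : 0;

# index() does a plain substring search; -1 means "not found".
my $offset = index($haystack, $needle);

print "unquoted=$unquoted quoted=$quoted index=$offset\n";
```

This prints unquoted=1 quoted=0 index=-1: the unquoted match is a false positive.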
Cheers,
Rob
*From:* Greg VisionInfosoft <mailto:gai...@visioninfosoft.com>
*Sent:* Saturday, February 15, 2014 9:41 AM
*To:* Perl-Win32-Users@listserv.ActiveState.com
<mailto:Perl-Win32-Users@listserv.activestate.com>
*Subject:* trouble doing regex in file containing both ascii and
binary content
i cant figure out what im doing wrong here.
i ran wireshark to monitor a small http client/server query/response.
point of exercise is to see exactly what an ajax response looks like
(as im trying to learn ajax).
unfortunately, the ajax response is sent from server in 'gzip' format
(not plain text).
so wireshark shows two standard http headers and at the end of the
stream is the binary 'gzipped' small stream.
ive saved this wireshark tcp 'stream' to a file. viewing the file in
hex mode, i see clearly the first three binary bytes of the gzipped
stream are hex1F hex8B hex08
what i need to do next is save just the binary gzipped stream to a
stand alone file, then see if i can un-gzip it to read the plain text
contents.
in theory, a straightforward task.
i write a quick few line perl script, whereby i open the saved
wireshark tcp stream file, set this input file to binary mode (so as
to not change any internal binary byte values), undefine the input
line separator (to slurp the entire file into memory when read), read
the file to slurp its contents into a var, do a simple pattern match
of \x1F\x8B\x08, then save the matched pattern $& and what follows the
match $' to a new file... (right now the script doesnt actually yet
output to a file, it just dumps to screen)
for reasons that elude me, the pattern match fails.
i know the 3 bytes are in the file, yet the pattern match to those 3
bytes fails.
any ideas?
heres the small script.
open(IN, $ARGV[0]) || die "cant open input file";
binmode(IN);
undef $/;
my $data = <IN>;
if ($data =~ /\x1F\x8B\x08/) {
print "matched: " . $& . $';
} else {
print "no match\n";
}
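A hedged sketch of the extract-and-un-gzip step described above, using index/substr instead of $& and $', plus the core IO::Uncompress::Gunzip module. The capture here is built in memory from made-up headers and a made-up body (assumptions, so the example runs on its own); it is not the real wireshark file.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Stand-in for the saved capture: hypothetical headers plus a gzipped body.
my $plain   = "<p>hypothetical ajax response</p>";
my $headers = "HTTP/1.1 200 OK\r\nContent-Encoding: gzip\r\n\r\n";
my ($gz, $text);
gzip(\$plain => \$gz) or die "gzip failed: $GzipError";
my $capture = $headers . $gz;

# Locate the gzip signature (1F 8B) followed by the deflate method byte (08).
my $pos = index($capture, "\x1F\x8B\x08");
die "gzip signature not found" if $pos < 0;

# Everything from the signature onward is treated as the compressed stream.
my $stream = substr($capture, $pos);

# Decompress from one in-memory buffer into another.
gunzip(\$stream => \$text) or die "gunzip failed: $GunzipError";
print "signature at offset $pos, decoded: $text\n";
```

With a real file, $capture would come from a handle opened with binmode, and $stream could be written to another handle that is also in binmode before un-gzipping it.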
the contents of the wireshark stream is as follows...
POST /ajax/demo_post.asp HTTP/1.1
Host: www.w3schools.com
Connection: keep-alive
Content-Length: 0
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36
Origin: http://www.w3schools.com
Accept: */*
Referer: http://www.w3schools.com/ajax/tryajax_post.htm
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cookie: ASPSESSIONIDAASDBBTC=BFEPJKCDLGDHEEOJIKANOEHP
HTTP/1.1 200 OK
Cache-Control: private,public
Content-Type: text/html
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/7.5
X-Powered-By: ASP.NET
Date: Fri, 14 Feb 2014 21:03:48 GMT
Content-Length: 201
.............`.I.%&/m.{.J.J..t...`.$..@.........iG#).*..eVe]f.@......{....{....;.N'...?\fd.l..J...!....?~|.?"......&.V.6_..U..u...y...t........./_.I.y;.f..wWG.qBo..
..Q.www.....~..h......./......h.c...
note; the binary data at end is obviously not easily discerned here in
ascii mode. when i open this same file in a binary editor the actual
binary contents (displayed in hex) is as follows... (ive inserted an
extra space to make the hex values be easily discerned).
1f 8b 08 00 00 00 00 00 04 00 ed bd 07 60 1c 49 96 25 26 2f 6d ca 7b
7f 4a f5 4a d7 e0 74 a1 08 80 60 13 24 d8 90 40 10 ec c1 88 cd e6 92
ec 1d 69 47 23 29 ab 2a 81 ca 65 56 65 5d 66 16 40 cc ed 9d bc f7 de
7b ef bd f7 de 7b ef bd f7 ba 3b 9d 4e 27 f7 df ff 3f 5c 66 64 01 6c
f6 ce 4a da c9 9e 21 80 aa c8 1f 3f 7e 7c 1f 3f 22 1e af 8e de cc 8b
26 9d 56 cb 36 5f b6 e9 55 d6 a4 75 fe 8b d6 79 d3 e6 b3 74 dd 14 cb
8b b4 9d e7 e9 cb 2f 5f bf 49 17 79 3b af 66 e3 c7 77 57 47 bf 71 42
6f be b2 0d b3 f6 51 ba 77 77 77 ff ee de ce ee 7e ba ff 68 e7 de a3
fd 87 e9 cb 2f d0 f4 ff 01 a8 9f 68 15 63 00 00 00
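A sketch for inspecting bytes from within Perl rather than in a separate editor: pack 'H*' turns spaced hex text like the dump above into raw bytes, and unpack '(H2)*' goes the other way. The helper names hex_to_bytes and bytes_to_hex are hypothetical; only the first ten bytes of the dump are used.

```perl
use strict;
use warnings;

# Hypothetical helper: spaced hex text -> raw bytes.
sub hex_to_bytes {
    my ($hex) = @_;
    $hex =~ s/\s+//g;            # drop the spaces and line breaks
    return pack 'H*', $hex;
}

# Hypothetical helper: raw bytes -> spaced lowercase hex text.
sub bytes_to_hex {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

# First ten bytes of the hex dump shown above.
my $bytes = hex_to_bytes('1f 8b 08 00 00 00 00 00 04 00');

my $starts_with_sig = index($bytes, "\x1F\x8B\x08") == 0 ? "yes" : "no";
print length($bytes), " bytes, starts with gzip signature: $starts_with_sig\n";
print bytes_to_hex(substr($bytes, 0, 3)), "\n";
```

Running bytes_to_hex on the slurped $data from the script would show what the saved stream file really contains at the point where the signature is expected.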
------------------------------------------------------------------------
_______________________________________________
Perl-Win32-Users mailing list
Perl-Win32-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs