RE: can't read in whole file

Perl Thu, 14 Nov 2002 15:33:00 -0800

bingo!

That fixed it!

Thanks again

-----Original Message-----
From: Wagner, David --- Senior Programmer Analyst --- WGO
[mailto:David.Wagner@;freight.fedex.com]
Sent: Thursday, November 14, 2002 4:25 PM
To: 'Perl'; Wagner, David --- Senior Programmer Analyst --- WGO
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: can't read in whole file

        You might try this after the open but before any I/O:

          binmode SEARCH_FILE;

      then do your I/O.
Wags ;)
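
As a minimal sketch of the suggestion above, the read can be wrapped so that binmode is applied right after the open and before the first read. The helper name slurp_binary, the lexical filehandle, and the three-argument open are assumptions made for illustration; the thread itself uses the bareword SEARCH_FILE handle.

```perl
use strict;
use warnings;

# Read an entire file as raw bytes and return it as one scalar.
# slurp_binary is a hypothetical helper name, not something from the thread.
sub slurp_binary {
    my ($path) = @_;

    open my $fh, '<', $path or die "Failed to open '$path': $!";
    binmode $fh;          # after the open, before the first read
    local $/ = undef;     # no record separator: one read returns everything
    my $data = <$fh>;
    close $fh;

    return defined $data ? $data : '';
}

# Usage: compare what was read against the size on disk.
if (@ARGV) {
    my $data = slurp_binary($ARGV[0]);
    printf "%s: read %d bytes, size on disk %d bytes\n",
        $ARGV[0], length $data, -s $ARGV[0];
}
```

If the two numbers printed agree, the whole file was read. The returned scalar can then be searched with a regex exactly as before.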

-----Original Message-----
From: Perl [mailto:perl@;codyartsupply.com]
Sent: Thursday, November 14, 2002 15:10
To: Wagner, David --- Senior Programmer Analyst --- WGO
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: can't read in whole file

Thanks, it's gotten better, but not completely reading yet.

It now loads the text-editor opened-and-resaved version of the spreadsheet
as a full 10752 characters long.
.....which is good.

Unfortunately, the MS-Works spreadsheet files still get truncated right where
they did before.

I think it must be semi-binary data - these files weren't intended for
as-text reads.

I figured if old Notepad can load it completely and have what I need to
match visible as plain text amongst the garble, then Perl can do it no
sweat.  I'm sure it can of course, but the trouble is figuring out how....

Can I load it as a binary, then convert it to a string without it getting
truncated then?

I'm pretty new to Perl.

In case it's just a really-dumb-bug(tm) I've created in the code, here's
that section:

============================================
foreach my $fileName (@list) {
        chomp($fileName);

        if (open SEARCH_FILE, "< $fileName") {

                local $/ = undef;
                my $searchData = <SEARCH_FILE>;

                close SEARCH_FILE;

                print "$fileName is ";
                print length($searchData);
                print " characters long.<br>\n";

                if (length($searchData) < 10) {
                        print "hey, and it is: '$searchData'\n<br>";
                }

                if ($searchData =~ /(\w*\s*\w*$search\w*\s*\w*)/i) {
                        push @matchFileList, $fileName;
                        push @matchTextList, $1;
                        $matches++;
                }
        } else {
                print "Failed to open file '$fileName' $!\n<br>";
        }
}
==================================================

Thanks!

-----Original Message-----
From: Wagner, David --- Senior Programmer Analyst --- WGO
[mailto:David.Wagner@;freight.fedex.com]
Sent: Thursday, November 14, 2002 3:11 PM
To: 'Perl'; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: can't read in whole file

        Replace         local $_ = undef; with local $/ = undef;
   This undefines the input record separator, which is what you want when you
want to slurp a whole file into a scalar.

Wags ;)
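
To make the difference between the two variables concrete, here is a small sketch. It reads from an in-memory filehandle (an assumption made so the example runs without any real file), once with $_ undefined and once with $/ undefined. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for a real file, so this runs anywhere.
my $text = "line one\nline two\nline three\n";

# local $_ = undef leaves $/ at its default ("\n"),
# so a scalar-context read still returns only the first line.
my $first_only = do {
    open my $fh, '<', \$text or die "open failed: $!";
    local $_ = undef;
    <$fh>;
};

# local $/ = undef removes the record separator,
# so the same read returns the whole file as one string.
my $everything = do {
    open my $fh, '<', \$text or die "open failed: $!";
    local $/ = undef;
    <$fh>;
};

print "with \$_ undefined: ", length $first_only, " characters\n";
print "with \$/ undefined: ", length $everything, " characters\n";
```

Using local inside a block means the default record separator comes back as soon as the block ends, so line-by-line reads elsewhere in the script are not affected.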

-----Original Message-----
From: Perl [mailto:perl@;codyartsupply.com]
Sent: Thursday, November 14, 2002 13:58
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: can't read in whole file

Hi Paul,

Thank you for taking the time to help.

>>      foreach my $fileName (@list) {
>>              chomp($fileName);
>>              my $duhFileName = $fileName;
>>                      #okay, I need help with scope too :-)
>
>You sure you didn't just misspell the variable name?
>It's case sensitive.....

With the '$duhFileName' scope problem/workaround, it 'went away' on my
newest attempt so I'm not sure why it was there.  I had copied and pasted it
to ensure the names were identical... oh well I'll figure that one out when
I inadvertently repeat it...

>
>>              if (open SEARCH_FILE, "< $fileName") {
>>                      my $searchData = join "", <SEARCH_FILE>;
>>                      close SEARCH_FILE;
>
>ok, $searchData should be the whole file, but maybe it's the line-based
>read you're doing. A less attractive but more efficient way is
>
>  if (open SEARCH_FILE, "< $fileName") {
>    local $_ = undef;
>    $searchData = <SEARCH_FILE>;
>  }
>
>That slurps it all in as one big scalar read.

okay, I switched it to:

if (open SEARCH_FILE, "< $fileName") {
        local $_ = undef;
        my $searchData = <SEARCH_FILE>;

.......and I get the same problem:

I added:

        print "$fileName is ";
        print length($searchData);
        print " characters long.<br>\n";

and the output on a text length test reads:

C:\[clip]\retail\Account Purchases Spreadsheet.wks is 179 characters long.
C:\[clip]\retail\Account Purchases Spreadsheet.xlr is 6 characters long.
C:\[clip]\retail\All Brush Prices Printout.xlr is 6 characters long.

From Windows, the file properties show the file sizes (not disk space used)
are:

Account Purchases Spreadsheet.wks is 1.54 KB (1,587 bytes)
Account Purchases Spreadsheet.xlr is 10.5 KB (10,752 bytes)
All Brush Prices Printout.xlr is 51.0 KB (52,224 bytes)

....... this is what is messing up my head.

I can dump the data from the first file into my code editor, and save it as
text, and get a new size:

C:\[clip]\retail\Account Purchases Spreadsheet.txt is 561 characters long.

It is indeed pulling the first 6 characters 'DIa!?' correctly - but just
cutting it off there.

Do I need to read it as a binary file or something, and convert it to ASCII?
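
One plausible cause, which is a guess and not something confirmed in this thread, is that a text-mode read on Windows stops at the first Ctrl-Z byte (0x1A), which is treated there as an end-of-file marker. A small diagnostic sketch can check this by reading the raw bytes and reporting where the first such byte sits. The helper name first_ctrl_z is hypothetical.

```perl
use strict;
use warnings;

# Return the offset of the first Ctrl-Z (0x1A) byte in a file, or -1 if none.
# first_ctrl_z is a hypothetical helper name used only for this sketch.
sub first_ctrl_z {
    my ($path) = @_;

    open my $fh, '<', $path or die "Failed to open '$path': $!";
    binmode $fh;                 # raw bytes, no text-mode translation
    local $/ = undef;            # read the whole file in one go
    my $data = <$fh>;
    close $fh;

    $data = '' unless defined $data;
    return index($data, "\x1A");
}

# Usage: pass one or more file names on the command line.
for my $path (@ARGV) {
    my $offset = first_ctrl_z($path);
    printf "%s: %d bytes on disk, first 0x1A at offset %d\n",
        $path, -s $path, $offset;
}
```

If the reported offset matches the truncated length from the text-mode read (for example 6 for the .xlr files above), that would support the guess. An offset of -1 means the file has no such byte and the cause lies elsewhere.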

>>
>> if ($searchData =~ /(\w*\s*\w*$search\w*\s*\w*)/i) {
>
>ouch. \w*\s*\w*$search\w*\s*\w* ??
>
>That's zero or more word chars followed by zero or more whitespaces
>followed by zero or more word chars followed by the search pattern
>followed by zero or more word chars followed by zero or more
>whitespaces followed by zero or more word chars. Is there a compelling
>reason you can't just say /$search/i and then use "$`$&$'" ???? I
>realize it's not exactly the same, but all those asterisks work the
>regex engine pretty hard.....

yeah, it's ugly.  my goal is to grab a word or two just before and just
after the matched text to display the context within which the search found
the info.

Any optimizing tips are appreciated.  It's not production or cgi code, so
while I want to do it the best way, I'm trying to just get better results
than the pitiful file search tools built into XP (don't laugh)

>
>I don't suppose /([\w\s]+$search[\w\s]+)/i is specific enough either?
>
>Sorry, that may sound critical, which I don't mean. It's just a lot of
>backtracking if it isn't absolutely necessary.
>

hey, critical is good.
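
For the "word or two of context" goal, one possible sketch is to bound the repetition explicitly instead of chaining open-ended \w* and \s* runs. It assumes $search is meant as literal text (hence \Q...\E) and that two words on each side is enough; both are assumptions, and match_with_context is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return the first match of $search in $data with up to two words of
# context on either side, or undef when there is no match.
# match_with_context is a hypothetical helper name for this sketch.
sub match_with_context {
    my ($data, $search) = @_;

    # \Q...\E treats $search as literal text, not as a regex.
    # (?:\w+\s+){0,2} allows at most two whole words before the match;
    # (?:\s+\w+){0,2} allows at most two whole words after it.
    if ($data =~ /((?:\w+\s+){0,2}\w*\Q$search\E\w*(?:\s+\w+){0,2})/i) {
        return $1;
    }
    return;
}

my $sample  = "Invoice for sable brushes shipped on Friday to the retail store";
my $context = match_with_context($sample, "brush");
print defined $context ? "Found: '$context'\n" : "No match\n";
# prints: Found: 'for sable brushes shipped on'
```

The {0,2} bounds cap how much context is captured. This is not identical to the original pattern, which allowed only one word on each side.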

Any thoughts on this partial file read?  It's bending my brain out of whack.

Thanks

>--
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]

**********************************************************
This message contains information that is confidential
and proprietary to FedEx Freight or its affiliates.
It is intended only for the recipient named and for
the express purpose(s) described therein.
Any other use is prohibited.
****************************************************************
