Jay Savage wrote:
On 4/12/05, John W. Krahn <[EMAIL PROTECTED]> wrote:

while (!eof) {

You need to specify the filehandle you are reading from.

A bare eof operates on the last filehandle read from; see perldoc -f eof. The only place I'm going to get EOF from the OS here is IMG, and it
works as expected.

OK, sorry. The only time I've used eof is when processing files through <> to reset the $. variable. Generally there is no need to use eof at all.
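
For reference, the <> idiom mentioned above looks roughly like the sketch below, which follows the example in perldoc -f eof. The sub name numbered_lines is made up for illustration and is not from the original script.

```perl
use strict;
use warnings;

# Return each line of the named files prefixed with "file:line:".
# Closing ARGV at the end of every file resets $. so that line
# numbers restart at 1 for the next file.
# Note the bare "eof" (no parentheses): it tests the filehandle last
# read, whereas "eof()" is true only at the end of the last file.
sub numbered_lines {
    local @ARGV = @_;    # hypothetical: files passed as arguments
    my @out;
    while (my $line = <>) {
        push @out, "$ARGV:$.:$line";
    }
    continue {
        close ARGV if eof;
    }
    return @out;
}
```

Calling print numbered_lines('a.txt', 'b.txt') would print both files with per-file line numbers. Without the close, $. would keep counting across files.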


The second problem is that $eofsec contains multiple bytes and may (and
probably does) overlap $sector boundaries.  The easy solution would be to read
the entire file into a variable.

I thought about this, but everything I've read assumes that the files begin on sector boundaries.

That is usually the case, but are you sure that the "sectors" are 512 bytes in length?


The code I borrowed from only examined the
first 4 bytes of each read, and the regex works anchored at the
beginning of the string.

Yes, the $magic regex does, but I was talking about the $eofsec regex.


If I were only missing a few files, I'd say
this was it.  But I'm not picking up any.  All the EOFs can't be split
across boundaries, can they?  Or can they?

I would say that statistically it is impossible, but I'm no expert. :-)


It's worth checking into
again.  I was hoping to avoid reading the entire image into memory,
though.
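
One way to catch a multi-byte marker that straddles a read boundary without slurping the whole image is to carry the tail of each buffer over into the next read. This is only a sketch under assumptions: find_marker_offsets is a hypothetical helper, and the marker and chunk size are whatever the caller passes in (for example the contents of $eofsec and a 512-byte sector).

```perl
use strict;
use warnings;

# Return the byte offset of every occurrence of $marker in the stream,
# reading $chunk_size bytes at a time. The last length($marker) - 1
# bytes of each buffer are kept for the next pass, so a marker split
# across two reads is still found, and never reported twice.
# The caller is expected to have called binmode on $fh.
sub find_marker_offsets {
    my ($fh, $marker, $chunk_size) = @_;
    my $keep = length($marker) - 1;
    my $buf  = '';
    my $base = 0;    # file offset of the first byte in $buf
    my @offsets;

    while (read($fh, my $chunk, $chunk_size)) {
        $buf .= $chunk;
        my $pos = 0;
        while (($pos = index($buf, $marker, $pos)) >= 0) {
            push @offsets, $base + $pos;
            $pos++;
        }
        # Discard everything except the tail that could begin a marker.
        my $drop = length($buf) - $keep;
        if ($drop > 0) {
            $base += $drop;
            $buf = substr($buf, $drop);
        }
    }
    return @offsets;
}
```

Memory use stays at roughly one chunk plus a few bytes, regardless of the size of the image.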

Are you *certain* that the values of $magic and $eofsec are correct? I ask because the JPEG FAQ says this:


[11]  How do I recognize which file format I have, and what do I do about it?

If you have an alleged JPEG file that your software won't read, it's likely
to be HSI format or some other proprietary JPEG-based format.  You can tell
what you have by inspecting the first few bytes of the file:

1.  A JFIF-standard file will start with the four bytes (hex) FF D8 FF E0,
    followed by two variable bytes (often hex 00 10), followed by 'JFIF'.

2.  If you see FF D8 at the start, but not the 'JFIF' marker, you may have a
    "raw JPEG" file.  This is probably decodable as-is by JFIF software ---
    it's worth a try, anyway.

3.  HSI files start with 'hsi1'.  You're out of luck unless you have HSI
    software.  Portions of the file may look like plain JPEG data, but they
    won't decompress properly with non-HSI programs.

4.  A Macintosh PICT file, if JPEG-compressed, will have several hundred
    bytes of header (often 726 bytes, but not always) followed by JPEG data.
    Look for the 3-byte sequence (hex) FF D8 FF --- the text 'Photo - JPEG'
    will usually appear shortly before this header, and 'JFIF' or 'AppleMark'
    will usually appear shortly after it.  Strip off everything before the
    FF D8 FF and you should be able to decode the file.

5.  Anything else: it's a proprietary format, or not JPEG at all.  If you are
    lucky, the file may consist of a header and a raw JPEG data stream.
    If you can identify the start of the JPEG data stream (look for FF D8),
    try stripping off everything before that.


Also the standard EOI (End Of Image) marker is "\xFF\xD9".
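
As a rough illustration of the byte sequences above, the following sketch classifies a buffer by its leading bytes (FAQ items 1, 2, 3 and 5) and checks for a trailing EOI marker. The sub names are made up for this example, and the patterns are not necessarily what $magic and $eofsec should contain.

```perl
use strict;
use warnings;

# Classify a byte string by how it starts, per the FAQ excerpt above.
sub classify_header {
    my ($bytes) = @_;
    # FF D8 FF E0, two variable bytes, then 'JFIF'
    return 'JFIF'     if $bytes =~ /\A\xFF\xD8\xFF\xE0..JFIF/s;
    # FF D8 without the JFIF marker
    return 'raw JPEG' if $bytes =~ /\A\xFF\xD8/;
    return 'HSI'      if $bytes =~ /\Ahsi1/;
    return 'unknown';
}

# True (1) if the byte string ends with the EOI marker FF D9.
sub ends_with_eoi {
    my ($bytes) = @_;
    return $bytes =~ /\xFF\xD9\z/ ? 1 : 0;
}
```

The PICT case (item 4) is left out because the JPEG data there does not start at a fixed offset.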



John
--
use Perl;
program
fulfillment
