From: Shawn McKenzie
> Andrew Ballard wrote:
>> On Tue, Jan 5, 2010 at 10:20 AM, Ashley Sheridan wrote:
>>> On Tue, 2010-01-05 at 16:18 +0100, Daniel Egeberg wrote:
>>>> On Tue, Jan 5, 2010 at 16:09, Shawn McKenzie wrote:
>>>>> Of course this doesn't work for something like
>>>>> or 'archive_v2.0.1.tar.gz', but I don't know what will, with the
>>>>> extension being variable length and possibly containing multiple periods.
>>>> I suppose a solution to that could be having a list of known file
>>>> extensions and use that while falling back to one of the methods
>>>> in this thread if there is no match in the list. Of course you
>>>> then have to check .tar.gz before .gz if you're just iterating
>>>> a list. You might also just choose the longest match (in terms of
>>>> number of periods).
>>> That was my thought on how operating systems did it. If it can't
>>> find a matching pattern, then it can fall back to matching anything
>>> after the last period.
>>> It always puzzles me, because some.archive.tar.gz is a valid file,
>>> the extension is .tar.gz and the filename is some.archive. I guess it
>>> must compare the full filename to a list of knowns, and then try its
>>> best after that.
>> While relying on a file's extension to determine the type of its
>> contents is dangerous, isn't archive.tar.gz just a gzip'd file that
>> happens to contain a tarball named archive.tar? In that case, wouldn't
>> the correct extension just be .gz?
>> Andrew
> Yes, or .tgz sometimes.
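
For what it's worth, the known-extension idea from earlier in the thread could be sketched roughly like this. It is written in Python only to keep it short. The KNOWN_EXTENSIONS list and the split_extension name are made up for illustration, and "longest" here means longest in characters rather than number of periods.

```python
import os

# Hypothetical list of known extensions; a real one would be much longer.
KNOWN_EXTENSIONS = [".tar.gz", ".tar.bz2", ".tgz", ".gz"]


def split_extension(filename):
    """Return (name, extension), preferring the longest known extension."""
    lowered = filename.lower()
    # Trying longer candidates first means .tar.gz is checked before .gz.
    for ext in sorted(KNOWN_EXTENSIONS, key=len, reverse=True):
        # Require a non-empty name so a file called ".gz" is left alone.
        if lowered.endswith(ext) and len(filename) > len(ext):
            return filename[:-len(ext)], filename[-len(ext):]
    # Fallback: everything after the last period.
    return os.path.splitext(filename)


print(split_extension("archive_v2.0.1.tar.gz"))  # ('archive_v2.0.1', '.tar.gz')
print(split_extension("some.archive.tar.gz"))    # ('some.archive', '.tar.gz')
print(split_extension("notes.txt"))              # ('notes', '.txt')
```

Sorting by length avoids having to keep the list in a particular order by hand, at the cost of a sort on every call.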

Most of the time .tgz indicates more than just a gzipped tar file. It
has always been used by Slackware to indicate specific control files
have been added for their package manager.

But that has always been the danger of using the extension instead of
the magic number to identify the file type. Too many extensions have
been overloaded.
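
As a rough illustration of checking the magic number instead, here is a minimal sketch in Python. The signature table is deliberately tiny and the names sniff_type and sniff_file are hypothetical; a real tool would use a full magic database.

```python
import gzip

# A few well-known signatures: (offset, magic bytes, label).
SIGNATURES = [
    (0, b"\x1f\x8b", "gzip"),
    (0, b"BZh", "bzip2"),
    (0, b"PK\x03\x04", "zip"),
    (257, b"ustar", "tar"),
]


def sniff_type(data):
    """Guess a type from the leading bytes, ignoring the file name entirely."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None


def sniff_file(path):
    # 512 bytes covers one tar header block, enough for every signature above.
    with open(path, "rb") as handle:
        return sniff_type(handle.read(512))


sample = gzip.compress(b"hello")
print(sniff_type(sample))         # gzip, whatever the file happens to be called
print(sniff_type(b"plain text"))  # None
```

Note that a .tar.gz only shows the gzip signature from the outside; the tar header becomes visible after decompressing. In PHP, the Fileinfo extension does this kind of content sniffing for you.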

Bob McConnell

PHP General Mailing List