read this better when I get back home.
On 6 Oct 2015 15:03, "Dan Bron" <j...@bron.us> wrote:

> Bjorn asked:
> > I am thinking about if it is possible to get a util that
> > can read parts or whole files and decide what they are.
>
> Raul wrote:
> > If you are on linux, the 'file' utility will do it.
>
> Yes, and if you are not on Linux (or BSD, OS X, etc), you can still get
> your hands on the “magic files”, which are plain-text files describing the
> patterns used to determine a file’s type*. Just Google for them.
>
> For example, on my Mac, there is a file /usr/share/file/magic/dyadic which
> identifies Dyalog APL workspaces and component files (I was fairly
> surprised to find this installed by default on OSX!).
>
>
> #------------------------------------------------------------------------------
> # $File: dyadic,v 1.4 2009/09/19 16:28:09 christos Exp $
> # Dyadic: file(1) magic for Dyalog APL.
> #
> 0       byte    0xaa
> >1      byte    <4              Dyalog APL
> >>1     byte    0x00            incomplete workspace
> >>1     byte    0x01            component file
> >>1     byte    0x02            external variable
> >>1     byte    0x03            workspace
> >>2     byte    x               version %d
> >>3     byte    x               .%d
>
> This says: if the first byte of a file is 170 (i.e. 0xAA), and the 2nd
> byte of the file is less than 4, then you’ve got a Dyalog APL object. If
> that pattern doesn’t match, “file” will know it’s got something other than
> a Dyalog APL object, so it will move on and try out the next magic file
> pattern.
>
> If that pattern does match, however, the following lines help identify the
> kind of Dyalog APL object more specifically.
>
> If the 2nd byte (which must be less than 4) is zero, then it’s an
> “incomplete workspace”; if one, then a “component file”; if two, then an
> “external variable”; if three, then a (not-incomplete) “workspace”.
>
> Again, if the initial test about (firstByte=170) *. (secondByte<4)
> matched, and we know we’re dealing with a Dyalog APL object, then the 3rd
> and 4th bytes will give the major and minor versions of the interpreter
> which created it, respectively.
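To make the rule above concrete, here is a minimal Python sketch of the same test. The names describe_dyalog and DYALOG_KINDS are made up for illustration and are not part of "file". The sketch assumes at least four header bytes are available and treats the second byte as unsigned, which may differ from how "file" itself compares bytes; the string it returns is only an approximation of what "file" prints.

```python
from typing import Optional

# Labels for the second byte, copied from the quoted magic file.
DYALOG_KINDS = {
    0x00: "incomplete workspace",
    0x01: "component file",
    0x02: "external variable",
    0x03: "workspace",
}


def describe_dyalog(header: bytes) -> Optional[str]:
    """Return a file(1)-style description, or None if the rule does not match.

    Hypothetical helper: it mirrors the quoted magic entries only, under
    the assumptions stated in the text above.
    """
    if len(header) < 4:
        # Simplification: require all four bytes the rule looks at.
        return None
    if header[0] != 0xAA or header[1] >= 4:
        # First byte must be 170 and second byte must be less than 4.
        return None
    kind = DYALOG_KINDS[header[1]]
    # Third and fourth bytes are the major and minor version numbers.
    return f"Dyalog APL {kind} version {header[2]}.{header[3]}"
```

To try it on a real file, open the file in binary mode, read its first four bytes and pass them to describe_dyalog. A result of None means this particular rule did not match, so the next pattern would be tried.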
>
> Bjorn wrote:
> > I know extensions are indications of what they are.
>
> Worth pointing out, pragmatically speaking: if a file’s type is not
> self-evident on your OS, or file extensions are insufficient or
> misleading clues often enough that you need to use “file” with some
> frequency, it might be more productive to identify the root cause of that
> issue, rather than re-implementing the utility.
>
> I suppose one use case for “file” is increasing one’s confidence that a
> file one downloaded from a not-perfectly-trustworthy source is indeed what
> it advertises itself to be…
>
>
> -Dan
>
> * Please note these “magic file tests” are applied at a specific point in
> the utility’s workflow, after some preliminary tests at a higher level.
>
> So the files are useful, but not completely sufficient. If you can’t use
> “file” directly, and want to reimplement it, you’ll have to reimplement
> some of these preliminary tests as well.
>
> A good place to start is the manpage for file, followed by its source code
> (if you really want to get into it).
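As a rough illustration of the preliminary tests mentioned in the footnote: the manpage for "file" describes filesystem tests, based on stat(2), that run before the magic tests. The Python sketch below shows that kind of check. The function name preliminary_type and the label strings are hypothetical and are not the exact output of "file"; the list of cases is an assumption, not a complete reimplementation.

```python
import os
import stat
from typing import Optional


def preliminary_type(path: str) -> Optional[str]:
    """Classify a path from filesystem metadata alone (hypothetical helper).

    Returns a label when the stat information already settles the question,
    or None when the file contents still need to be examined.
    """
    st = os.lstat(path)  # lstat, so a symbolic link is reported as a link
    mode = st.st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if st.st_size == 0:
        return "empty"
    return None  # regular, non-empty file: go on to the magic tests
```

Only when this returns None would a reimplementation go on to read the leading bytes of the file and try the magic patterns.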
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
