read this better when I get back home.

On 6 Oct 2015 15:03, "Dan Bron" <j...@bron.us> wrote:
> Bjorn asked:
> > I am thinking about if it is possible to get a util that
> > can read parts or whole files and decide what they are.
>
> Raul wrote:
> > If you are on linux, the 'file' utility will do it.
>
> Yes, and if you are not on Linux (or BSD, OS X, etc), you can still get
> your hands on the “magic files”, which are plain-text files describing the
> pattern which is used to determine a file’s type*. Just Google for them.
>
> For example, on my Mac, there is a file /usr/share/file/magic/dyadic which
> identifies Dyalog APL workspaces and component files (I was fairly
> surprised to find this installed by default on OSX!).
>
>
> #------------------------------------------------------------------------------
> # $File: dyadic,v 1.4 2009/09/19 16:28:09 christos Exp $
> # Dyadic: file(1) magic for Dyalog APL.
> #
> 0     byte  0xaa
> >1    byte  <4    Dyalog APL
> >>1   byte  0x00  incomplete workspace
> >>1   byte  0x01  component file
> >>1   byte  0x02  external variable
> >>1   byte  0x03  workspace
> >>2   byte  x     version %d
> >>3   byte  x     .%d
>
> This says: if the first byte of a file is 170 (i.e. 0xAA), and the 2nd
> byte of the file is less than 4, then you’ve got a Dyalog APL object. If
> that pattern doesn’t match, “file” will know it’s got something other than
> a Dyalog APL object, so it will move on and try out the next magic file
> pattern.
>
> If that pattern does match, however, the following lines help identify the
> kind of Dyalog APL object more specifically.
>
> If the 2nd byte (which must be less than 4) is zero, then it’s an
> “incomplete workspace”; if one, then a “component file”, if two, then an
> “external variable”; if three, then a (not-incomplete) “workspace”.
>
> Again, if the initial test about (firstByte=170) *. (secondByte<4)
> matched, and we know we’re dealing with a Dyalog APL object, then the 3rd
> and 4th bytes will give the major and minor versions of the interpreter
> which created it, respectively.
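
For anyone who cannot run 'file' directly, the test described in the magic
entry quoted above can be sketched in a few lines of Python. This is only an
illustration under assumptions, not how file(1) is implemented: the function
name describe_dyalog and the KINDS table are hypothetical, each byte is
treated as an unsigned value 0..255 (file(1) itself may compare bytes
differently), and none of the preliminary tests that 'file' runs before the
magic patterns are reproduced.

```python
from typing import Optional

# Labels copied from the quoted magic entry, indexed by the value of the 2nd byte.
KINDS = (
    "incomplete workspace",  # 0x00
    "component file",        # 0x01
    "external variable",     # 0x02
    "workspace",             # 0x03
)


def describe_dyalog(header: bytes) -> Optional[str]:
    """Return a description of a Dyalog APL object, or None if no match.

    `header` is the first few bytes of a file (4 are enough).
    Assumption: bytes are compared as unsigned values.
    """
    # Top-level test: first byte is 0xAA and second byte is less than 4.
    if len(header) < 2 or header[0] != 0xAA or header[1] >= 4:
        return None
    text = "Dyalog APL " + KINDS[header[1]]
    # Offsets 2 and 3 hold the major and minor interpreter version.
    if len(header) > 2:
        text += " version %d" % header[2]
    if len(header) > 3:
        text += " .%d" % header[3]
    return text
```

To try it on a real file, something like
describe_dyalog(open(path, "rb").read(4)) should do, where path is whatever
file you want to inspect. A result of None only means this one pattern did
not match; it says nothing about what the file actually is.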
>
> Bjorn wrote:
> > I know extensions are indications of what they are.
>
> Worth pointing out, pragmatically speaking: if a file’s type is not
> self-evident on your OS, or file extensions are insufficient or
> misleading clues often enough that you need to use “file” with some
> frequency, it might be more productive to identify the root cause of that
> issue, rather than re-implementing the utility.
>
> I suppose one use case for “file” is increasing one’s confidence that a
> file one downloaded from a not-perfectly-trustworthy source is indeed what
> it advertises itself to be…
>
>
> -Dan
>
> * Please note these “magic file tests” are applied at a specific point in
> the utility’s workflow, after some preliminary tests at a higher level.
>
> So the files are useful, but not completely sufficient. If you can’t use
> “file” directly, and want to reimplement it, you’ll have to reimplement
> some of these preliminary tests as well.
>
> A good place to start is the manpage for file, followed by its source code
> (if you really want to get into it).
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm