Re: [reiserfs-list] magic is useless Determining File Types

2002-01-07 Thread Russell Coker

On Mon, 7 Jan 2002 10:49, Hans Reiser wrote:
 There is an issue of going completely overboard,
 attribute/subattribute/subsubattribute anybody? This is certainly an
 overall interesting idea. How about file//acl for accessing ACLs? This
 does mean though you *MUST* have a filesystem specific dump tool.

 Yep, we have to improve tar.

Also we must not break the tar file format!!!

Please keep in mind my previous messages on this list regarding LHArc and 
OS/2's EAs when thinking of changes to tar.

The big advantage of tar is that its files can be read on any OS, so no 
matter how much hardware and software I lose, I can still find a way to 
read my tar files!

-- 
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/   Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/ My home page




Re: [reiserfs-list] magic is useless Determining File Types

2002-01-07 Thread Valdis . Kletnieks

On Mon, 07 Jan 2002 17:23:57 +0100, Russell Coker said:
  disk geometry is usually not worth knowing and lied about by the hard
  drive.
 
 I suspect that this is usually the case on mainframes too.  Valdis?

Well.. OK.. you caught me there.  Older IBM disk drives *did* lie
about their geometry only slightly.  For instance, the IBM 3350 disk
drive reported itself as having 555 cylinders and 30 tracks/cylinder,
and a listed 19,069 bytes/track maximum.

In reality, it had 560 cylinders (5 were for dedicated use for
diagnostics and replacements for bad tracks), and only 8 platters
rather than the 15 that 30 tracks would imply.  It was really laid out
as follows: 8 platters, and heads top/bottom of each platter.  The
access arm had 2 sets of heads, an inside and outside.  One pair
of heads (I admit not remembering which pair it was) was the servo
head used for positioning the access arm.  Half the logical cylinder
was on the outside and the other half was inside.

In addition, although the claimed capacity was 19,069 bytes/track,
it depended on the actual block size used, and was as follows:

Blocks/track = 19254/(185+blocksize)  So for various blocking factors:

blocks/track   blocksize   capacity
     1           19069      19069
     2            9442      18884
     3            6233      18699
     4            4628      18512

You get the idea.
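
The 3350 figures above can be reproduced with a short sketch. It assumes
the division is meant to be rounded down (only whole blocks count) and
that capacity is simply blocks per track times block size; the function
name is made up for illustration.

```python
def blocks_per_track_3350(blocksize: int) -> int:
    """Whole blocks that fit on one 3350 track, per the formula quoted above."""
    # 19254 and 185 are the constants from the message. Rounding down
    # (integer division) is an assumption: a partial block does not count.
    return 19254 // (185 + blocksize)


# Print one row per block size: blocks/track, blocksize, usable capacity.
for blocksize in (19069, 9442, 6233, 4628):
    blocks = blocks_per_track_3350(blocksize)
    print(blocks, blocksize, blocks * blocksize)
```

The gap between 19,069 and the smaller capacities is the 185 bytes of
per-block overhead in the formula being paid once for every extra block.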

For the 3380 disk, it was more complicated - 885/1770/2665 cylinders
for the single/dual/triple density versions, and 15 tracks/cyl.
The track capacity was 47,968 bytes.  The space for a block was
equal to (C + K + D) where C was 8, K was 0 for partitioned and sequential
datasets, or the keysize for keyed data sets, and D was equal to
7 + ((blocksize+12)/32) rounded up, and then blocks/track was equal
to 1499/SPACE.
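
As a rough sketch of the 3380 calculation just described: the code below
follows the message literally (C is 8, K is the key size or 0, D is
7 plus (blocksize+12)/32 rounded up, and blocks/track is 1499/SPACE).
Rounding the final division down is an assumption, and the function name
is hypothetical.

```python
def blocks_per_track_3380(blocksize: int, keysize: int = 0) -> int:
    """Blocks per 3380 track using the C + K + D rule quoted above."""
    c = 8                                    # fixed per-block overhead
    k = keysize                              # 0 for partitioned/sequential data sets
    d = 7 + -(-(blocksize + 12) // 32)       # 7 + ceil((blocksize + 12) / 32)
    space = c + k + d                        # space units consumed by one block
    # Assumption: only whole blocks count, so round the division down.
    return 1499 // space


for blocksize in (47476, 23476, 4096):
    print(blocksize, blocks_per_track_3380(blocksize))
```

Under these assumptions a block costs space in 32-byte steps, so block
sizes just past a step boundary waste part of a unit.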

http://www.sdisw.com/dasd_capacity.html - of those, I've personally
worked with 2314, 3350, 3370, 3380A/E/K, 3390-1,2,3, 3370, and 9332.

http://www.naspa.com/PDF/96/T9611005.pdf  gives the geometry calculations
for some of the more common recent count-key-data disks.

http://www.perfassoc.com/papers/pdf_files/volume_sizes_paper_01.pdf talks
about the performance considerations of disk size and related issues.

And yes, on the newer high-end disk subsystems, the disk lies about
its geometry:

http://www.redbooks.ibm.com/pubs/pdfs/redbooks/sg245465.pdf talks about
the IBM 'Shark' ESS disk subsystem, which does a very good job of
acting like an IBM 3380 or 3390 disk drive.  Given that the 33[89]0
is being emulated on either a JBOD or RAID-5, the chances that the
underlying disk is implementing anything resembling the same geometry
are close to zero.

The important thing to note here is that although the disk seen by the
host operating system is being emulated/striped/etc, the host mainframe
is *still* reaping the performance benefits of two major concepts:

1) Using multiple data paths to avoid I/O contention (if a SCSI device
is busy, the whole chain is busy - the IBM mainframe world just finds
one of the OTHER 8 ways to reach the disk).

2) Offloading much of the work to the disk controller (whether it's
a 3990-3 controlling real 3390 disks, or a 2105 with RS6000 processors
controlling a RAID that's emulating 3390's) and letting *it* do all the
work of finding the right record on the disk, and returning an interrupt
when done.

The fact that the disk subsystems have gotten so fast that they can
gloss over the faked geometry without a performance hit is even more
reason to offload more of the work onto them.

/Valdis






Re: [reiserfs-list] magic is useless Determining File Types

2002-01-06 Thread Alexander G. M. Smith

The Amazing Dragon (Elliott Mitchell) [EMAIL PROTECTED] wrote on Sat, 5 Jan 2002 
16:26:21 -0800 (PST):
[...] You need to be able to get *everything* for programs such as
 tar and cp, but then you've got ordinary programs that just want
 their data. My tendency is that the prior suggestion of the
 structured file is the way to go. Open file and you get
 *everything*, open file/data and you get only the data.

And with the XML file object system, the everything is the XML
representation of the file/object and its attributes and sub-objects.

- Alex



Re: [reiserfs-list] magic is useless Determining File Types

2002-01-05 Thread Elliott Mitchell

 From: Hans Reiser [EMAIL PROTECTED]
 Jens Benecke wrote:
 User space: There should be a way of specifying what default MIME type a
 file should get and whether the MIME type should follow the extension (if
 it has one). One thing I didn't like in OS/2 was that sometimes it was
 really difficult to persuade the WPS to open a specific file with another
 viewer by default (eg. hex viewer). Changing the extension didn't help and
 the EAs weren't editable without extra tools.
 
 I think this is the most important thing, that one should be able to 
 cast a file into a different type.  Then one also needs to be able to 
 convince app writers to try to make their apps work on the most generic 
 type possible, which is difficult to do but probably we will have to 
 endure failure at this in return for the benefits of typing and the 
 object orientation it allows.
 
 I think that at this point we just need to focus on completing version 
 4, and it is good to know that people will use its features once they 
 are available.  We are indeed going to give you the flexibility you need 
 to create whatever file/attributes or file/forks you need, though we 
 will do this by making files and directories able to do what you 
 want attributes and resource forks able to do (this is nontrivial).

Yes, once everybody supports it, it will no longer be an issue, however
what should you get when you open the various resources? You need to be
able to get *everything* for programs such as tar and cp, but then you've
got ordinary programs that just want their data. My tendency is that the
prior suggestion of the structured file is the way to go. Open file and
you get *everything*, open file/data and you get only the data.

 ..content-type turns out to be something that needs no specific support 
 in version 4, the generic mechanisms will be enough for it.  Suggestions 
 for alternatives to .. as the style convention for meta-data about an 
 object are welcome.

May I suggest // instead? This is much better for a couple reasons.
First / by itself is already a magic character, and so this doesn't
annoy people by stopping them from creating files with certain names;
same for programs. Second this is usable on directories, ie a file bar
inside a directory foo (foo/bar) is distinct from an attribute bar
on the directory (foo//bar).
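
To make the foo/bar versus foo//bar distinction concrete, here is a
minimal sketch of how a path could be split under this proposal. Nothing
like this exists in any filesystem discussed here; the function name and
the choice to treat a leading // as an attribute of the root directory
are assumptions for illustration only.

```python
from typing import Optional, Tuple


def split_attribute_path(path: str) -> Tuple[str, Optional[str]]:
    """Split 'object//attribute' into (object, attribute).

    Returns (path, None) when the path contains no '//' and therefore
    names an ordinary file or directory.
    """
    obj, sep, attr = path.partition("//")
    if not sep:
        return path, None
    # Assumption: a leading '//' leaves the object part empty, which is
    # read here as an attribute of the root directory.
    return (obj or "/"), attr


print(split_attribute_path("foo/bar"))   # ('foo/bar', None)
print(split_attribute_path("foo//bar"))  # ('foo', 'bar')
```

Only the first // is treated as the separator, so anything after it
(including further slashes) stays inside the attribute name.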

There is an issue of going completely overboard,
attribute/subattribute/subsubattribute anybody? This is certainly an
overall interesting idea. How about file//acl for accessing ACLs? This
does mean though you *MUST* have a filesystem specific dump tool.


--
|\__/|\__/|\__  --= 8-) EHM =--  __/|\__/|\__/|
\||   | [EMAIL PROTECTED]  PGP 8881EF59 |   ||/
  \   \   | __| -O #include stddisclaimer.h O-  |__ |   /   /
\___\_|/82 04 A1 3C C7 B1 37 2A   E3 6E 84 DA 97 4C 40 E6\|_/___/





Re: [reiserfs-list] magic is useless Determining File Types

2002-01-05 Thread Chris Dukes

On Sat, Jan 05, 2002 at 04:26:21PM -0800, The Amazing Dragon wrote:
 May I suggest // instead? This is much better for a couple reasons.
 First / by itself is already a magic character, and so this doesn't
 annoy people by stopping them from creating files with certain names;
 same for programs. Second this is usable on directories, ie a file bar
 inside a directory foo (foo/bar) is distinct from an attribute bar
 on the directory (foo//bar).

So that 'mv /mp3dir/* //newmp3dir' will add my MP3s as attributes to '/'?
<sarcasm>
I LOVE IT!
</sarcasm>

I would rather see a useful rich file format API, and a userspace
filesystem implementation before I see that kind of dinking on reiserfs.
I get enough joy from coders neglecting to quote filenames in shell scripts
as it is.  I'm afraid that sanity checking the file before tacking on
the path to the attributes is beyond their abilities.
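
The worry can be illustrated with a small sketch. Today a doubled slash
in the middle of a path is harmless because POSIX treats it as a single
slash, so careless string concatenation goes unnoticed. If // named an
attribute, the same careless code would silently address something else.
The naive_join helper is hypothetical; posixpath is the Python standard
library module.

```python
import posixpath


def naive_join(directory: str, name: str) -> str:
    """Plain string concatenation, with no check for a trailing slash."""
    return directory + "/" + name


# The directory already ends in '/', so concatenation doubles the slash.
print(naive_join("/mp3dir/", "song.mp3"))        # /mp3dir//song.mp3

# A join that checks for the trailing slash avoids the doubled separator.
print(posixpath.join("/mp3dir/", "song.mp3"))    # /mp3dir/song.mp3

# Normalisation as done today collapses the doubled slash.
print(posixpath.normpath("/mp3dir//song.mp3"))   # /mp3dir/song.mp3
```

Under the // proposal the first result would refer to an attribute named
song.mp3 on /mp3dir rather than to a file inside it.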

-- 
Chris Dukes
Bert is apparently VIL, whereas Oscar is just a sysadmin^Wgrouch.
-- gorski



[reiserfs-list] magic is useless Determining File Types

2002-01-04 Thread Hans Reiser

Alexander G. M. Smith wrote:

Raphael Bosshard [EMAIL PROTECTED] wrote on Wed, 02 Jan 2002 14:51:03 +0100:

What do you think about the filetype being just another
attribute? How difficult would it be to realize? Not so
difficult, in my opinion, but I am most certainly wrong.


BeOS does do file types that way.  The values used are MIME types,
so it's a lot more precise than file extensions (for example, type
text/html is openable by both HTML specific programs and generic
text editors).  But extensions and magic are used by a background
daemon to determine the file type for data that doesn't already
have a MIME type.  Incidentally, their attribute naming convention
uses BEOS:TYPE to identify the file type attribute, since BEOS:
is prepended to all OS specific attributes.  The attributes also
have a simple datatype associated (integer-32, integer-64, float-32,
double-64, plain string, MIME string, etc).  There's also an optional
indexing system for finding files with particular attribute values
very quickly, but that's a topic for another day.
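
A minimal sketch of the lookup order described above: use the type
attribute when present, otherwise fall back to the extension and then to
magic bytes. This is not BeOS code. The order of the two fallbacks, the
two-entry signature table, and the function name are assumptions; only
the BEOS:TYPE attribute name comes from the description.

```python
import mimetypes
from typing import Mapping

# Tiny illustrative signature table; a real magic database is far larger.
MAGIC_SIGNATURES = (
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
)


def determine_type(attributes: Mapping[str, str], filename: str,
                   head: bytes = b"") -> str:
    """Return a MIME type for a file, preferring its stored attribute."""
    # 1. An explicit type attribute wins if one has been set.
    explicit = attributes.get("BEOS:TYPE")
    if explicit:
        return explicit
    # 2. Otherwise guess from the file name extension.
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed:
        return guessed
    # 3. Otherwise compare the first bytes against known signatures.
    for signature, mime in MAGIC_SIGNATURES:
        if head.startswith(signature):
            return mime
    # Nothing matched: fall back to the generic binary type.
    return "application/octet-stream"


print(determine_type({"BEOS:TYPE": "text/html"}, "notes.txt"))
print(determine_type({}, "report", b"%PDF-1.4"))
```

Because the attribute is consulted first, renaming a file does not change
its type once the attribute has been set.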

So, yes, it's a good idea particularly when the rest of the OS
actually uses it.

- Alex


I think I might be willing to do it in Reiser4, if you are willing to do 
the work of creating a consensus that it should be noticed and set in 
user land.

You would want to contact Miguel, Martin, and Richard.  Others probably 
also.  You should be willing to write 30 itty bitty patches to user land 
things yourself.

Hans





Re: [reiserfs-list] magic is useless Determining File Types

2002-01-04 Thread Russell Coker

On Fri, 4 Jan 2002 01:15, Alexander G. M. Smith wrote:
 Same thing for BeOS - floppies are FAT16 format (you can format for
 BFS but with the journal etc, there's 300KB of space for data),
 there's also FAT32 for Windows disk partitions and several other
 file systems.  Some, like Mac HFS support a limited number of
 attributes (just the ones which have a Mac equivalent).  Still,
 they got used by most of the regular applications written for BeOS,
 even if just used to specify the file type.  Though if you used POSIX
 commands (like cp), the attributes would get lost.  ZIP format

If even cp doesn't support it then it's useless.  This is why multiple 
streams were useless on NT: the cmd.exe copy command didn't support 
them (presumably nothing has changed with XP).

 So, if it's available and useful then there's a good chance people
 will use it in new software.

When even the authors of the OS don't support it in their core file copy 
utility then it's not getting used much.

On Fri, 4 Jan 2002 01:48, Jens Benecke wrote:
 Microsoft has these problems with their NTFS attributes. All the office
 type apps and so on were pressed hard to make heavy use of these
 attributes: you can e.g. view author, etc. of a MS-Word file in the file
 properties dialog, or the download URL of a .zip file, just like OS/2 did
 in 1996 :) but apart from that, nobody is really using these features,
 because you still *CAN* install Windows on FAT partitions and there you
 don't have these features.

OS/2 had extended attributes in 1988.  OS/2 had a fully object-oriented 
desktop using EAs in every imaginable way in 1992.  By 1996 OS/2 was 
seriously losing market-share, mind-share, and IBM support.

On Fri, 4 Jan 2002 09:48, Raphael Bosshard wrote:
 The idea of putting the filetype (ie. as MIME) into an additional
 file-attribute is not new and has been done before by various systems,
 including OS/2, BeOS and even Windows. But in these cases, limitations
 of the FAT-Filesystem prevented an adoption of this feature.
 In the Unix environment, it would fail because of standards and laziness;
 most of the file manipulating tools would have to be rewritten or to be
 patched. Right?

 Well, at least it was a nice idea... ;)

I'm not sure it was such a nice idea really.  Mainframes and mini-computers 
had typed files before Unix was invented.  Unix was one of the earlier OSs to 
use strictly non-typed files (a file is just a collection of bytes).  CP/M, 
DOS, etc all just followed that example.

If we're going to experiment with new things, then how about indexed files 
managed by the file system, which allow hardware devices such as EMC 
machines to do the database operations?  This is why an IBM zSystem running 
OS/390 will beat almost anything for bulk IO while the same zSystem running 
Linux will apparently give poor IO performance.

-- 
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/   Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/ My home page