Re: GFS, what's remaining

2005-09-05 Thread Kurt Hackel
On Mon, Sep 05, 2005 at 10:24:03PM +0200, Bernd Eckenfels wrote:
> On Mon, Sep 05, 2005 at 04:16:31PM +0200, Lars Marowsky-Bree wrote:
> > That is the whole point why OCFS exists ;-)
> 
> The whole point of the Oracle cluster filesystem as it was described in old
> papers was about pfiles, control files and software, because you can easily
> use direct block access (with ASM) for tablespaces.

The original OCFS was intended for use with pfiles and control files but
very definitely *not* software (the ORACLE_HOME).  It was not remotely
general purpose.  It also predated ASM by about a year or so, and the
two solutions are complementary.  Either one is a good choice for Oracle
datafiles, depending upon your needs.

> > No. Beyond the table spaces, there's also ORACLE_HOME; a cluster
> > benefits in several aspects from a general-purpose SAN-backed CFS.
> 
> Yes, I don't dispute the usefulness of OCFS for ORA_HOME (besides, I think a
> replicated filesystem makes more sense), I am just not sure if anybody sane
> would use it for tablespaces.

Too many to mention here, but let's just say that some of the largest
databases are running Oracle datafiles on top of OCFS1.  Very large
companies with very important data.

> I guess I have to correct the article in my German IT blog :) (if somebody
> can name productive customers).

Yeah you should definitely update your blog ;-)  If you need named
references, we can give you loads of those.

-kurt

Kurt C. Hackel
Oracle
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Linux-cluster] Re: GFS, what's remaining

2005-09-05 Thread kurt.hackel
On Mon, Sep 05, 2005 at 05:24:33PM +0800, David Teigland wrote:
> On Mon, Sep 05, 2005 at 01:54:08AM -0700, Andrew Morton wrote:
> > David Teigland <[EMAIL PROTECTED]> wrote:
> > >
> > >  We export our full dlm API through read/write/poll on a misc device.
> > >
> > 
> > inotify did that for a while, but we ended up going with a straight syscall
> > interface.
> > 
> > How fat is the dlm interface?   ie: how many syscalls would it take?
> 
> Four functions:
>   create_lockspace()
>   release_lockspace()
>   lock()
>   unlock()

FWIW, it looks like we can agree on the core interface.  ocfs2_dlm
exports essentially the same functions:
dlm_register_domain()
dlm_unregister_domain()
dlmlock()
dlmunlock()
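
To make the shape of that four-call interface concrete, here is a minimal
sketch in Python. It is a toy, single-process model and not the gfs dlm or
ocfs2_dlm API: the class ToyDLM, its method names, the owner argument and the
two modes ("shared" and "exclusive") are all assumptions made for
illustration. The real calls take more parameters, and this sketch does no
queuing, callbacks or cross-node messaging; a conflicting request is simply
refused.

```python
class ToyDLM:
    """Single-process stand-in for a lock manager (hypothetical, not a kernel API)."""

    def __init__(self):
        # domain name -> {resource name -> list of (owner, mode) holders}
        self._domains = {}

    def register_domain(self, name):
        """Create the domain (lockspace) if needed and return its handle."""
        self._domains.setdefault(name, {})
        return name

    def unregister_domain(self, name):
        """Drop the domain; refuse while any lock in it is still held."""
        resources = self._domains.get(name, {})
        if any(resources.values()):
            raise RuntimeError("domain still has held locks")
        self._domains.pop(name, None)

    def lock(self, domain, resource, owner, mode):
        """Try to grant `mode` on `resource`; return True if granted."""
        if mode not in ("shared", "exclusive"):
            raise ValueError("unknown mode: %r" % (mode,))
        holders = self._domains[domain].setdefault(resource, [])
        # Shared locks coexist; an exclusive lock excludes everything else.
        conflict = holders and (
            mode == "exclusive" or any(m == "exclusive" for _, m in holders)
        )
        if conflict:
            return False
        holders.append((owner, mode))
        return True

    def unlock(self, domain, resource, owner):
        """Release whatever `owner` holds on `resource`."""
        holders = self._domains[domain].get(resource, [])
        holders[:] = [(o, m) for o, m in holders if o != owner]


if __name__ == "__main__":
    dlm = ToyDLM()
    dom = dlm.register_domain("demo")
    print(dlm.lock(dom, "inode-1", "nodeA", "shared"))
    print(dlm.lock(dom, "inode-1", "nodeB", "exclusive"))
    dlm.unlock(dom, "inode-1", "nodeA")
    dlm.unregister_domain(dom)
```

The point is only that domain setup/teardown plus lock/unlock is enough to
express the core; everything else is parameters on those calls.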

I also implemented dlm_migrate_lockres() to explicitly remaster a lock
on another node, but this isn't used by any callers today (except for
debugging purposes).  There is also some wiring between the fs and the
dlm (eviction callbacks) to deal with some ordering issues between the
two layers, but these could go if we get stronger membership.

There are quite a few other functions in the "full" spec(1) that we
didn't even attempt, either because we didn't require direct 
user<->kernel access or we just didn't need the function.  As for the
rather thick set of parameters expected in dlm calls, we managed to get
dlmlock down to *ahem* eight, and the rest are fairly slim.

Looking at the misc device that gfs uses, it seems like there is a pretty
much complete interface to the same calls you have in kernel, validated
on the write() calls to the misc device.  With dlmfs, we were seeking to
lock down and simplify user access by using standard ast/bast/unlockast
calls, using a file descriptor as an opaque token for a single lock,
letting the vfs lifetime on this fd help with abnormal termination, etc.
I think both the misc device and dlmfs are helpful and not necessarily
mutually exclusive, and probably both are better approaches than
exporting everything via loads of syscalls (which seems to be the 
VMS/opendlm model).
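
As a rough analogy only, the "file descriptor as an opaque token" idea can be
sketched with ordinary flock(2) on a local temporary file. This is not dlmfs
and not the gfs misc device; it is a swapped-in, single-machine mechanism, and
the helper names try_lock and release are made up for the example. What it
shows is that the open descriptor stands for the lock, and that closing it
(which the kernel also does when a process dies) releases the lock without any
explicit unlock call.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Open `path` and try to take an exclusive flock without blocking.

    Returns the open fd, which acts as the token for the lock, or None if
    the lock is already held through another open of the same file.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Lock is held elsewhere; give the descriptor back and report failure.
        os.close(fd)
        return None
    return fd


def release(fd):
    """Closing the descriptor drops the lock; there is no separate unlock."""
    os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "lock-demo")
        first = try_lock(path)
        print("first holder got the lock:", first is not None)
        print("second attempt refused:", try_lock(path) is None)
        release(first)
        second = try_lock(path)
        print("granted after close:", second is not None)
        release(second)
```

Tying the lock to the descriptor's lifetime is what lets the filesystem layer
clean up after a holder that exits without unlocking.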

-kurt

1. http://opendlm.sourceforge.net/cvsmirror/opendlm/docs/dlmbook_final.pdf


Kurt C. Hackel
Oracle
[EMAIL PROTECTED]