Can anyone review the code? It is in src/lib/libperfuse/ops.c
(search for calls to no_access)
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
way?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
On NetBSD-5.1, when an FFS filesystem has extended attributes, any call
to statfs(2) will never return from the kernel. ps -axl shows the
process is sleeping at tstile.
Is it a known problem, is there already a PR for it?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
abc557 ]
cylgrp dynamic inodes 4.4BSD sblock FFSv2 fslevel 4
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
S2 but FFS1 (as obtained by newfs
without -O) vs FFS v2 (as obtained by newfs -O 2). Indeed extattrctl
start says Operation not supported on a FFS v2 filesystem.
If extended attributes are known to be broken in FFS v1 and are
unsupported in FFS v2, there is little room for actual usage.
--
Em
k with FFS v1 but exhibit the neverending sleep
in tstile when statvfs() is called. It seems extended attributes are not
supported at all on FFS v2.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
On Sat, May 07, 2011 at 09:28:23PM +0200, Emmanuel Dreyfus wrote:
> On NetBSD-5.1, when an FFS filesystem has extended attributes, any call
> to statfs(2) will never return from the kernel. ps -axl shows the
> process is sleeping at tstile.
>
> Is it a known problem, is there alre
void)chkiq(ip, -1, NOCRED, 0);
#endif
-#ifdef UFS_EXTATTR
- ufs_extattr_vnode_inactive(vp, curlwp);
-#endif
if (ip->i_size != 0) {
/*
* When journaling, only truncate one indirect block
* at a time
--
Emmanuel Dreyfus
m...@netbsd.org
kind of thing.
I need SOCK_DGRAM here because packet exchange needs to be atomic (see
CVS log for src/usr.sbin/perfused/msg.c 1.5-1.6). I considered
SOCK_SEQPACKET, but it is broken enough to be unusable.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
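The atomicity argument above can be illustrated with a small userland sketch. This is not the perfused code; the helper name first_datagram_len() is made up for the example, and it only assumes the standard socketpair(2), send(2) and recv(2) calls on a local socket.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Illustration only: send two messages back to back on a local
 * SOCK_DGRAM socket pair and return the size of the first recv().
 * Returns -1 on any error.
 */
ssize_t
first_datagram_len(void)
{
    int sv[2];
    char buf[64];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    if (send(sv[0], "hello", 5, 0) == 5 &&
        send(sv[0], "world!", 6, 0) == 6) {
        /* Each recv() returns exactly one datagram, never a mix. */
        n = recv(sv[1], buf, sizeof(buf), 0);
    }

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the first recv() yields the 5 bytes of the first message only. With SOCK_STREAM the same call could return anything up to 11 bytes, which is why a stream socket would need extra framing.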
in kernel because the operation was
sent but never got a reply. I have not been able to build a test case
for that bug since I have trouble reproducing it at will for now.
--
Emmanuel Dreyfus
m...@netbsd.org
--- src/sys/kern/uipc_proto.c.orig 2008-04-27 15:16:58.0 +0200
+++ sr
ch an option be the default for
SOCK_STREAM and SOCK_SEQPACKET?
--
Emmanuel Dreyfus
m...@netbsd.org
his is possible.
--
Emmanuel Dreyfus
m...@netbsd.org
fixed stub that way?
I did but I would prefer to avoid that, since there can be many write()
calls in the sources.
--
Emmanuel Dreyfus
m...@netbsd.org
storage creation.
Such a behavior could be triggered by a new kernel option such
as UFS_EXTATTR_AUTOCREATE. It could hold the default size for autocreated
attributes. (e.g.: options UFS_EXTATTR_AUTOCREATE=1024 to get 1024 bytes
long attributes).
Opinions?
--
Emmanuel Dreyfus
m...@netbsd.org
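A rough sketch of how a valued kernel option like the one proposed above could be consumed in C. The helper extattr_autocreate_size() is hypothetical and is not taken from the NetBSD sources; the only thing borrowed from the proposal is the option name and the idea that its value is the default attribute size.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not NetBSD code: with
 * "options UFS_EXTATTR_AUTOCREATE=1024" in the kernel config, the
 * macro would carry the default size; without it, autocreation is off.
 */
size_t
extattr_autocreate_size(void)
{
#ifdef UFS_EXTATTR_AUTOCREATE
    return (size_t)UFS_EXTATTR_AUTOCREATE;
#else
    return 0;   /* option not set: no automatic backing store */
#endif
}
```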
On Tue, Jun 07, 2011 at 08:07:02AM +0000, Emmanuel Dreyfus wrote:
[autocreate extended attribute backend]
> Such a behavior could be triggered by a new kernel option such
> as UFS_EXTATTR_AUTOCREATE. It could hold the default size for autocreated
> attributes. (e.g.: options UFS_EXTATTR_A
sctl in a next commit round.
I will remove #ifdef UFS_EXTATTR_AUTOCREATE so that the code is always
built, if this is considered better.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
nd have
a rc.d script enabling it later during the boot (after fsck).
Opinions?
--
Emmanuel Dreyfus
m...@netbsd.org
" panic? If it's the
> second, can the extended attributes be turned off at that point?
I experienced various types of crashes.
--
Emmanuel Dreyfus
m...@netbsd.org
}
}
if (nd.ni_dvp != NULL) {
KASSERT(VOP_ISLOCKED(nd.ni_dvp) == LK_EXCLUSIVE);
vput(nd.ni_dvp);
}
if (root_vp != NULL) {
KASSERT(VOP_ISLOCKED(root_vp) == 0);
vrele(root_vp);
}
return uele;
}
--
Emmanuel Dreyfus
m...@netbsd.org
y are not likely to have any problem.
On the other hand, adding an attribute grows the attribute backing
store, so I imagine it is much more susceptible to corruption.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
aid to be
"native". I suspect it is indeed more robust. I would be happy if
someone is ready to import that code.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> The patch I posted yesterday has a race condition, if a user process
> quickly sets two attributes on two different filesystems, then the second
> one will panic on VFS_ROOT() in namei() because the root vnode is already
> locked.
That analysis of the
can happen to the
root vnode of a filesystem that would ruin my day here? At least I
assume it cannot be removed.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
point/.attribute/system/attribute2
...
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> I suggest that if we make extended attribute the default, we disabled
> autostart in single user mode.
The complete proposal:
- extattrctl start will automatically enable the attribute (no need to
extattrctl enable each attribute). I do this by changing th
Emmanuel Dreyfus wrote:
> sys_extattr_set_file
> extattr_set_vp
> vn_lock
> VOP_SETEXTATTR
> ufs_setextattr
> ufs_extattr_set
> ufs_extattr_autocreate_attr
> *
le bug somewhere in extended attribute autostart. Therefore
conditioning attribute autoload on the use of a mount option such as
-o extattr (see my other post) seems highly desirable.
--
Emmanuel Dreyfus
m...@netbsd.org
a smart way, or should I
do it for all ports?
--
Emmanuel Dreyfus
m...@netbsd.org
attr can be useful, as it
makes the thing a bit more usable.
--
Emmanuel Dreyfus
m...@netbsd.org
be be useful, as it
> > makes the thing a bit more usable.
>
> I know you didn't make the problems here -- you just inherited them. Thanks
> for working on this.
Well, if we want a robust EA implementation, I understand we must look
for a UFS2 import.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
code from FreeBSD.
Opinions?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
t and set. It cannot be done
properly in libc and really needs to be done in kernel. Moreover, the
[fl](get|set|list|rm)xattr system calls have been available for a while
in the NetBSD kernel, so this was a minimal change.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
On Wed, Jun 29, 2011 at 04:55:30AM +0200, Emmanuel Dreyfus wrote:
> perhaps this is just a bug, since the documentation does not tell about
> this prepended one byte length. The funny thing is that glusterfs has a
> code section for FreeBSD that uses extattr_list_link(), but it assu
On Wed, Jun 29, 2011 at 09:06:13AM +0000, Emmanuel Dreyfus wrote:
> It is easy to do the conversion so that our Linux-like listexattr() returns
> NUL-separated strings, while our FreeBSD-like extattr_list_file() returns
> one-byte length prefixed strings. The question is what should
David Laight wrote:
> Does libc need to use its own buffer - or would that force it to allocate
> a temporary buffer?
We can avoid the allocation by using memmove(): the data has the same
length in both formats.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
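The in-place conversion suggested above can be sketched as follows. This is a userland illustration under stated assumptions, not the libc or kernel code: the function names are made up, and the buffer is assumed to hold only complete entries. Both formats spend exactly one extra byte per name (a leading length byte or a trailing NUL), so the buffer size never changes.

```c
#include <stddef.h>
#include <string.h>

/*
 * Convert one-byte length prefixed names ("\3foo\3bar") to
 * NUL-terminated names ("foo\0bar\0") in place.
 * Returns 0 on success, -1 if an entry overruns the buffer
 * (the buffer may then be partially converted).
 */
int
lenprefixed_to_nul(char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        size_t n = (unsigned char)buf[i];

        if (n > len - i - 1)
            return -1;
        memmove(buf + i, buf + i + 1, n);   /* shift name left by one */
        buf[i + n] = '\0';                  /* old last byte becomes NUL */
        i += n + 1;
    }
    return 0;
}

/*
 * Reverse conversion. Returns -1 if a name is not NUL-terminated
 * or is longer than the 255 bytes a one-byte length can express.
 */
int
nul_to_lenprefixed(char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        char *end = memchr(buf + i, '\0', len - i);
        size_t n;

        if (end == NULL)
            return -1;
        n = (size_t)(end - (buf + i));
        if (n > 255)
            return -1;
        memmove(buf + i + 1, buf + i, n);   /* shift name right by one */
        buf[i] = (char)n;                   /* NUL slot becomes length */
        i += n + 1;
    }
    return 0;
}
```

Note that the 255-byte limit applies to attribute names in the length-prefixed list, not to attribute values.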
OP_LISTEXTATTR
means a huge pull up.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
mes list is returned as one-byte length prefixed, non NUL
terminated strings
- Attribute value cannot be bigger than 255 bytes
- Used in FreeBSD
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
to build and run natively a program
that was written for Linux. If you do not have it, you have to patch the
program to add API conversion functions.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
[Linux flavor]
> - Attribute values cannot contain a NUL character
[FreeBSD flavor]
> - Attribute value cannot be bigger than 255 bytes
This is nonsense posted too early in the morning :-) The attribute value
can contain random binary data in both APIs.
Att
g a third one and
always convert to the two others.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
the kernel.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
/lsxattr_vnode.patch
Please review
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
never been usable
in a NetBSD release, I suggest doing it right now, as it will not
cause backward incompatibility. Doing it after a release where trusted
and security are mapped as system would need administrator intervention
to fix things.
Opinions?
--
Emmanuel Dreyfus
m...@netbsd.org
ed
$ uname
NetBSD
$ mkdir i386
$ ln -s i386 inst.xxx
$ ln inst.xxx machine
$ ls -ld machine
lrwxrwxrwx 2 manu manu 4 jui 29 10:03 machine -> i386
$ uname
Linux
--
Emmanuel Dreyfus
m...@netbsd.org
_link(2),
braindead_link(2) or whatever.
--
Emmanuel Dreyfus
m...@netbsd.org
lid A B X Y
1135709 -rw-rw-r-- 1 manu manu 0 jui 29 20:40 A
1135711 -rw-rw-r-- 1 manu manu 0 jui 29 20:40 B
1135719 lrwxrwxrwx 1 manu manu 1 jui 29 20:41 X -> B
1135717 lrwxrwxrwx 1 manu manu 1 jui 29 20:40 Y -> A
--
Emmanuel Dreyfus
m...@netbsd.org
tell what benefit it would have: it
would ease porting from Linux.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
e in the way glusterfs
stores symlinks in its distributed and replicated setup. I suspect it
may involve treating such objects like directories, and have them
duplicated on all servers. An alternative would be to sacrifice the
guarantee that symlinks are available during a rename, at least for
NetBS
Joerg Sonnenberger wrote:
> Given the very small number of programs that manage to mess up the
> symlink usage, I'm kind of opposed to providing another system call just
> as work around for them.
You did not explain what problems it would introduce, did you?
--
Emmanue
xr-xr-x 2 root wheel  4 Aug 1 11:31 hsfile -> file
6 lrwxr-xr-x 2 root wheel 11 Aug 1 11:32 hvoid -> nonexistent
5 lrwxr-xr-x 2 root wheel  3 Aug 1 11:31 sdir -> dir
4 lrwxr-xr-x 2 root wheel  4 Aug 1 11:31 sfile -> file
6 lrwxr-xr-x 2 root wheel 11 Au
does not mean we have to be wrong.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
D as well if it is considered desirable.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Christos Zoulas wrote:
> Except for the ktruser() call, looks good to me (my personal opinion).
Um, yes, that one was another pending patch I had for later. For now
ktrace does not show symlink(2) targets, which is annoying: sometimes
you cannot tell what is going on.
--
Emmanuel Dreyfus
h
to support something else than Linux, but
I have no idea when, therefore it is not wise to hold our breath on it.
llink(2) is a simple change, FreeBSD already went there with linkat(2),
and it makes everything simple.
--
Emmanuel Dreyfus
m...@netbsd.org
andard system call.
Here is the specification. I will change llink to linkat and commit
shortly.
http://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html
--
Emmanuel Dreyfus
m...@netbsd.org
?
--
Emmanuel Dreyfus
m...@netbsd.org
e values other than AT_FDCWD.
Then do the full Extended API set 2.
--
Emmanuel Dreyfus
m...@netbsd.org
It is much more code, since it happens on the client, which sends
filesystem operations to lower layers and regain control later using
callbacks. Have a look at the sources (xlator/cluster/dht/dht-rename.c)
and you will see why it is complex.
--
Emmanuel Dreyfus
m...@netbsd.org
its original form because it did horrible things instead of
> interfacing semi-sanely to namei.
But that does not answer the original question: is it sane to
ifdef _NETBSD_SOURCE a partial implementation of linkat(), while the
full thing is not yet ready?
--
Emmanuel Dreyfus
m...@netbsd.org
t is complex.
>
> Where does that path live? glusterfs source?
Yes, get it from here:
http://download.gluster.com/pub/gluster/glusterfs/3.2/3.2.2/
glusterfs-3.2.2.tar.gz
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> But that does not answer the original question: it is sane to
> ifdef _NETBSD_SOURCE a partial implementation of linkat(), while the
> full thing is not yet ready.
I can answer myself: through config.h -> float.h -> whatever,
_NETBSD_SOURCE gets defi
would see our weird
netbsd_user and netbsd_system namespaces. A possible fix is to have a
libc stub, unused by the FreeBSD API, that rewrites them to system and
user, e.g.: "netbsd_system.foo" becomes "system.foo"
The answer to your question is therefore yes, but it may introduce
confusion.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
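The rewriting stub mentioned above could look roughly like this. It is a sketch only: the function name extattr_strip_netbsd_prefix() is invented for the example, and it assumes the only names to rewrite are those in the netbsd_system and netbsd_user namespaces.

```c
#include <string.h>

/*
 * Hypothetical libc-side stub: map "netbsd_system.foo" to "system.foo"
 * and "netbsd_user.foo" to "user.foo". Any other name is returned
 * unchanged. The result points into the caller's string.
 */
const char *
extattr_strip_netbsd_prefix(const char *name)
{
    static const char prefix[] = "netbsd_";
    const char *rest;

    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return name;
    rest = name + sizeof(prefix) - 1;
    if (strncmp(rest, "system.", 7) == 0 || strncmp(rest, "user.", 5) == 0)
        return rest;
    return name;
}
```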
names are at odds
with usual Unix practice.
What we can do is to deprecate the FreeBSD API, and change our code
in-tree to use the Linux API. I think moving things to userland will do
more harm than good.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
with a page, and
fill it with zeros so that it can be reused. Does anyone know the VM subsystem
well enough to tell me where the page is cleared with zeros?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
omething like this: it occurs at offsets as low as 16384,
and there is always valid data after the zeroed chunk.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
I prevent flushvncache() from executing in
src/sys/fs/puffs/puffs_vnops.c, the problem disappears. Note that I still
get my data written to the filesystem since I have set
PUFFS_KFLAG_WTCACHE (WT probably stands for Write Through)
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> That is not something like this: it occurs at offsets as low as 16384,
> and there is always valid data after the zeroed chunk.
I think I found the culprit: VOP_FSYNC causes a call to dosetattr, which
sends the SETATTR message. I never noticed that the file si
Mouse wrote:
> Personally? I'd say that scheduling it should be done by userland, but
> that putting the actual removal in the filesystem makes sense.
Yes, a TTL attribute on an inode: once it expires, the filesystem tosses
the file on next access attempt.
--
Emmanuel D
g enough.
The idea was to avoid a daemon that periodically inspects file timeouts.
You just have to check for expiration on access. If expired, remove the
file and answer ENOENT.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
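The check-on-access idea above can be sketched in a few lines. This is a toy model, not filesystem code: struct ttl_node and ttl_check_access() are hypothetical names, and the "removal" is reduced to setting a flag where a real filesystem would unlink the file.

```c
#include <errno.h>
#include <time.h>

/* Toy stand-in for an inode carrying a TTL attribute. */
struct ttl_node {
    time_t expires;   /* absolute expiry time, 0 means "no TTL set" */
    int    removed;   /* set once the node has been tossed */
};

/*
 * Called on every access attempt. Returns 0 if the node may be
 * used, or ENOENT if it has expired (and is now gone).
 */
int
ttl_check_access(struct ttl_node *node, time_t now)
{
    if (node->removed)
        return ENOENT;
    if (node->expires != 0 && now >= node->expires) {
        node->removed = 1;   /* a real filesystem would unlink here */
        return ENOENT;
    }
    return 0;
}
```

No periodic scan is needed with this approach, but an expired file keeps using space until something tries to access it.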
The idea is to have temporary files that vanish on
their own, right?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
DEBUG ?
--
Emmanuel Dreyfus
m...@netbsd.org
, but readdir should also kill
it. This is probably too complicated.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
it should not be committed?
--
Emmanuel Dreyfus
m...@netbsd.org
Emmanuel Dreyfus wrote:
> > i found the following patch in my tree.
> > unfortunately i forgot details and if there were more cases which needs
> > similar barriers.
> This fixes the problem with no detectable performance hit. Is there
> any reason why it should not
), but here it does not,
there is a single client running the test and no other activity.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
49, result: 0
resid after op: 0
RV reqid: 2450, result: 0
resid after op: 0
RV reqid: 2451, result: 0
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> Here is it below. The zeroed chunk is 2404 bytes at offset 0.
>
> As I understand, the problem is again an asynchronous SETATTR followed by a
> WRITE (reqid 2449).
That was without your patch. Another attempt with your patch gives this, with
zeroed c
On Wed, Aug 17, 2011 at 07:09:34AM +0200, Emmanuel Dreyfus wrote:
> I will investigate further.
Here are the findings. I run with yamt's patch, and without
PUFFS_KFLAG_WTCACHE. zeroed chunks will randomly appear, and it seems
this is not restricted to offset zero after all.
Attached
On Wed, Aug 17, 2011 at 10:15:16AM +0000, Emmanuel Dreyfus wrote:
> Here are the findings. I run with yamt's patch, and without
> PUFFS_KFLAG_WTCACHE. zeroed chunks will randomly appear, and it seems
> this is not restricted to offset zero after all.
And one last but important poin
On Wed, Aug 17, 2011 at 12:33:46PM +0000, Emmanuel Dreyfus wrote:
> On Wed, Aug 17, 2011 at 10:15:16AM +0000, Emmanuel Dreyfus wrote:
> > Here are the findings. I run with yamt's patch, and without
> > PUFFS_KFLAG_WTCACHE. zeroed chunks will randomly appear, and it s
On Wed, Aug 17, 2011 at 03:24:25PM +0000, Emmanuel Dreyfus wrote:
> I backported this change, and the problem occurs much less often,
> and now always at offset 0. Do we have two different bugs?
> http://mail-index.netbsd.org/source-changes/2009/11/05/msg002710.html
Here is a trace belo
GETATTR (3), error = 0
RV reqid: 929, result: 0
FUSE< unique = 890, nodeid = 3101688548, opcode = SETATTR (4), error = 0
RV reqid: 928, result: 0
> puffs protocol lacks request serializations.
The big question is: can that be fixed?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
: 0
There the data that was never written at offset 787 has been read. It cannot
contain anything else than zeros. I guess this is how the zeroed chunk comes
along.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> Right, it sets the size to 787 but nothing has been written. It seems this was
> triggered by a sync, as seen below.
We have a sync without data written. It seems that in pageflush(), I
have UVM_OBJ_IS_CLEAN(&vp->v_uobj)
My test case iterates creating and
Hello
When copying data to a vnode using ubc_alloc() ubc_uiomove() and
ubc_release(), where does the kernel note that data is dirty and will
need to be written by VOP_PUTPAGE? Is it the responsibility of the
caller to request it (how?), or is it done somewhere in UBC?
--
Emmanuel Dreyfus
http
be reloaded from filesystem using
GETPAGE. The filesystem hands us what it has for the data that was
never previously written: a chunk of zeroes.
--
Emmanuel Dreyfus
m...@netbsd.org
On Fri, Aug 19, 2011 at 01:54:28PM +0000, Emmanuel Dreyfus wrote:
> Here is my complete analysis of the problem:
(snip)
And here is a fix in puffs_vnop_getattr(). It was tested on netbsd-5 only so
far since the bug does not appear in -current. My test case does not exhibit
the bug anymore
and filesystem view of file size are not
atomic. Getting rid of the FAF for SETATTR will not prevent a race with
another thread that tests or sets file size.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
or struct pnode's pn_stat, set and
cleared in dosetattr(), and use vp->v_size on uvm_vnp_setsize() calls
when set? Or just avoid uvm_vnp_setsize() calls?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
YAMAMOTO Takashi wrote:
> uvm_vnp_setsize merely changes the kernel's idea of the size of the file.
It also triggers a file truncate if it discovers that the size set is
smaller than previous kernel idea.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> Or just avoid uvm_vnp_setsize() calls?
I wonder if that does not open the door to situations where fsync
semantics get broken because of a skipped uvm_vnp_setsize().
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
or should I introduce a new one, for instance pn_inrewrite in
struct pnode?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
ead2 VOP_PUTPAGES called (with mutex held)
thread2 VOP_PUTPAGES returns (releases mutex)
thread2 puffs_vnop_fsync returns
That seems to work.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
t a known problem?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
fs_vfsop_getattr ("zero-filed page on VOP_PUTPAGES"). Now that it uses
synchronous PUFFS operations, I wonder if there could be a deadlock between
ioflush and perfused/glusterfsd
sleepq_block
cv_wait_sig
puffs_msg_wait
puffs_vfsop_sync
sync_fsync
VOP_FSYNC
sched_sync
--
Emmanuel Dreyfus
ht
nop_fsync/flushvncache/dosetattr synchronous to fix the race with
puffs_vnop_getattr. This one is really getting tricky, as we have no way
to know when SETATTR from puffs_vnop_fsync/ completes.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
etsize() from
puffs_parkdone_setattr(), but that will open another can of worms with
stale size and spurious truncate.
--
Emmanuel Dreyfus
m...@netbsd.org
? sys/fs/puffs/puffs_vnops.c-debug
? sys/fs/puffs/puffs_vnops.c.ok1
? sys/fs/puffs/puffs_vnops.c.ok2
Index: sys/fs/puffs
Emmanuel Dreyfus wrote:
> That code passes my test case for data corruption, and it does not hang
> the kernel.
I spoke too fast, it is still able to deadlock. Here is ioflush
backtrace, while it is sleeping on km_getwait2
sleepq_block
mtsleep
uvm_wait
uvm_km_alloc
kmem_backend
Emmanuel Dreyfus wrote:
> > That code passes my test case for data corruption, and it does not hang
> > the kernel.
> I spoke too fast, it is still able to deadlock.
I identified simple situations where the ioflush thread would enter
puffs_vnop_strategy and fail to request as
ile was shrunk by another client. I think this is
better than deadlocks and data corruption.
If nobody has a better idea, I will go with this simple fix.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org