Hi Elena & Dana,
Shouldn't disabling file locking be a runtime mechanism instead? What
if you want to use the same binary on the same hardware with different
file system configurations, or on the same hardware writing to
different file systems, or if the sysadmin changes their mind on a daily
basis about enabling or disabling file locking?
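To make the runtime idea concrete, here is a minimal sketch in C. It is not
HDF5 code: the environment variable name APP_USE_FILE_LOCKING and the two
function names are made up for illustration, and the only real call assumed
is POSIX/BSD flock().

```c
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>

/* Hypothetical variable name, chosen for this sketch only. */
#define LOCK_ENV_VAR "APP_USE_FILE_LOCKING"

/* Interpret the setting: "0" or "FALSE" disables locking;
 * an unset variable or any other value leaves it enabled. */
int locking_wanted(const char *value)
{
    if (value == NULL)
        return 1;
    return !(strcmp(value, "0") == 0 || strcmp(value, "FALSE") == 0);
}

/* Take an exclusive, non-blocking lock unless locking was switched
 * off at run time. Returns 0 on success or when the lock is skipped,
 * and -1 with errno set when flock() itself fails. */
int maybe_lock(int fd)
{
    if (!locking_wanted(getenv(LOCK_ENV_VAR)))
        return 0;
    return flock(fd, LOCK_EX | LOCK_NB);
}
```

With something like this, the same binary could be run with
APP_USE_FILE_LOCKING=0 on a file system without flock() support and with
the variable unset everywhere else.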
Werner
On 16.05.2016 18:03, Dana Robinson wrote:
If a suitable way to lock files cannot be determined at configure time, a no-op
function is substituted. This is currently the case on Windows. File locking is
just advisory, so this isn't a big deal.
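As a rough illustration of the configure-time substitution described above,
the sketch below selects between a real lock and a no-op with a preprocessor
macro. The macro name HAVE_FLOCK and the function names are assumptions made
for this example and are not taken from the HDF5 sources.

```c
/* HAVE_FLOCK is assumed to be defined by the build system when a
 * usable flock() was found at configure time (hypothetical name). */
#ifdef HAVE_FLOCK
#include <sys/file.h>
#endif

/* Reports whether a real lock call was compiled into this binary. */
int locking_compiled_in(void)
{
#ifdef HAVE_FLOCK
    return 1;
#else
    return 0;
#endif
}

/* Returns 0 on success, -1 with errno set on failure. Without
 * HAVE_FLOCK this is a no-op that always reports success, which is
 * tolerable only because the lock is advisory. */
int lock_file(int fd)
{
#ifdef HAVE_FLOCK
    return flock(fd, LOCK_EX | LOCK_NB);
#else
    (void)fd; /* nothing to lock with */
    return 0;
#endif
}
```

The choice is fixed when the library is built, so a binary built with
HAVE_FLOCK would still fail on a mount where flock() returns an error.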
As for disabling file locking, we talked about this and will try to get a
configure-time mechanism for disabling file locking implemented for HDF5 1.10.1.
Dana Robinson
Software Engineer
The HDF Group
-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of
Elena Pourmal
Sent: Sunday, May 15, 2016 9:01 PM
To: HDF Users Discussion List <[email protected]>
Subject: Re: [Hdf-forum] HDF5-1.10.0 and flock()
Hi Tim,
On May 13, 2016, at 10:55 AM, Timothy Brown <[email protected]>
wrote:
Hi all,
I was wondering if HDF5 was going to keep the 1.8.x branch going? Or is it
recommended to move to 1.10.x?
Yes, we will keep 1.8 going until we are satisfied with the quality of 1.10.x.
Transition from 1.8 to 1.10 should be seamless for our users :-)
I'm asking because, as we all know, SWMR needs flock() and you cannot
disable SWMR at compile time (I don't need it in my day-to-day use).
Hmm... HDF5 implements file locking in 1.10.x to prevent unauthorized access to
an HDF5 file (for example, the file is opened for writing (non-SWMR) and another
process tries to write to it). File locking is enabled if flock (or similar)
is available on the system. Configure checks whether file locking is available,
but I think we failed to check whether it is disabled. We will take a look into
this situation.
Thank you for reporting!
Elena
On one of the clusters I run on we've got a Lustre file system. However, the
admins have deemed file locking too expensive and have disabled it.
Here's the mount information:
mds01ib@o2ib1:mds02ib@o2ib1:/scratch on /lustre/janus_scratch type lustre (rw,noauto,_netdev)
So when I run a very simple test to create an HDF5 file with version 1.10.0 on
this file system, it fails:
janus-compile1 ~$ ./test /lustre/janus_scratch/tibr1099/foo.h5
HDF5-DIAG: Error detected in HDF5 (1.10.0) thread 0:
  #000: H5F.c line 491 in H5Fcreate(): unable to create file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed
    major: Virtual File Layer
    minor: Can't update object
  #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file, errno = 38, error message = 'Function not implemented'
    major: File accessibilty
    minor: Bad file ID accessed
Unable to open: /lustre/janus_scratch/tibr1099/foo.h5: -1
1
When I strace the program I see it's because flock() failed:
open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3) = 0
open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
flock(3, LOCK_EX|LOCK_NB) = -1 ENOSYS (Function not implemented)
close(3) = 0
Versus if I trace the program with version 1.8.15:
open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3) = 0
open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
brk(0x235a000) = 0x235a000
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f17252b8000
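For anyone who wants to check a mount without building HDF5, the failing call
in the 1.10.0 trace can be reproduced with a few lines of C. This is only a
sketch: the function name probe_flock is made up, and it assumes a
POSIX system that provides flock() in <sys/file.h>.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Try the same non-blocking exclusive lock seen in the trace.
 * Returns 0 if flock() works on `path`, otherwise the errno reported
 * by open() or flock(). The file is created if it does not exist and
 * is left in place afterwards. */
int probe_flock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd < 0)
        return errno;

    int err = 0;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0)
        err = errno;          /* ENOSYS in the 1.10.0 trace */
    else
        flock(fd, LOCK_UN);   /* release the lock again */

    close(fd);
    return err;
}
```

Called on a file under the scratch mount shown earlier, this would be
expected to return ENOSYS, matching the strace output. A return value of
EWOULDBLOCK would instead mean that locking works but another process
already holds the lock.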
So my long-winded example leads to three questions.
1) Do other HPC sites enable flock() on Lustre? If so, is it only localflock, so
as not to have the burden of a cluster-wide flock?
2) Is there a path forward for sites that don't enable flock?
3) Is there the opposite of H5Fstart_swmr_write?
Thanks!
Tim
<test.f90>
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Center for Computation & Technology at Louisiana State University (CCT/LSU)
2019 Digital Media Center, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362