Re: [one-users] Setting filesystem type for new disk crashes GlusterFS

2015-02-20 Thread Javier Fontan
Sorry, I missed this one.

While searching for your problem I found other people reporting the same
issue with glusterfs on XFS. XFS is also the usual backing storage for
gluster, at least according to most of the howtos.

It's good to know that a newer version solved the problem.

Cheers


Re: [one-users] Setting filesystem type for new disk crashes GlusterFS

2015-01-26 Thread Wilma Hermann
Hi,

> There seems to be a problem with sparse files and glusterfs and/or the
> underlying FS (XFS?).
No, it's only in combination with mkfs. Creating sparse files without a
preset filesystem (i.e. raw images) works perfectly.

By the way: How did you figure out that XFS is used? Is it known to produce
problems?

In addition to your suggestions, I experimented with overriding the
filesystem type setting by setting FSTYPE = "raw" in
/var/lib/one/remotes/tm/shared/mkimage. As far as I tested, that worked as
well. However, we recently upgraded to Ubuntu 14.04 and GlusterFS 3.4.2.
Now the bug seems to be gone.
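
As a rough illustration of the override described above, the sketch below
forces the type to "raw" after the arguments are read. The positional layout
(size, type, destination) is only inferred from the "mkimage 51200 ext3 ..."
call in the log, and parse_args is a made-up helper, not something from the
OpenNebula script. Note that a shell assignment takes no spaces around "=".

```shell
#!/bin/sh
# Sketch only: parse_args is hypothetical and the argument order is inferred
# from the log line "mkimage 51200 ext3 <host>:<path> 346 0".
parse_args() {
    SIZE="$1"      # image size in MiB
    FSTYPE="$2"    # filesystem type requested by the user
    DST="$3"       # destination as host:path

    # Override: ignore the requested type so that mkfs is never needed.
    # A shell assignment must not have spaces around "=".
    FSTYPE="raw"
}

parse_args 51200 ext3 "192.168.128.14:/var/lib/one/datastores/0/346/disk.2"
echo "size=${SIZE}M fstype=$FSTYPE dst=$DST"
```

How the real script treats "raw" is not shown in this thread, so this should
be checked against the installed mkimage before relying on it.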

Thanks anyway!

Greetings
Wilma


Re: [one-users] Setting filesystem type for new disk crashes GlusterFS

2015-01-22 Thread Javier Fontan
There seems to be a problem with sparse files and glusterfs and/or the
underlying FS (XFS?).

You can disable that functionality by adding an "exit 1" command at the
top of these scripts:

* /var/lib/one/remotes/datastore/mkfs
* /var/lib/one/remotes/tm/mkimage
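
A plain "exit 1" refuses every request. A gentler variant, sketched here,
refuses only requests that name a filesystem and lets raw images through.
The function name reject_formatted is invented for this example, and the
assumption that the type arrives as the script's second argument is read off
the mkimage call in the log below, not taken from the scripts themselves.

```shell
#!/bin/sh
# Hypothetical guard, not part of OpenNebula: allow raw images, refuse any
# request that would end up running mkfs on the shared datastore.
reject_formatted() {
    fstype="$1"
    case "$fstype" in
        raw|"")
            return 0 ;;
        *)
            echo "filesystem type '$fstype' is disabled here" >&2
            return 1 ;;
    esac
}

# Demo with the two values that appear in this thread.
for t in raw ext3; do
    if reject_formatted "$t" 2>/dev/null; then
        echo "$t: allowed"
    else
        echo "$t: refused"
    fi
done
```

Near the top of a script this would be used as something like
reject_formatted "$2" || exit 1, again assuming the type is the second
argument.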

Another way of solving this is changing the way the image is created
(so it is not sparse). The drawback is that it will take a lot more
time to create the image. The command to change is 'dd' in those
scripts. For example, for 'mkfs':

exec_and_log "$DD if=/dev/zero of=$DST bs=1 count=1 seek=${SIZE}M" \
"Could not create image $DST"

to

exec_and_log "$DD if=/dev/zero of=$DST bs=1M count=${SIZE}" \
"Could not create image $DST"

Cheers
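
To make the difference between the two dd variants above concrete, here is a
small sketch that runs both on a 4 MiB scratch file and prints the apparent
size next to the space actually allocated. It assumes GNU coreutils (for the
"M" suffixes); SIZE, WORK, SPARSE and FULL are names chosen for the demo.
The allocation figures depend on the filesystem, so none are quoted here.

```shell
#!/bin/sh
# Sketch: compare the sparse and the fully written dd variants.
# Assumes GNU dd; all variable names are illustrative.
set -e
SIZE=4                          # size in MiB, kept small for the demo
WORK=$(mktemp -d)
SPARSE="$WORK/sparse.img"
FULL="$WORK/full.img"

# Sparse: seek SIZE MiB into the file and write a single byte.
# The file ends up SIZE MiB + 1 byte long with almost nothing allocated.
dd if=/dev/zero of="$SPARSE" bs=1 count=1 seek="${SIZE}M" 2>/dev/null

# Non-sparse: write SIZE blocks of 1 MiB, so every block is written out.
dd if=/dev/zero of="$FULL" bs=1M count="$SIZE" 2>/dev/null

SPARSE_BYTES=$(wc -c < "$SPARSE")
FULL_BYTES=$(wc -c < "$FULL")
SPARSE_KB=$(du -k "$SPARSE" | cut -f1)
FULL_KB=$(du -k "$FULL" | cut -f1)

echo "sparse: $SPARSE_BYTES bytes apparent, $SPARSE_KB KiB allocated"
echo "full:   $FULL_BYTES bytes apparent, $FULL_KB KiB allocated"

rm -rf "$WORK"
```

The fully written variant has to push every block through the gluster mount,
which is why it takes so much longer for a 51200M image.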

On Sat, Jan 17, 2015 at 10:43 AM, Wilma Hermann wrote:
> Hi,
>
> Our OpenNebula setup uses GlusterFS to share /var/lib/one among all
> machines. Yesterday a customer created a new volatile disk for a VM. But
> this image creation crashed the gluster client on the host the VM was
> running on. I assume it has something to do with the fact that the customer
> entered 'ext3' as filesystem type.
>
> This isn't the first time this bug occurred; we also had it almost one year
> ago, and there it was also related to the filesystem type of an image. I
> believe that this feature is rarely used by our customers and simply wasn't
> used in the meantime. Now we are using OpenNebula 4.8.0 on Ubuntu 12.04.5
> with glusterfs 3.2.5.
>
> Here's the log of the VM that triggered the crash:
>
> Sat Jan 10 13:24:21 2015 [Z0][VMM][I]: VM successfully rebooted-hard.
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Command execution fail:
> /var/lib/one/remotes/tm/shared/mkimage 51200 ext3
> 192.168.128.14:/var/lib/one//datastores/0/346/disk.2 346 0
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mkimage: Making filesystem of 51200M
> and type ext3 at 192.168.128.14:/var/lib/one//datastores/0/346/disk.2
> Fri Jan 16 17:31:00 2015 [Z0][VMM][E]: mkimage: Command "set -e
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: export PATH=/usr/sbin:/sbin:$PATH
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: dd if=/dev/zero
> of=/var/lib/one/datastores/0/346/disk.2 bs=1 count=1 seek=51200M
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mkfs -t ext3 -F
> /var/lib/one/datastores/0/346/disk.2" failed: Warning: Permanently added
> '192.168.128.14' (ECDSA) to the list of known hosts.
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: 1+0 records in
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: 1+0 records out
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: 1 byte (1 B) copied, 0.000576409 s,
> 1.7 kB/s
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mke2fs 1.42 (29-Nov-2011)
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Warning: could not erase sector 2:
> Attempt to write block to filesystem resulted in short write
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Warning: could not read block 0:
> Attempt to read block from filesystem resulted in short read
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Warning: could not erase sector 0:
> Attempt to write block to filesystem resulted in short write
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: mkfs.ext3: Attempt to write block to
> filesystem resulted in short write while zeroing block 13107184 at end of
> filesystem
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]:
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Could not write 5 blocks in inode
> table starting at 1027: Attempt to write block to filesystem resulted in
> short write
> Fri Jan 16 17:31:00 2015 [Z0][VMM][E]: Could not create image
> /var/lib/one/datastores/0/346/disk.2
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: ExitCode: 1
> Fri Jan 16 17:31:00 2015 [Z0][VMM][I]: Failed to execute transfer manager
> driver operation: tm_attach.
> Fri Jan 16 17:31:00 2015 [Z0][VMM][E]: Error attaching new VM Disk: Could
> not create image /var/lib/one/datastores/0/346/disk.2
>
> After that crash all subsequent operations failed because the frontend was
> unable to log into that particular host (since /var/lib/one was missing and
> passwordless SSH did not work anymore).
>
> I have 2 questions:
> 1) Does anyone have an idea what's going on there?
> 2) Is it possible to disable this filesystem type feature? We don't need it,
> but I would like to prevent these accidental host crashes.
>
> Greetings
> Wilma
>
> ___
> Users mailing list
> Users@lists.opennebula.org
> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>



-- 
Javier Fontán Muiños
Developer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | @OpenNebula | github.com/jfontan