[Gluster-users] Tiering

2015-12-09 Thread Lindsay Mathieson
I see that 3.7 has settings for tiering - from the wording I presume 
hot/cold SSD tiering.


Is this in beta yet? Is it testable? Are there any usage docs yet?

Thanks,

Lindsay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Meeting minutes of Gluster community meeting 2015-12-09

2015-12-09 Thread Atin Mukherjee
Minutes:
http://meetbot.fedoraproject.org/gluster-meeting/2015-12-09/gluster_community_weekly_meeting.2015-12-09-12.00.html
Minutes (text):
http://meetbot.fedoraproject.org/gluster-meeting/2015-12-09/gluster_community_weekly_meeting.2015-12-09-12.00.txt
Log:
http://meetbot.fedoraproject.org/gluster-meeting/2015-12-09/gluster_community_weekly_meeting.2015-12-09-12.00.log.html


Meeting summary
---
* Roll Call  (atinm, 12:01:08)

* AIs from last week  (atinm, 12:03:58)
  * ACTION: ndevos to send out a reminder to the maintainers about more
actively enforcing backports of bugfixes  (atinm, 12:05:23)
  * ACTION: raghu to call for volunteers and help from maintainers for
doing backports listed by rwareing to 3.6.8  (atinm, 12:06:52)
  * bug triage meeting doodle poll result to be announced on December
22, need more votes  (atinm, 12:09:10)
  * agenda is right here
https://public.pad.fsfe.org/p/gluster-community-meetings  (atinm,
12:09:49)
  * ACTION: rastar and msvbhat to publish a test exit criterion for
major/minor releases on gluster.org  (atinm, 12:10:39)
  * ACTION: kshlm & csim to set up faux/pseudo user email for gerrit,
bugzilla,  github  (atinm, 12:11:24)
  * ACTION: hagarth to decide on 3.7.7 release manager  (atinm,
12:14:12)
  * ACTION: amye to get on top of discussion on long-term releases.
(atinm, 12:15:28)
  * ACTION: hagarth to post Gluster Monthly News this week  (atinm,
12:17:47)

* GlusterFS 3.7  (atinm, 12:18:36)

* GlusterFS 3.6  (atinm, 12:19:49)
  * raghu to create 3.6.8 tracker  (atinm, 12:20:27)
  * ACTION: hagarth to create 3.6.8 for bugzilla version  (atinm,
12:21:24)
  * ACTION: community needs to find out 3.6.8 release manager  (atinm,
12:23:38)
  * ACTION: raghu to ask for volunteers for release manager for 3.6.8
(atinm, 12:24:22)

* GlusterFS 3.8  (atinm, 12:25:16)

* GlusterFS 4.0  (atinm, 12:27:13)
  * 3.8 feature freeze to happen in mid-to-late Jan 2016  (atinm,
12:31:36)
  * ACTION: kkeithley_ to send a mail about using sanity checker tools
in the codebase  (atinm, 12:32:47)
  * Another follow up meeting on 3.8 to take place on first week of
January, 2016  (atinm, 12:33:23)

* Open Floor  (atinm, 12:34:02)
  * LINK:
http://www.gluster.org/pipermail/gluster-devel/2015-November/047125.html
(atinm, 12:37:25)
  * ACTION: rastar to continue the discussion on rebase+fast forward as
an option to gerrit submit type  (atinm, 12:40:49)

Meeting ended at 12:49:26 UTC.




Action Items

* ndevos to send out a reminder to the maintainers about more actively
  enforcing backports of bugfixes
* raghu to call for volunteers and help from maintainers for doing
  backports listed by rwareing to 3.6.8
* rastar and msvbhat to publish a test exit criterion for major/minor
  releases on gluster.org
* kshlm & csim to set up faux/pseudo user email for gerrit, bugzilla,
  github
* hagarth to decide on 3.7.7 release manager
* amye to get on top of discussion on long-term releases.
* hagarth to post Gluster Monthly News this week
* hagarth to create 3.6.8 for bugzilla version
* community needs to find out 3.6.8 release manager
* raghu to ask for volunteers for release manager for 3.6.8
* kkeithley_ to send a mail about using sanity checker tools in the
  codebase
* rastar to continue the discussion on rebase+fast forward as an option
  to gerrit submit type




Action Items, by person
---
* kkeithley_
  * kkeithley_ to send a mail about using sanity checker tools in the
codebase
* msvbhat
  * rastar and msvbhat to publish a test exit criterion for major/minor
releases on gluster.org
* raghu
  * raghu to call for volunteers and help from maintainers for doing
backports listed by rwareing to 3.6.8
  * raghu to ask for volunteers for release manager for 3.6.8
* rastar
  * rastar and msvbhat to publish a test exit criterion for major/minor
releases on gluster.org
  * rastar to continue the discussion on rebase+fast forward as an
option to gerrit submit type
* **UNASSIGNED**
  * ndevos to send out a reminder to the maintainers about more actively
enforcing backports of bugfixes
  * kshlm & csim to set up faux/pseudo user email for gerrit, bugzilla,
github
  * hagarth to decide on 3.7.7 release manager
  * amye to get on top of discussion on long-term releases.
  * hagarth to post Gluster Monthly News this week
  * hagarth to create 3.6.8 for bugzilla version
  * community needs to find out 3.6.8 release manager




People Present (lines said)
---
* atinm (116)
* obnox (21)
* kkeithley_ (15)
* raghu (10)
* rastar (10)
* jiffin (7)
* rafi (5)
* hgowtham (4)
* anoopcs (3)
* zodbot (3)
* pranithk (3)
* Manikandan (2)
* skoduri (1)
* msvbhat (1)
* ggarg (1)
* partner (1)
* rjoseph (1)

Cheers,
Atin
___
Gluster-users mailing list
Gluster-users@gluster.org

Re: [Gluster-users] Sharding - what next?

2015-12-09 Thread Lindsay Mathieson
Hi Guys, sorry for the late reply, my attention tends to be somewhat 
sporadic due to work and the large number of rescue dogs/cats I care for :)


On 3/12/2015 8:34 PM, Krutika Dhananjay wrote:
We would love to hear from you on what you think of the feature and 
where it could be improved.

Specifically, the following are the questions we are seeking feedback on:
a) your experience testing sharding with VM store use-case - any bugs 
you ran into, any performance issues, etc


Testing was initially somewhat stressful as I regularly encountered file 
corruption. However I don't think that was due to bugs, rather to incorrect 
settings for the VM use case. Once I got that sorted out it has been very 
stable - I have really stressed the failure modes we run into at work: 
nodes going down while heavy writes were happening, live migrations during 
heals, gluster software being killed while VMs were running on the host. 
So far it's held up without a hitch.


To that end, one thing I think should be made more obvious is the 
settings required for VM Hosting:


   quick-read=off
   read-ahead=off
   io-cache=off
   stat-prefetch=off
   eager-lock=enable
   remote-dio=enable
   quorum-type=auto
   server-quorum-type=server

They are quite crucial and very easy to miss in the online docs. And 
they are only listed as recommendations, with no mention that you will 
corrupt KVM VMs if you live migrate them between gluster nodes without 
them set. Also, the virt group is missing from the Debian packages.
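
For anyone applying these by hand, a minimal sketch (assuming the 
"datastore1" volume name used later in this message; the full option names 
are the usual gluster equivalents of the short names above):

   # Apply the VM-hosting options one at a time...
   gluster volume set datastore1 performance.quick-read off
   gluster volume set datastore1 performance.read-ahead off
   gluster volume set datastore1 performance.io-cache off
   gluster volume set datastore1 performance.stat-prefetch off
   gluster volume set datastore1 cluster.eager-lock enable
   gluster volume set datastore1 network.remote-dio enable
   gluster volume set datastore1 cluster.quorum-type auto
   gluster volume set datastore1 cluster.server-quorum-type server
   # ...or, where the virt group file is installed, apply them in one shot:
   gluster volume set datastore1 group virt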


Setting them does seem to have slowed sequential writes by about 10% but 
I need to test that more.



Something related - sharding is useful because it makes heals much more 
granular and hence faster. To that end it would be really useful if 
there was a heal info variant that gave an overview of the process - 
rather than listing the shards that are being healed, just an aggregate 
total, e.g.


$ gluster volume heal datastore1 status
volume datastore1
- split brain: 0
- Wounded:65
- healing:4

It gives one an easy sense of progress - heals aren't happening faster, 
but it would feel that way :)



Also, it would be great if the heal info command could return faster, 
sometimes it takes over a minute.


Thanks for the great work,

Lindsay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] GlusterFS Volume - Slow Copy Performance

2015-12-09 Thread Srikanth Mampilakal
Hi,


I have a newly set up Gluster file service used as shared storage, where a
content management system uses it as its document root. I have run into a
performance issue with the gluster/fuse client.

Looking for your thoughts and experience in resolving Gluster performance
issues:

Gluster Infrastructure

Gluster version :GlusterFS 3.7.6

2 gluster nodes  of the same config below

Redhat EL7.0-64
Memory : 4GB
Processor : 2 x 2.0 Ghz
Network : 100 Mbps
File Storage Volume : NETAPP Storage LUN with 2.0 IOPS/GB

Gluster Volume information:

[root@GlusterFileServe1 ~]# gluster volume info

Volume Name: prodcmsroot
Type: Replicate
Volume ID: f1284bf0-1939-46f9-a672-a7716e362947
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: Server1:/glusterfs/brick1/prodcmsroot
Brick2: Server2:/glusterfs/brick1/prodcmsroot
Options Reconfigured:
performance.io-thread-count: 64
performance.cache-size: 1073741824
performance.readdir-ahead: on
performance.write-behind-window-size: 524288

[root@GlusterFileServe1  ~]#

Replication between the Gluster nodes is quick and consistent.

The Apache web servers access the Gluster volume using the native gluster
fuse client and are located in the same VLAN as the Gluster servers.

GlusterFileServe1:/prodcmsroot  /mnt/glusterfs glusterfs
direct-io-mode=disable,defaults,_netdev 0 0

The server utilization (memory, CPU, network and disk I/O) is relatively low.

I am experiencing very slow performance while copying multiple files/folders
(approx 75 MB): it takes at least approx 35 min. Even copying a folder (with
multiple files/subfolders) within the Gluster volume takes the same time.

However, if I do dd to check the copy speed, I get the below result.

[root@ClientServer ~]#  time sh -c "dd if=/dev/zero
of=/mnt/testmount/test.tmp bs=4k count=20000 && sync"
20000+0 records in
20000+0 records out
81920000 bytes (82 MB) copied, 17.1357 s, 4.8 MB/s

real    0m17.337s
user    0m0.031s
sys     0m0.317s
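
A minimal sketch for separating per-file round-trip latency from streaming
throughput, assuming the same /mnt/testmount mount as above and a
hypothetical scratch directory that is safe to create and delete:

   # Time the creation of 1000 small files; if this takes minutes while dd
   # streams at ~5 MB/s, per-file operations (lookup/create/close) are the
   # bottleneck rather than raw bandwidth.
   mkdir -p /mnt/testmount/latency-test
   time sh -c 'for i in $(seq 1 1000); do
       dd if=/dev/zero of=/mnt/testmount/latency-test/f$i bs=4k count=1 2>/dev/null
   done; sync'
   rm -rf /mnt/testmount/latency-test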


If anyone has faced and resolved a similar issue, please let me know your
thoughts.

Cheers

-- 
Cheers
Shrikanth
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Strange file corruption

2015-12-09 Thread Udo Giacomozzi

Am 09.12.2015 um 14:39 schrieb Lindsay Mathieson:


Udo, it occurs to me that if your VM's were running on #2 & #3 and you 
live migrated them to #1 prior to rebooting #2/3, then you would 
indeed rapidly get progressive VM corruption.


However it wouldn't be due to the heal process, but rather the live 
migration with "performance.stat-prefetch" on. This always leads to 
qcow2 files becoming corrupted and unusable.


Nope. All VMs were running on #1, no exception.
Nodes #2 and #3 never had a VM running on them, so they were pratically 
idle since their installation.


Basically I set up node #1, including all VMs.
Then I've installed nodes #2 and #3, configured Proxmox and Gluster 
cluster and then waited quite some time until Gluster had synced up 
nodes #2 and #3 (healing).
From then on, I've rebooted nodes 2 & 3, but in theory these nodes 
never had to do any writes to the Gluster volume at all.


If you're interested, you can read about my upgrade strategy in this 
Proxmox forum post: 
http://forum.proxmox.com/threads/24990-Upgrade-3-4-HA-cluster-to-4-0-via-reinstallation-with-minimal-downtime?p=125040#post125040


Also, It seems rather strange to me that pratically all ~15 VMs  (!) 
suffered from data corruption. It's like if Gluster considered node #2 
or #3 to be ahead and it "healed" in the wrong direction. I don't know..


BTW, once I understood what was going on, /with the problematic 
"healing" still in progress/, I was able to overwrite the bad images 
(still active on #1) by using standard Proxmox backup-restore and 
Gluster handled it correctly.



Anway, I really love the simplicity of Gluster (setting up and 
maintaining a cluster is extremely easy), but these healing issues are 
causing some headache to me... ;-)


Udo

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster 3.7.5 - S57glusterfind-delete-post.py error

2015-12-09 Thread Marco Lorenzo Crociani

Directory is present

# ls -la /var/lib/glusterd/
totale 60
drwxr-xr-x. 13 root root 4096  3 dic 15:34 .
drwxr-xr-x. 25 root root 4096  9 dic 12:40 ..
drwxr-xr-x.  3 root root 4096 24 ott 10:06 bitd
-rw---.  1 root root   66  3 dic 15:34 glusterd.info
drwxr-xr-x.  3 root root 4096  2 dic 17:24 glustershd
drwxr-xr-x.  2 root root 4096 24 ott 16:44 groups
drwxr-xr-x.  3 root root 4096  7 ott 18:27 hooks
drwxr-xr-x.  3 root root 4096  2 dic 17:24 nfs
-rw---.  1 root root   24  9 lug 18:19 options
drwxr-xr-x.  2 root root 4096  3 dic 15:34 peers
drwxr-xr-x.  3 root root 4096  2 dic 17:24 quotad
drwxr-xr-x.  3 root root 4096 24 ott 10:06 scrub
drwxr-xr-x.  2 root root 4096  3 dic 15:34 snaps
drwxr-xr-x.  2 root root 4096 24 ott 16:44 ss_brick
drwxr-xr-x.  8 root root 4096 11 nov 14:03 vols


On 09/12/2015 12:44, Aravinda wrote:
Thanks. I will fix the issue. Was directory /var/lib/glusterd deleted 
post installation? (During any cleanup process)


This cleanup script was expecting the /var/lib/glusterd/glusterfind 
directory to be present. I will now change the script to ignore the case 
where that directory is not present.
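
Until that fix is available, a possible workaround sketch (an assumption on 
my part, not part of the official patch) is simply to recreate the directory 
the hook expects:

   # Hypothetical workaround: create the directory the hook script lists,
   # so os.listdir() no longer fails with ENOENT.
   mkdir -p /var/lib/glusterd/glusterfind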


Opened a bug for the same and sent patch to fix the issue. (Once 
review complete, we will make it available in 3.7.7 release)

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1289935
Patch: http://review.gluster.org/#/c/12923/

Thanks for reporting the issue.

regards
Aravinda

On 12/09/2015 03:35 PM, Marco Lorenzo Crociani wrote:

Hi,

# /var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py 
--volname=VOL_ZIMBRA

Traceback (most recent call last):
  File 
"/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py", line 
60, in <module>

main()
  File 
"/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py", line 
43, in main

for session in os.listdir(glusterfind_dir):
OSError: [Errno 2] No such file or directory: 
'/var/lib/glusterd/glusterfind'



# which glusterfind
/usr/bin/glusterfind


Regards,

Marco Crociani

On 07/12/2015 14:45, Aravinda wrote:
Looks like it failed to execute the cleanup script as part of the volume 
delete.


Please run the following command in the failed node and let us know 
the output and return code.


/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py 
--volname=VOL_ZIMBRA

echo $?

This error can be ignored if not using Glusterfind.
regards
Aravinda
On 11/11/2015 06:55 PM, Marco Lorenzo Crociani wrote:

Hi,
I removed one volume from the ovirt console.
oVirt 3.5.4
Gluster 3.7.5
CentOS release 6.7

In the logs there where these errors:


[2015-11-11 13:03:29.783491] I [run.c:190:runner_log] 
(-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(+0x5fc75) 
[0x7fc7d6002c75] 
-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x4cc) 
[0x7fc7d60920bc] -->/usr/lib64/libglusterfs.so.0(runner_log+0x11e) 
[0x7fc7e162868e] ) 0-management: Ran script: 
/var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh 
--volname=VOL_ZIMBRA --last=no
[2015-11-11 13:03:29.789594] E [run.c:190:runner_log] 
(-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(+0x5fc75) 
[0x7fc7d6002c75] 
-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x470) 
[0x7fc7d6092060] -->/usr/lib64/libglusterfs.so.0(runner_log+0x11e) 
[0x7fc7e162868e] ) 0-management: Failed to execute script: 
/var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh 
--volname=VOL_ZIMBRA --last=no
[2015-11-11 13:03:29.790807] I [MSGID: 106132] 
[glusterd-utils.c:1371:glusterd_service_stop] 0-management: brick 
already stopped
[2015-11-11 13:03:31.108959] I [MSGID: 106540] 
[glusterd-utils.c:4105:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered MOUNTV3 successfully
[2015-11-11 13:03:31.109881] I [MSGID: 106540] 
[glusterd-utils.c:4114:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered MOUNTV1 successfully
[2015-11-11 13:03:31.110725] I [MSGID: 106540] 
[glusterd-utils.c:4123:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered NFSV3 successfully
[2015-11-11 13:03:31.111562] I [MSGID: 106540] 
[glusterd-utils.c:4132:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered NLM v4 successfully
[2015-11-11 13:03:31.112396] I [MSGID: 106540] 
[glusterd-utils.c:4141:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered NLM v1 successfully
[2015-11-11 13:03:31.113225] I [MSGID: 106540] 
[glusterd-utils.c:4150:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered ACL v3 successfully
[2015-11-11 13:03:32.212071] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd 
already stopped
[2015-11-11 13:03:32.212862] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub 
already stopped
[2015-11-11 13:03:32.213099] I [MSGID: 106144] 
[glusterd-pmap.c:274:pmap_registry_remove] 0-pmap: removing brick 
/gluster/VOL_ZIMBRA/brick on port 49191
[2015-11-11 13:03:32.282685] I [MSGID: 106144] 
[glusterd-pmap.c:274:pmap_registry_remove] 0-pmap: removing brick 

Re: [Gluster-users] Strange file corruption

2015-12-09 Thread Udo Giacomozzi

Am 08.12.2015 um 07:57 schrieb Krutika Dhananjay:


quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server


Perfectly put. I am one of the devs who work on replicate module. You 
can alternatively enable this configuration in one shot using the 
following command for VM workloads:

# gluster volume set <volname> group virt



The Red Hat guide states:


After the volume is tagged using |group virt| command, you must not 
use the volume for any other storage purpose, other than to store 
virtual machine images. Also, ensure to access the volume only through 
gluster native client. 


Okay, I'll give it a try.
I've created a new volume, set the options mentioned above, and I'm 
accessing it via native GlusterFS (using the 127.0.0.1 address). I've moved 
some non-critical VMs to that storage.
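
A quick way to confirm the group actually took effect (a sketch; substitute
your own volume name) is to check the reconfigured options on the volume:

   # The options applied by "group virt" should appear under
   # "Options Reconfigured" in the output.
   gluster volume info <volname>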

We'll see if that works fine... :-)

Udo
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Strange file corruption

2015-12-09 Thread Lindsay Mathieson



On 7/12/2015 9:03 PM, Udo Giacomozzi wrote:
All VMs were running on machine #1 - the two other machines (#2 and 
#3) were *idle*.

Gluster was fully operating (no healing) when I rebooted machine #2.
For other reasons I had to reboot machines #2 and #3 a few times, but 
since all VMs were running on machine #1 and nothing on the other 
machines was accessing Gluster files, I was confident that this 
wouldn't disturb Gluster.
But anyway this means that I rebooted Gluster nodes during a healing 
process.


After a few minutes, Gluster files began showing corruption - up to 
the point that the qcow2 files became unreadable and all VMs stopped 
working.


Udo, it occurs to me that if your VM's were running on #2 & #3 and you 
live migrated them to #1 prior to rebooting #2/3, then you would indeed 
rapidly get progressive VM corruption.


However it wouldn't be due to the heal process, but rather the live 
migration with "performance.stat-prefetch" on. This always leads to 
qcow2 files becoming corrupted and unusable.


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Gluster - Performance issue while copying bulk files/folders

2015-12-09 Thread Srikanth Mampilakal
Hi,


I have a production Gluster file service used as shared storage, where a
content management system uses it as its document root. I have run into a
performance issue with the gluster/fuse client.

Looking for your thoughts and experience in resolving Gluster performance
issues:

Gluster Infrastructure

Gluster version :GlusterFS 3.7.6

2 gluster nodes of the same config below

Redhat EL7.0-64
Memory : 4GB
Processor : 2 x 2.0 Ghz
Network : 100 Mbps
File Storage Volume : NETAPP Storage LUN with 2.0 IOPS/GB

Gluster Volume information:

[root@GlusterFileServe1 ~]# gluster volume info

Volume Name: prodcmsroot
Type: Replicate
Volume ID: f1284bf0-1939-46f9-a672-a7716e362947
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: Server1:/glusterfs/brick1/prodcmsroot
Brick2: Server2:/glusterfs/brick1/prodcmsroot
Options Reconfigured:
performance.io-thread-count: 64
performance.cache-size: 1073741824
performance.readdir-ahead: on
performance.write-behind-window-size: 524288

[root@GlusterFileServe1  ~]#

Replication between the Gluster nodes is quick and consistent.

The Apache web servers access the Gluster volume using the native gluster
fuse client and are located in the same VLAN as the Gluster servers.

GlusterFileServe1:/prodcmsroot  /mnt/glusterfs glusterfs
direct-io-mode=disable,defaults,_netdev 0 0

The server utilization (memory, CPU, network and disk I/O) is relatively low.

I am experiencing very slow performance while copying multiple files/folders
(approx 75 MB): it takes at least approx 35 min. Even copying a folder (with
multiple files/subfolders) within the Gluster volume takes the same time.

However, if I do dd to check the copy speed, I get the below result.

[root@ClientServer ~]#  time sh -c "dd if=/dev/zero
of=/mnt/testmount/test.tmp bs=4k count=20000 && sync"
20000+0 records in
20000+0 records out
81920000 bytes (82 MB) copied, 17.1357 s, 4.8 MB/s

real    0m17.337s
user    0m0.031s
sys     0m0.317s


If anyone has experienced the same kind of performance issue, please let me
know your thoughts.

Cheers
Srikanth
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster 3.7.5 - S57glusterfind-delete-post.py error

2015-12-09 Thread Marco Lorenzo Crociani

Hi,

# /var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py 
--volname=VOL_ZIMBRA

Traceback (most recent call last):
  File 
"/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py", 
line 60, in <module>

main()
  File 
"/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py", 
line 43, in main

for session in os.listdir(glusterfind_dir):
OSError: [Errno 2] No such file or directory: 
'/var/lib/glusterd/glusterfind'



# which glusterfind
/usr/bin/glusterfind


Regards,

Marco Crociani

On 07/12/2015 14:45, Aravinda wrote:

Looks like it failed to execute the cleanup script as part of the volume delete.

Please run the following command in the failed node and let us know 
the output and return code.


/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py 
--volname=VOL_ZIMBRA

echo $?

This error can be ignored if not using Glusterfind.
regards
Aravinda
On 11/11/2015 06:55 PM, Marco Lorenzo Crociani wrote:

Hi,
I removed one volume from the ovirt console.
oVirt 3.5.4
Gluster 3.7.5
CentOS release 6.7

In the logs there where these errors:


[2015-11-11 13:03:29.783491] I [run.c:190:runner_log] 
(-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(+0x5fc75) 
[0x7fc7d6002c75] 
-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x4cc) 
[0x7fc7d60920bc] -->/usr/lib64/libglusterfs.so.0(runner_log+0x11e) 
[0x7fc7e162868e] ) 0-management: Ran script: 
/var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh 
--volname=VOL_ZIMBRA --last=no
[2015-11-11 13:03:29.789594] E [run.c:190:runner_log] 
(-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(+0x5fc75) 
[0x7fc7d6002c75] 
-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x470) 
[0x7fc7d6092060] -->/usr/lib64/libglusterfs.so.0(runner_log+0x11e) 
[0x7fc7e162868e] ) 0-management: Failed to execute script: 
/var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh 
--volname=VOL_ZIMBRA --last=no
[2015-11-11 13:03:29.790807] I [MSGID: 106132] 
[glusterd-utils.c:1371:glusterd_service_stop] 0-management: brick 
already stopped
[2015-11-11 13:03:31.108959] I [MSGID: 106540] 
[glusterd-utils.c:4105:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered MOUNTV3 successfully
[2015-11-11 13:03:31.109881] I [MSGID: 106540] 
[glusterd-utils.c:4114:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered MOUNTV1 successfully
[2015-11-11 13:03:31.110725] I [MSGID: 106540] 
[glusterd-utils.c:4123:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered NFSV3 successfully
[2015-11-11 13:03:31.111562] I [MSGID: 106540] 
[glusterd-utils.c:4132:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered NLM v4 successfully
[2015-11-11 13:03:31.112396] I [MSGID: 106540] 
[glusterd-utils.c:4141:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered NLM v1 successfully
[2015-11-11 13:03:31.113225] I [MSGID: 106540] 
[glusterd-utils.c:4150:glusterd_nfs_pmap_deregister] 0-glusterd: 
De-registered ACL v3 successfully
[2015-11-11 13:03:32.212071] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd 
already stopped
[2015-11-11 13:03:32.212862] I [MSGID: 106132] 
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub 
already stopped
[2015-11-11 13:03:32.213099] I [MSGID: 106144] 
[glusterd-pmap.c:274:pmap_registry_remove] 0-pmap: removing brick 
/gluster/VOL_ZIMBRA/brick on port 49191
[2015-11-11 13:03:32.282685] I [MSGID: 106144] 
[glusterd-pmap.c:274:pmap_registry_remove] 0-pmap: removing brick 
/gluster/VOL_ZIMBRA/brick3 on port 49168
[2015-11-11 13:03:32.364079] I [MSGID: 101053] 
[mem-pool.c:616:mem_pool_destroy] 0-management: size=588 max=1 total=1
[2015-11-11 13:03:32.364111] I [MSGID: 101053] 
[mem-pool.c:616:mem_pool_destroy] 0-management: size=124 max=1 total=1
[2015-11-11 13:03:32.374604] I [MSGID: 101053] 
[mem-pool.c:616:mem_pool_destroy] 0-management: size=588 max=1 total=1
[2015-11-11 13:03:32.374640] I [MSGID: 101053] 
[mem-pool.c:616:mem_pool_destroy] 0-management: size=124 max=1 total=1
[2015-11-11 13:03:41.906892] I [MSGID: 106495] 
[glusterd-handler.c:3049:__glusterd_handle_getwd] 0-glusterd: 
Received getwd req
[2015-11-11 13:03:41.910931] E [run.c:190:runner_log] 
(-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(+0xef3d2) 
[0x7fc7d60923d2] 
-->/usr/lib64/glusterfs/3.7.5/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x470) 
[0x7fc7d6092060] -->/usr/lib64/libglusterfs.so.0(runner_log+0x11e) 
[0x7fc7e162868e] ) 0-management: Failed to execute script: 
/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post.py 
--volname=VOL_ZIMBRA




___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users





--
Marco Crociani
Prisma Telecom Testing S.r.l.
via Petrocchi, 4  20127 MILANO  ITALY
Phone:  +39 02 26113507
Fax:  +39 02 26113597
e-mail:  mar...@prismatelecomtesting.com
web:  http://www.prismatelecomtesting.com


Re: [Gluster-users] Strange file corruption

2015-12-09 Thread Udo Giacomozzi

Am 08.12.2015 um 02:59 schrieb Lindsay Mathieson:
Hi Udo, thanks for posting your volume info settings. Please note for 
the following, I am not one of the devs, just a user, so unfortunately 
I have no authoritative answers :(


I am running a very similar setup - Proxmox 4.0, three nodes, but 
using ceph for our production storage. Am heavily testing gluster 3.7 
on the side. We find the performance of ceph slow on these small 
setups and management of it a PITA.



Some more questions

- how are your VM images being accessed by Proxmox? gfapi? (Proxmox 
Gluster storage type) or by using the fuse mount?




Sorry, forgot to say that: I'm accessing the Gluster Storage via NFS 
since (at least in version 3.4 of Proxmox) the gfapi method has some 
problems with sockets.



- whats your underlying filesystem (ext4, zfs etc)


a dedicated ext4 partition


- Are you using the HA/Watchdog system in Proxmox?


I am now (watchdog HA), but Proxmox was running in non-HA mode at the 
time of failure.







On 07/12/15 21:03, Udo Giacomozzi wrote:
Yesterday I had a strange situation where Gluster healing corrupted 
*all* my VM images.



In detail:
I had about 15 VMs running (in Proxmox 4.0) totaling about 600 GB of 
qcow2 images. Gluster is used as storage for those images in 
replicate 3 setup (ie. 3 physical servers replicating all data).
All VMs were running on machine #1 - the two other machines (#2 and 
#3) were *idle*.

Gluster was fully operating (no healing) when I rebooted machine #2.
For other reasons I had to reboot machines #2 and #3 a few times, but 
since all VMs were running on machine #1 and nothing on the other 
machines was accessing Gluster files, I was confident that this 
wouldn't disturb Gluster.
But anyway this means that I rebooted Gluster nodes during a healing 
process.


After a few minutes, Gluster files began showing corruption - up to 
the point that the qcow2 files became unreadable and all VMs stopped 
working. 



:( sounds painful - my sympathies.

You're running 3.5.2 - that's getting rather old. I use the gluster 
debian repos:


  3.6.7 : 
http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/Debian/
  3.7.6 : 
http://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/jessie/


3.6.x is the latest stable; 3.7 is close to stable(?). 3.7 has some 
nice new features such as sharding, which is very useful for VM 
hosting - it enables much faster heal times.
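
For reference, a sketch of turning sharding on for a volume (hypothetical 
volume name and an example shard size; option names as in the 3.7 series):

   # Enable sharding and pick a shard block size for newly created files.
   gluster volume set <volname> features.shard on
   gluster volume set <volname> features.shard-block-size 64MB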


I'm using the most current version provided by Proxmox APT sources.

Can 3.6 or even 3.7 Gluster nodes work together with a 3.5 node? If not, 
I'm wondering how I could upgrade...


You might understand that I hesitate a bit to upgrade Gluster without 
having some certainty that it won't make things even worse. I mean, this 
is a production system..




Regards what happened with your VM's, I'm not sure. Having two servers 
down should have disabled the entire store making it not readable or 
writable. 


I'm not sure if both servers were down at the same time (could be, 
though). I'm just sure that I rebooted them rather quickly in sequence.


Right now my credo is "/never ever reboot/shutdown more than 1 node at a 
time and most importantly, always make sure that no Gluster healing is 
in progress/". For sure, I did not respect that when I crashed my storage.
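
A minimal pre-reboot check along those lines (a sketch, assuming the
"systems" volume name used elsewhere in this thread):

   # List any entries still pending heal; an empty list on every brick
   # suggests it is safe to take one node down.
   gluster volume heal systems info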


I note that you are missing some settings that need to be set for VM 
stores - there will be corruption problems if you live migrate without 
them.


quick-read=off
read-ahead=off
io-cache=off
stat-prefetch=off
eager-lock=enable
remote-dio=enable
quorum-type=auto
server-quorum-type=server


"stat-prefetch=off" is particularly important.



Thanks. Is there a document that explains the reasoning behind this config?

Does this apply to volumes for virtual HDD images only? My "docker-repo" 
is still "replicate 3" type but is used by the VMs themselves, not by 
the hypervisor - I guess other settings apply there..


Thanks a lot,
Udo

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Strange file corruption

2015-12-09 Thread Lindsay Mathieson



On 10/12/2015 3:15 AM, Udo Giacomozzi wrote:

This were the commands executed on node #2 during step 6:

gluster volume add-brick "systems" replica 3
metal1:/data/gluster/systems
gluster volume heal "systems" full   # to trigger sync


Then I waited for replication to finish before doing anything else 
(about 1 hour or maybe more), checking _gluster volume heal "systems" 
info_



Did you execute the heal command from host #2? Might be related to a 
possible issue I encountered during testing adding bricks recently, 
still in the process of recreating and testing the issue.


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster nfs-ganesha enable fails and is driving me crazy

2015-12-09 Thread Marco Antonio Carcano

Hi Kaleb,

thank you very much for the quick reply

I tried what you suggested, but I got the same error

I tried both

HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_glstr01.carcano.local="192.168.65.250"
VIP_glstr02.carcano.local="192.168.65.251"

as well as

HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_glstr01_carcano_local="192.168.65.250"
VIP_glstr02_carcano_local="192.168.65.251"

Finally I reinstalled everything and also tried with the hostname rather 
than the FQDN, that is


HA_NAME="ganesha-ha-360"
HA_VOL_SERVER="glstr01"
HA_CLUSTER_NODES="glstr01v,glstr02v"
VIP_glstr01v="192.168.65.250"
VIP_glstr02v="192.168.65.251"

but still no luck - Maybe do you have any other advice?

Kind regards

Marco


On 08/12/15 13:30, Kaleb KEITHLEY wrote:

On 12/08/2015 03:46 AM, Marco Antonio Carcano wrote:

Hi,

/etc/ganesha/ganesha-ha.conf

HA_NAME="ganesha-ha-360"
HA_VOL_SERVER="glstr01.carcano.local"
HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_server1="192.168.65.250"
VIP_server2="192.168.65.251"


change your /etc/ganesha/ganesha-ha.conf file:

HA_NAME="ganesha-ha-360"
HA_VOL_SERVER="glstr01.carcano.local"
HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_glstr01.carcano.local="192.168.65.250"
VIP_glstr02.carcano.local="192.168.65.251"

I'd change the HA_NAME to something else too, but as long as you don't
set up another cluster on the same network you should be fine.

--

Kaleb



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] libgfapi access

2015-12-09 Thread Ankireddypalle Reddy
Hi,
 I upgraded my setup to gluster 3.7.3. I tested writes by performing 
writes through fuse and through libgfapi. Attached are the profiles generated 
from fuse and libgfapi. The test programs essentially writes 1 blocks each 
of 128K.

[root@santest2 Base]# time ./GlusterFuseTest /ws/glus 131072 1
Mount path: /ws/glus
Block size: 131072
Num of blocks: 1
Will perform write test on mount path : /ws/glus
Succesfully created file /ws/glus/1449697583.glfs
Successfully filled file /ws/glus/1449697583.glfs
Write test succeeded
Write test succeeded.

real    0m18.722s
user    0m3.913s
sys     0m1.126s

[root@santest2 Base]# time ./GlusterLibGFApiTest dispersevol santest2 24007 
131072 1
Host name: santest2
Volume: dispersevol
Port: 24007
Block size: 131072
Num of blocks: 1
Will perform write test on volume: dispersevol
Successfully filled file 1449697651.glfs
Write test succeeded
Write test succeeded.

real    0m18.630s
user    0m8.804s
sys     0m1.870s

Thanks and Regards,
Ram

  

-Original Message-
From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com] 
Sent: Wednesday, December 09, 2015 1:39 AM
To: Ankireddypalle Reddy; Vijay Bellur; gluster-users@gluster.org
Subject: Re: [Gluster-users] libgfapi access



On 12/08/2015 08:28 PM, Ankireddypalle Reddy wrote:
> Vijay,
>   We are trying to write data backed up by Commvault simpana to 
> glusterfs volume.  The data being written is around 30 GB. Two kinds of write 
> requests happen.
>   1) 1MB requests
> 2) Small write requests of size 128 bytes. In case of libgfapi access 
> these are cached and a single 128KB write request is made, whereas in case of 
> FUSE the 128 byte write request is handed to FUSE directly.
>
>   glusterfs 3.6.5 built on Aug 24 2015 10:02:43
>
>  Volume Name: dispersevol
>   Type: Disperse
>   Volume ID: c5d6ccf8-6fec-4912-ab2e-6a7701e4c4c0
>   Status: Started
>   Number of Bricks: 1 x (2 + 1) = 3
>   Transport-type: tcp
>   Bricks:
>   Brick1: ssdtest:/mnt/ssdfs1/brick3
>   Brick2: sanserver2:/data/brick3
>   Brick3: santest2:/home/brick3
>   Options Reconfigured:
>   performance.cache-size: 512MB
>   performance.write-behind-window-size: 8MB
>   performance.io-thread-count: 32
>   performance.flush-behind: on
hi,
  Things look okay. Maybe we can find something using the profile info.

Could you post the results of the following operations:
1) gluster volume profile <volname> start
2) Run the fuse workload
3) gluster volume profile <volname> info > /path/to/file-1/to/send/us
4) Run the libgfapi workload
5) gluster volume profile <volname> info > /path/to/file-2/to/send/us

Send both these files to us so we can check what extra fops, if any, are 
sent over the network that may be causing the delay.
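
A concrete sketch of that sequence, using the dispersevol volume from this
thread and hypothetical output paths:

   # Start profiling, capture a snapshot after each workload, then stop.
   gluster volume profile dispersevol start
   # ...run the FUSE workload...
   gluster volume profile dispersevol info > /tmp/profile-fuse.txt
   # ...run the libgfapi workload...
   gluster volume profile dispersevol info > /tmp/profile-libgfapi.txt
   gluster volume profile dispersevol stop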

I see that you are using a disperse volume. If you are going to use disperse 
volumes for production use cases, I suggest you use 3.7.x, preferably 3.7.3. We 
fixed a bug affecting releases 3.7.4 through 3.7.6; the fix will be released in 3.7.7.

Pranith
>
> Thanks and Regards,
> Ram
>
>
> -Original Message-
> From: Vijay Bellur [mailto:vbel...@redhat.com]
> Sent: Monday, December 07, 2015 6:13 PM
> To: Ankireddypalle Reddy; gluster-users@gluster.org
> Subject: Re: [Gluster-users] libgfapi access
>
> On 12/07/2015 10:29 AM, Ankireddypalle Reddy wrote:
>> Hi,
>>
>>  I am trying to use  libgfapi  interface to access gluster 
>> volume. What I noticed is that reads/writes to the gluster volume 
>> through libgfapi interface are slower than FUSE.  I was expecting the 
>> contrary. Are there any recommendations/settings suggested to be used 
>> while using libgfapi interface.
>>
> Can you please provide more details about your tests? Providing information 
> like I/O block size, file size, throughput would be helpful.
>
> Thanks,
> Vijay
>
>
>
>
>
> ***Legal Disclaimer***
> "This communication may contain confidential and privileged material 
> for the sole use of the intended recipient. Any unauthorized review, 
> use or distribution by others is strictly prohibited. If you have 
> received the message by mistake, please advise the sender by reply email and 
> delete the message. Thank you."
> **
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users




***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**


Re: [Gluster-users] Gluster Documentation Update

2015-12-09 Thread Lindsay Mathieson



On 10/12/2015 8:56 AM, Amye Scavarda wrote:
In the interest of making our documentation usable again, we've gone 
through MediaWiki (the old community pages and documentation) and 
found out what was left behind and what needed to be moved over to our 
Github-based wiki pages. We'll be turning those live on Github this 
week, before December 11th.


Brilliant, thanks.

BTW, I still get a lot of dead links from the search on readthedocs



We'd love to make sure that we're not missing anything, so we'll leave 
the MediaWiki up until January 1st, and work to consolidate our 
documentation on ReadTheDocs moving forward.


Let me know if you have questions or want to help QA this!



What would QA involve?

--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Gluster Documentation Update

2015-12-09 Thread Amye Scavarda
In the interest of making our documentation usable again, we've gone
through MediaWiki (the old community pages and documentation) and found out
what was left behind and what needed to be moved over to our Github-based
wiki pages. We'll be turning those live on Github this week, before
December 11th.

We'd love to make sure that we're not missing anything, so we'll leave the
MediaWiki up until January 1st, and work to consolidate our documentation
on ReadTheDocs moving forward.

Let me know if you have questions or want to help QA this!
-- amye

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster Documentation Update

2015-12-09 Thread Humble Devassy Chirammal
HI Lindsay,

>
BTW, I still get a lot of dead links from the search on readthedocs
>

Can you please share couple of examples?

If the dead links were caused by the recent rearrangement of the docs, they
will disappear soon. That said, when we moved the documentation to GitHub, we
kept everything in GitHub and rendered it to ReadTheDocs. However, that
approach was recently revisited, and the documentation now lives mainly in 3
places.

*) Developer documentation: inside the gluster source code repo
(https://github.com/gluster/glusterfs/tree/master/doc/developer-guide)
*) Features/specs: https://github.com/gluster/glusterfs-specs/
*) Admin/user docs: https://github.com/gluster/glusterdocs

If I search for a string (for example: quota) in readthedocs, the below link
may come up in the search results:
http://gluster.readthedocs.org/en/latest/Features/quota-object-count/?highlight=quota

The above is a dead link because the "Features" directory itself has moved
and is now part of
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/quota-object-count.md

Please let us know if you are facing the dead-link issue in other scenarios.





--Humble


On Thu, Dec 10, 2015 at 4:38 AM, Lindsay Mathieson <
lindsay.mathie...@gmail.com> wrote:

>
>
> On 10/12/2015 8:56 AM, Amye Scavarda wrote:
>
>> n the interest of making our documentation usable again, we've gone
>> through MediaWiki (the old community pages and documentation) and found out
>> what was left behind and what needed to be moved over to our Github-based
>> wiki pages. We'll be turning those live on Github this week, before
>> December 11th.
>>
>
> Brilliant, thanks.
>
> BTW, I still get a lot of dead links from the search on readthedocs
>
>
>> We'd love to make sure that we're not missing anything, so we'll leave
>> the MediaWiki up until January 1st, and work to consolidate our
>> documentation on ReadTheDocs moving forward.
>>
>> Let me know if you have questions or want to help QA this!
>>
>
>
> What would QA involve?
>
> --
> Lindsay Mathieson
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster Documentation Update

2015-12-09 Thread Humble Devassy Chirammal
Hi Lindsay,

All the above-mentioned links are broken for the same reason mentioned in
this thread: the "Features" directory has been moved out of that repo and is
now part of "gluster-specs". I am not sure how we can clear the search cache
in readthedocs; I will look into this though.

--Humble


On Thu, Dec 10, 2015 at 11:10 AM, Lindsay Mathieson <
lindsay.mathie...@gmail.com> wrote:

> On 10/12/15 13:04, Humble Devassy Chirammal wrote:
>
>> BTW, I still get a lot of dead links from the search on readthedocs
>> >
>>
>> Can you please share couple of examples?
>>
>
>
> Starting from: http://gluster.readthedocs.org/en/latest, search on
> "tier". Gives the following:
>
> http://gluster.readthedocs.org/en/latest/Features/tier/?highlight=tier
>
>
> http://gluster.readthedocs.org/en/latest/Feature%20Planning/GlusterFS%203.7/Data%20Classification/?highlight=tier
>
>
> http://gluster.readthedocs.org/en/latest/Feature%20Planning/GlusterFS%203.7/Small%20File%20Performance/?highlight=tier
>
> All bad links.
>
> Same for searching on "shard"
>
>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster Documentation Update

2015-12-09 Thread Lindsay Mathieson

On 10/12/15 13:04, Humble Devassy Chirammal wrote:

BTW, I still get a lot of dead links from the search on readthedocs
>

Can you please share couple of examples? 



Starting from: http://gluster.readthedocs.org/en/latest, search on 
"tier". Gives the following:


http://gluster.readthedocs.org/en/latest/Features/tier/?highlight=tier

http://gluster.readthedocs.org/en/latest/Feature%20Planning/GlusterFS%203.7/Data%20Classification/?highlight=tier

http://gluster.readthedocs.org/en/latest/Feature%20Planning/GlusterFS%203.7/Small%20File%20Performance/?highlight=tier

All bad links.

Same for searching on "shard"




___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster nfs-ganesha enable fails and is driving me crazy

2015-12-09 Thread Soumya Koduri



On 12/10/2015 02:51 AM, Marco Antonio Carcano wrote:

Hi Kaleb,

thank you very much for the quick reply

I tried what you suggested, but I got the same error

I tried both

HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_glstr01.carcano.local="192.168.65.250"
VIP_glstr02.carcano.local="192.168.65.251"

as well as

HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_glstr01_carcano_local="192.168.65.250"
VIP_glstr02_carcano_local="192.168.65.251"

Finally I reiinstalled everything and tried also with hostname rather
than FQDN, that is

HA_NAME="ganesha-ha-360"
HA_VOL_SERVER="glstr01"
HA_CLUSTER_NODES="glstr01v,glstr02v"
VIP_glstr01v="192.168.65.250"
VIP_glstr02v="192.168.65.251"

but still no luck - Maybe do you have any other advice?



Could you check the glusterd logs now? Does it still complain with the below error?

[glusterd-ganesha.c:264:glusterd_op_set_ganesha] 0-management: Initial 
NFS-Ganesha set up failed


Also please check below log files for the errors/warnings thrown during 
the setup -


'/var/log/messages'
'/var/log/pacemaker.log'
'/var/log/ganesha.log'
'/var/log/ganesha-gfapi.log'

Thanks.
Soumya


Kind regards

Marco


On 08/12/15 13:30, Kaleb KEITHLEY wrote:

On 12/08/2015 03:46 AM, Marco Antonio Carcano wrote:

Hi,

/etc/ganesha/ganesha-ha.conf

HA_NAME="ganesha-ha-360"
HA_VOL_SERVER="glstr01.carcano.local"
HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_server1="192.168.65.250"
VIP_server2="192.168.65.251"


change your /etc/ganesha/ganesha-ha.conf file:

HA_NAME="ganesha-ha-360"
HA_VOL_SERVER="glstr01.carcano.local"
HA_CLUSTER_NODES="glstr01.carcano.local,glstr02.carcano.local"
VIP_glstr01.carcano.local="192.168.65.250"
VIP_glstr02.carcano.local="192.168.65.251"

I'd change the HA_NAME to something else too, but as long as you don't
set up another cluster on the same network you should be fine.

--

Kaleb



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Sharding - what next?

2015-12-09 Thread Krutika Dhananjay
- Original Message -

> From: "Lindsay Mathieson" 
> To: "Krutika Dhananjay" , "Gluster Devel"
> , "gluster-users" 
> Sent: Wednesday, December 9, 2015 6:48:40 PM
> Subject: Re: Sharding - what next?

> Hi Guys, sorry for the late reply, my attention tends to be somewhat sporadic
> due to work and the large number of rescue dogs/cats I care for :)

> On 3/12/2015 8:34 PM, Krutika Dhananjay wrote:

> > We would love to hear from you on what you think of the feature and where
> > it
> > could be improved.
> 
> > Specifically, the following are the questions we are seeking feedback on:
> 
> > a) your experience testing sharding with VM store use-case - any bugs you
> > ran
> > into, any performance issues, etc
> 

> Testing was initially somewhat stressful as I regularly encountered file
> corruption. However I don't think that was due to bugs, rather to incorrect
> settings for the VM use case. Once I got that sorted out it has been very
> stable - I have really stressed the failure modes we run into at work - nodes
> going down while heavy writes were happening. Live migrations during heals.
> gluster software being killed while VMs were running on the host. So far it's
> held up without a hitch.

> To that end, one thing I think should be made more obvious is the settings
> required for VM Hosting:

> > quick-read=off
> 
> > read-ahead=off
> 
> > io-cache=off
> 
> > stat-prefetch=off
> 
> > eager-lock=enable
> 
> > remote-dio=enable
> 
> > quorum-type=auto
> 
> > server-quorum-type=server
> 

> They are quite crucial and very easy to miss in the online docs. And they are
> only listed as recommendations, with no mention that you will corrupt KVM VMs
> if you live migrate them between gluster nodes without them set. Also the virt
> group is missing from the debian packages.
Hi Lindsay, 
Thanks for the feedback. I will get in touch with Humble to find out what can 
be done about the docs. 

> Setting them does seem to have slowed sequential writes by about 10% but I
> need to test that more.

> Something related - sharding is useful because it makes heals much more
> granular and hence faster. To that end it would be really useful if there
> was a heal info variant that gave an overview of the process - rather than
> listing the shards that are being healed, just an aggregate total, e.g.

> $ gluster volume heal datastore1 status
> volume datastore1
> - split brain: 0
> - Wounded:65
> - healing:4

> It gives one an easy sense of progress - heals aren't happening faster, but
> it would feel that way :)
There is a 'heal-info summary' command under review, written by 
Mohammed Ashiq at http://review.gluster.org/#/c/12154/3, which prints the number 
of files that are yet to be healed. 
It could perhaps be enhanced to also print files in split-brain and files which 
are possibly being healed. Note that these counts are printed per brick; 
it does not print a single set of aggregated counts. Would that be 
something you would consider useful? 

> Also, it would be great if the heal info command could return faster,
> sometimes it takes over a minute.
Yeah, I think part of the problem could be the eager-lock feature, which causes 
the GlusterFS client process to not relinquish the network lock on the file 
soon enough, leaving the heal info utility blocked for a longer duration. 
There is an enhancement Anuradha Talur is working on where heal-info would do 
away with taking locks altogether. Once that is in place, heal-info should 
return faster. 

-Krutika 

> Thanks for the great work,

> Lindsay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Strange file corruption

2015-12-09 Thread Udo Giacomozzi

Am 09.12.2015 um 17:17 schrieb Joe Julian:

A-1) shut down node #1 (the first that is about to be upgraded)
A-2) remove node #1 from the Proxmox cluster (/pvevm delnode "metal1"/)
A-3) remove node #1 from the Gluster volume/cluster (/gluster volume 
remove-brick ... && gluster peer detach "metal1"/)
A-4) install Debian Jessie on node #1, overwriting all data on the 
HDD -*with same Network settings and hostname as before*
A-5) install Proxmox 4.0 on node #1
A-6) install Gluster on node #1 and add it back to the Gluster volume 
(/gluster volume add-brick .../) => shared storage will be complete 
again (spanning 3.4 and 4.0 nodes)
A-7) configure the Gluster volume as shared storage in Proxmox 4 
(node #1)

A-8) configure the external Backup storage on node #1 (Proxmox 4)


Was the data on the gluster brick deleted as part of step 4? 


Yes, all data on physical HDD was deleted (reformatted / repartitioned).


When you remove the brick, gluster will no longer track pending 
changes for that brick. If you add it back in with stale data but 
matching gfids, you would have two clean bricks with mismatching data. 
Did you have to use "add-brick...force"?


No, "force" was not necessary and the added directory 
"/data/gluster/systems" did not exist.


This were the commands executed on node #2 during step 6:

   gluster volume add-brick "systems" replica 3
   metal1:/data/gluster/systems
   gluster volume heal "systems" full   # to trigger sync


Then I waited for replication to finish before doing anything else 
(about 1 hour or maybe more), checking _gluster volume heal "systems" info_



Udo

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users