date:20170216

Re: [Gluster-users] File operation failure on simple distributed volume

2017-02-16 Thread Mohammed Rafi K C

Hi Yonex

Poornima recently fixed a corruption issue in upcall. It seems unlikely to
be the cause of your problem, given that you are running fuse clients. Even
so, I would like to give you a debug build that includes the fix [1] and
adds extra logging.

Would you be able to run the debug build?


[1] : https://review.gluster.org/#/c/16613/

Regards

Rafi KC


[Gluster-users] Machine becomes its own peer

2017-02-16 Thread Scott Hazelhurst


Dear all

Last week I posted a query about a problem I had with a machine that had failed 
but the underlying hard disk with the gluster brick was good. I’ve made some 
progress in restoring. I now have the problem with my new restored machine 
where it becomes its own peer, which then breaks everything.

1. Gluster daemons are off on all peers, content of /var/lib/glusterd/peers 
looks good.
2. I start the gluster daemons on all peers. All looks good.
3. For about 2 minutes, there’s no obvious problem — if I do a gluster peer 
status on any machine it looks good, if I do a gluster volume status A01 on any 
machine it looks good.
4. Then at some point, the /var/lib/glusterd/peers file of the new, restored 
machine gets an entry for itself and things start breaking. A typical error 
message is the understandable 

: Unable to get lock for uuid: 4fb930f7-554e-462a-9204-4592591feeb8, lock held 
by: 4fb930f7-554e-462a-9204-4592591feeb8 

5. This is repeatable: if I stop the daemons, remove the offending entry in
/var/lib/glusterd/peers, and restart, the same behavior occurs. All is good for a
minute or two, and then something puts the entry back in
/var/lib/glusterd/peers.
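
To catch the moment described in steps 4 and 5, a small check can compare the local UUID against the peer entries. The following is only a sketch. It assumes the usual on-disk layout: a `glusterd.info` file with a `UUID=` line, and one file per peer under `peers/` with a `uuid=` line. The function names are my own and nothing here is part of gluster itself.

```python
from pathlib import Path
from typing import Dict, List


def read_kv(path: Path) -> Dict[str, str]:
    """Parse a simple key=value file into a dict (lines without '=' are ignored)."""
    pairs: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            pairs[key.strip()] = value.strip()
    return pairs


def self_peer_entries(glusterd_dir: Path) -> List[Path]:
    """Return peer files whose uuid (or file name) equals the local node's UUID.

    Assumed layout: <glusterd_dir>/glusterd.info and <glusterd_dir>/peers/<uuid>.
    """
    local_uuid = read_kv(glusterd_dir / "glusterd.info").get("UUID", "").lower()
    peers_dir = glusterd_dir / "peers"
    if not local_uuid or not peers_dir.is_dir():
        return []
    offenders: List[Path] = []
    for peer_file in sorted(peers_dir.iterdir()):
        if not peer_file.is_file():
            continue
        peer_uuid = read_kv(peer_file).get("uuid", "").lower()
        if peer_uuid == local_uuid or peer_file.name.lower() == local_uuid:
            offenders.append(peer_file)
    return offenders
```

Calling `self_peer_entries(Path("/var/lib/glusterd"))` every few seconds after starting the daemons would show when the self-entry appears. It only reads files and changes nothing.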

In a previous step in restoring my machine, I had a different error of 
mismatching cksums and what I did then may be the cause of the problem.  In 
searching the list archives I found someone with a similar cksum problem, and 
the proposed solution was to copy the /var/lib/glusterd/vols/ from another of 
the peers to the new machine. This may not be the issue but this is the only 
thing I think I did that was unconventional.

I am running version 3.7.5-19 on Scientific Linux 6.8

If anyone can suggest a way forward I would be grateful

Many thanks

Scott


 


http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 90 Brick/Server suggestions?

2017-02-16 Thread Serkan Çoban

>We have 12 on order.  Actually the DSS7000 has two nodes in the chassis,
>and  each accesses 45 bricks.  We will be using an erasure code scheme
>probably 24:3 or 24:4, we have not sat down and really thought about the
>exact scheme we will use.

If we cannot get the 1 node/90 disk configuration, we can also get it as 2
nodes with 45 disks each.

Be careful about EC. I am using 16+4 in production; the only drawback is
slow rebuild times. It takes 10 days to rebuild an 8TB disk. Although
parallel heal for EC improves this in 3.9, don't forget to test rebuild
times for different EC configurations.
>90 disks per server is a lot.  In particular, it might be out of balance with
>other characteristics of the machine - number of cores, amount of memory, network
>or even bus bandwidth

Nodes will be pretty powerful, 2x18 core CPUs with 256GB RAM and 2X10Gb bonded
ethernet. It will be used for archive purposes so I don't need more
than 1GB/s/node.
RAID is not an option, JBOD with EC will be used.

>gluster volume set all cluster.brick-multiplex on
I just read the 3.10 release notes and saw this. I think this is a
good solution.
I plan to use 3.10.x, and will probably test multiplexing and get in
touch for help.

Thanks for the suggestions,
Serkan


On Fri, Feb 17, 2017 at 1:39 AM, Jeff Darcy  wrote:
>> We are evaluating the Dell DSS7000 chassis with 90 disks.
>> Has anyone used that many bricks per server?
>> Any suggestions or advice?
>
> 90 disks per server is a lot.  In particular, it might be out of balance with 
> other characteristics of the machine - number of cores, amount of memory, 
> network or even bus bandwidth.  Most people who put that many disks in a 
> server use some sort of RAID (HW or SW) to combine them into a smaller number 
> of physical volumes on top of which filesystems and such can be built.  If 
> you can't do that, or don't want to, you're in poorly explored territory.  My 
> suggestion would be to try running as 90 bricks.  It might work fine, or you 
> might run into various kinds of contention:
>
> (1) Excessive context switching would indicate not enough CPU.
>
> (2) Excessive page faults would indicate not enough memory.
>
> (3) Maxed-out network ports . . . well, you can figure that one out.  ;)
>
> If (2) applies, you might want to try brick multiplexing.  This is a new 
> feature in 3.10, which can reduce memory consumption by more than 2x in many 
> cases by putting multiple bricks into a single process (instead of one per 
> brick).  This also drastically reduces the number of ports you'll need, since 
> the single process only needs one port total instead of one per brick.  In 
> terms of CPU usage or performance, gains are far more modest.  Work in that 
> area is still ongoing, as is work on multiplexing in general.  If you want to 
> help us get it all right, you can enable multiplexing like this:
>
>   gluster volume set all cluster.brick-multiplex on
>
> If multiplexing doesn't help for you, speak up and maybe we can make it 
> better, or perhaps come up with other things to try.  Good luck!
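
One way to watch for (1) and (2) above on Linux is to sample kernel counters twice and compare. The sketch below is only an illustration. It assumes the `ctxt` counter in /proc/stat and the `pgfault` / `pgmajfault` counters in /proc/vmstat. What counts as "excessive" is not defined here and has to be judged against a baseline on the same hardware.

```python
from typing import Dict, Iterable


def parse_counters(text: str) -> Dict[str, int]:
    """Collect 'name value' lines (exactly two fields, numeric value)."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def sample() -> Dict[str, int]:
    """Read the current counters from /proc/stat and /proc/vmstat (Linux only)."""
    counters: Dict[str, int] = {}
    for path in ("/proc/stat", "/proc/vmstat"):
        with open(path) as handle:
            counters.update(parse_counters(handle.read()))
    return counters


def rates(before: Dict[str, int], after: Dict[str, int], seconds: float,
          keys: Iterable[str] = ("ctxt", "pgfault", "pgmajfault")) -> Dict[str, float]:
    """Per-second rate of each counter between two samples."""
    return {key: (after[key] - before[key]) / seconds
            for key in keys if key in before and key in after}
```

Typical use would be to call `sample()`, wait a fixed interval under load, call `sample()` again and pass both results to `rates()`. Comparing a run with all 90 bricks against a run with fewer bricks gives a first hint of which resource is the bottleneck.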
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] 90 Brick/Server suggestions?

2017-02-16 Thread Alastair Neil

We have 12 on order.  Actually the DSS7000 has two nodes in the chassis,
and each accesses 45 bricks.  We will be using an erasure code scheme,
probably 24:3 or 24:4; we have not sat down and really thought about the
exact scheme we will use.

On 15 February 2017 at 14:04, Serkan Çoban  wrote:

> Hi,
>
> We are evaluating the Dell DSS7000 chassis with 90 disks.
> Has anyone used that many bricks per server?
> Any suggestions or advice?
>
> Thanks,
> Serkan

Re: [Gluster-users] File operation failure on simple distributed volume

2017-02-16 Thread yonex

Hi Rafi,

I'm still working on this issue, but I have not yet been able to reproduce
it outside of production. In the production environment, I have made the
applications stop writing data to the glusterfs volume. Only read
operations are running.

P.S. It seems that I have corrupted the email thread..;-(
http://lists.gluster.org/pipermail/gluster-users/2017-January/029679.html

2017-02-14 17:19 GMT+09:00 Mohammed Rafi K C :
> Hi Yonex,
>
> Are you still hitting this issue ?
>
>
> Regards
>
> Rafi KC
>
>
> On 01/16/2017 10:36 AM, yonex wrote:
>
> Hi
>
> I noticed that throughput degrades heavily while the gdb script is attached
> to a glusterfs client process. Write speed drops to 2% or less, so it
> cannot be kept running in production.
>
> Could you provide the custom build that you mentioned before? I am going to
> keep trying to reproduce the problem outside of the production environment.
>
> Regards
>
> 2017-01-08 21:54, Mohammed Rafi K C :
>
> Is there any update on this ?
>
>
> Regards
>
> Rafi KC
>
> On 12/24/2016 03:53 PM, yonex wrote:
>
> Rafi,
>
>
> Thanks again. I will try that and get back to you.
>
>
> Regards.
>
>
>
> 2016-12-23 18:03 GMT+09:00 Mohammed Rafi K C :
>
> Hi Yonex,
>
> As we discussed in irc #gluster-devel, I have attached the gdb script
> along with this mail.
>
> Procedure to run the gdb script:
>
> 1) Install gdb.
>
> 2) Download and install the gluster debuginfo packages for your machine.
> Package location --- > https://cbs.centos.org/koji/buildinfo?buildID=12757
>
> 3) Find the process id and attach gdb to the process using the command
> gdb attach  -x 
>
> 4) Continue running the script until you hit the problem.
>
> 5) Stop gdb.
>
> 6) You will see a file called mylog.txt in the location where you ran
> gdb.
>
> Please keep an eye on the attached process. If you have any doubts, please
> feel free to get back to me.
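
The arguments of the gdb command in step 3 did not survive in the archive. As a hedged illustration only, the Python sketch below builds an equivalent invocation using gdb's `-p` (attach to a process id) and `-x` (run commands from a file) options. The script path is a placeholder for the attached script, which is not reproduced here, and the function name is my own.

```python
from typing import List


def build_gdb_command(pid: int, script_path: str) -> List[str]:
    """Return an argv list that attaches gdb to pid and runs a command file.

    script_path is a placeholder for the gdb script attached to the mail.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive process id")
    if not script_path:
        raise ValueError("script_path must not be empty")
    return ["gdb", "-p", str(pid), "-x", script_path]
```

The resulting list can be passed to `subprocess.run()` from the directory where mylog.txt should end up. As reported later in this thread, attaching gdb to a busy client slowed writes to 2% or less, so this is better tried outside production.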
>
>
> Regards
>
>
> Rafi KC
>
>
>
> On 12/19/2016 05:33 PM, Mohammed Rafi K C wrote:
>
> On 12/19/2016 05:32 PM, Mohammed Rafi K C wrote:
>
> Client 0-glusterfs01-client-2 disconnected from the bricks around
> 2016-12-15 11:21:17.854249. Can you look at and/or paste the brick logs
> from around that time?
>
> You can find the brick name and hostname for 0-glusterfs01-client-2 in the
> client graph.
>
> Rafi
>
> Are you on any of the gluster irc channels? If so, do you have a
> nickname that I can search for?
>
>
> Regards
>
> Rafi KC
>
>
> On 12/19/2016 04:28 PM, yonex wrote:
>
> Rafi,
>
>
> OK. Thanks for your guide. I found the debug log and pasted lines around
> that.
>
> http://pastebin.com/vhHR6PQN
>
>
> Regards
>
>
>
> 2016-12-19 14:58 GMT+09:00 Mohammed Rafi K C :
>
> On 12/16/2016 09:10 PM, yonex wrote:
>
> Rafi,
>
>
> Thanks, I didn't know about the .meta feature; it is very nice. I have
> finally captured debug logs from a client and the bricks.
>
>
> A mount log:
>
> - http://pastebin.com/Tjy7wGGj
>
>
> FYI rickdom126 is my client's hostname.
>
>
> Brick logs around that time:
>
> - Brick1: http://pastebin.com/qzbVRSF3
>
> - Brick2: http://pastebin.com/j3yMNhP3
>
> - Brick3: http://pastebin.com/m81mVj6L
>
> - Brick4: http://pastebin.com/JDAbChf6
>
> - Brick5: http://pastebin.com/7saP6rsm
>
>
> However, I could not find any message like "EOF on socket". I hope
> there is some helpful information in the logs above.
>
> Indeed. I understand that the connections are in a disconnected state. But
> what I'm particularly looking for is the cause of the disconnect. Can
> you paste the debug logs from when the disconnects start, and around that
> time? You may see a debug log that says "disconnecting now".
>
>
>
> Regards
>
> Rafi KC
>
>
>
> Regards.
>
>
>
> 2016-12-14 15:20 GMT+09:00 Mohammed Rafi K C :
>
> On 12/13/2016 09:56 PM, yonex wrote:
>
> Hi Rafi,
>
>
> Thanks for your response. OK, I think it is possible to capture debug
> logs, since the error seems to be reproduced a few times per day. I
> will try that. However, since I want to avoid redundant debug output if
> possible, is there a way to enable debug logging only on specific client
> nodes?
>
> If you are using a fuse mount, there is a proc-like feature called .meta.
> You can set the log level through it for a particular client [1]. But I
> also want logs from the bricks, because I suspect the brick processes of
> initiating the disconnects.
>
>
>
> [1] eg : echo 8 > /mnt/glusterfs/.meta/logging/loglevel
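
The echo command in [1] can also be scripted so that the previous level is remembered and can be written back afterwards. This is a minimal Python sketch. The mount point is whatever the volume is mounted on, the value 8 comes from the example above, and the assumption that the file can be read back as plain text is mine.

```python
from pathlib import Path


def set_client_log_level(mount_point: str, level: int) -> str:
    """Write a new log level to <mount>/.meta/logging/loglevel.

    Returns the previous file content (stripped) so the caller can restore it.
    """
    control = Path(mount_point) / ".meta" / "logging" / "loglevel"
    previous = control.read_text().strip()
    control.write_text(f"{level}\n")
    return previous
```

For example, `old = set_client_log_level("/mnt/glusterfs", 8)` before the capture, and `set_client_log_level("/mnt/glusterfs", int(old))` once enough log has been collected, provided the file really reads back as a number.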
>
>
> Regards
>
>
> Yonex
>
>
> 2016-12-13 23:33 GMT+09:00 Mohammed Rafi K C :
>
> Hi Yonex,
>
>
> Is this consistently reproducible? If so, can you enable debug logging [1]
> and check for any message similar to [2]? You can simply search
> for "EOF on socket".
>
> You can set your log level back to the default (INFO) after capturing for
> some time.
>
>
>
> [1] : gluster volume set  diagnostics.brick-log-level DEBUG and
>
> gluster volume set  diagnostics.client-log-level
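
Once debug logs have been captured, the two phrases mentioned in this thread ("EOF on socket" and "disconnecting now") can be searched for in one pass. The Python sketch below is a plain substring scan. It does not assume anything about the log format beyond one message per line, and the names are illustrative.

```python
from typing import Iterable, List, Tuple

# Phrases taken from this thread; extend the tuple as needed.
PHRASES = ("EOF on socket", "disconnecting now")


def find_disconnect_lines(lines: Iterable[str],
                          phrases: Tuple[str, ...] = PHRASES) -> List[Tuple[int, str, str]]:
    """Return (line number, matched phrase, line text) for every matching line."""
    hits: List[Tuple[int, str, str]] = []
    for number, line in enumerate(lines, start=1):
        for phrase in phrases:
            if phrase in line:
                hits.append((number, phrase, line.rstrip("\n")))
                break
    return hits
```

Passing an open log file object works directly, for example `find_disconnect_lines(open(path))`. The line numbers make it easy to paste the surrounding lines as well.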

[Gluster-users] Failed snapshot clone leaving undeletable orphaned volume on a single peer

2017-02-16 Thread Gambit15

Hey guys,
 I tried to create a new volume from a cloned snapshot yesterday, but
something went wrong during the process, and the new volume now exists only
on the server I ran the commands on (s0), not on the rest of the peers. I'm
unable to delete this new volume from the server, because it doesn't exist
on the peers.

What do I do?
Any insights into what may have gone wrong?

CentOS 7.3.1611
Gluster 3.8.8

The command history & extract from etc-glusterfs-glusterd.vol.log are
included below.

gluster volume list
gluster snapshot list
gluster snapshot clone data-teste data-bck_GMT-2017.02.09-14.15.43
gluster volume status data-teste
gluster volume delete data-teste
gluster snapshot create teste data
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04
gluster snapshot status
gluster snapshot activate teste_GMT-2017.02.15-12.44.04
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04


[2017-02-15 12:43:21.667403] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume data-teste
[2017-02-15 12:43:21.682530] E [MSGID: 106301]
[glusterd-syncop.c:1297:gd_stage_op_phase] 0-management: Staging of
operation 'Volume Status' failed on localhost : Volume data-teste is not
started
[2017-02-15 12:43:43.633031] I [MSGID: 106495]
[glusterd-handler.c:3128:__glusterd_handle_getwd] 0-glusterd: Received
getwd req
[2017-02-15 12:43:43.640597] I [run.c:191:runner_log]
(-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcc4b2)
[0x7ffb396a14b2]
-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcbf65)
[0x7ffb396a0f65] -->/lib64/libglusterfs.so.0(runner_log+0x115)
[0x7ffb44ec31c5] ) 0-management: Ran script:
/var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post
--volname=data-teste
[2017-02-15 13:05:20.103423] E [MSGID: 106122]
[glusterd-snapshot.c:2397:glusterd_snapshot_clone_prevalidate]
0-management: Failed to pre validate
[2017-02-15 13:05:20.103464] E [MSGID: 106443]
[glusterd-snapshot.c:2413:glusterd_snapshot_clone_prevalidate]
0-management: One or more bricks are not running. Please run snapshot
status command to see brick status.
Please start the stopped brick and then issue snapshot clone command
[2017-02-15 13:05:20.103481] W [MSGID: 106443]
[glusterd-snapshot.c:8563:glusterd_snapshot_prevalidate] 0-management:
Snapshot clone pre-validation failed
[2017-02-15 13:05:20.103492] W [MSGID: 106122]
[glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot
Prevalidate Failed
[2017-02-15 13:05:20.103503] E [MSGID: 106122]
[glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Snapshot on local node
[2017-02-15 13:05:20.103514] E [MSGID: 106122]
[glusterd-mgmt.c:2243:glusterd_mgmt_v3_initiate_snap_phases] 0-management:
Pre Validation Failed
[2017-02-15 13:05:20.103531] E [MSGID: 106027]
[glusterd-snapshot.c:8118:glusterd_snapshot_clone_postvalidate]
0-management: unable to find clone data-teste volinfo
[2017-02-15 13:05:20.103542] W [MSGID: 106444]
[glusterd-snapshot.c:9063:glusterd_snapshot_postvalidate] 0-management:
Snapshot create post-validation failed
[2017-02-15 13:05:20.103561] W [MSGID: 106121]
[glusterd-mgmt.c:351:gd_mgmt_v3_post_validate_fn] 0-management:
postvalidate operation failed
[2017-02-15 13:05:20.103572] E [MSGID: 106121]
[glusterd-mgmt.c:1660:glusterd_mgmt_v3_post_validate] 0-management: Post
Validation failed for operation Snapshot on local node
[2017-02-15 13:05:20.103582] E [MSGID: 106122]
[glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management:
Post Validation Failed
[2017-02-15 13:11:15.862858] W [MSGID: 106057]
[glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management:
Snap volume
c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick1-data-brick
not found [Argumento inválido]
[2017-02-15 13:11:16.314759] I [MSGID: 106143]
[glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick on
port 49452
[2017-02-15 13:11:16.316090] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-02-15 13:11:16.348867] W [MSGID: 106057]
[glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management:
Snap volume
c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick6-data-arbiter
not found [Argumento inválido]
[2017-02-15 13:11:16.558878] I [MSGID: 106143]
[glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter on
port 49453
[2017-02-15 13:11:16.559883] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-02-15 13:11:23.279721] E [MSGID: 106030]
[glusterd-snapshot.c:4736:glusterd_take_lvm_snapshot] 0-management: taking
snapshot of the brick
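
The log extract above is hard-wrapped by the mail client, so each glusterd entry spans several lines. Here is a minimal Python sketch that joins the wrapped lines back into entries and keeps only the error-level ones. The entry layout (timestamp, one-letter level, optional MSGID, source location, message) is inferred purely from the lines in this mail, and all names are my own.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Optional

# A new entry starts with a bracketed date such as "[2017-02-15 ".
ENTRY_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} ")
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<loc>[^\]]+)\] (?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    msgid: Optional[str]
    location: str
    message: str


def join_wrapped(lines: Iterable[str]) -> Iterator[str]:
    """Merge continuation lines into the entry that precedes them."""
    current: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) and current:
            yield " ".join(current)
            current = []
        current.append(line)
    if current:
        yield " ".join(current)


def parse_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield parsed entries; text that does not fit the layout is skipped."""
    for text in join_wrapped(lines):
        match = ENTRY.match(text)
        if match:
            yield LogEntry(match["ts"], match["level"], match["msgid"],
                           match["loc"], match["msg"])


def errors_only(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only entries logged at level E."""
    return [entry for entry in parse_entries(lines) if entry.level == "E"]
```

Run over the extract above, this would surface the "One or more bricks are not running" error at 13:05:20, which precedes the `gluster snapshot activate` step in the command history.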

Re: [Gluster-users] Optimal shard size & self-heal algorithm for VM hosting?

2017-02-16 Thread Krutika Dhananjay

On Wed, Feb 15, 2017 at 9:38 PM, Gambit15  wrote:

> Hey guys,
>  I keep seeing different recommendations for the best shard sizes for VM
> images, from 64MB to 512MB.
>
> What's the benefit of smaller v larger shards?
> I'm guessing smaller shards are quicker to heal, but larger shards will
> provide better sequential I/O for single clients? Anything else?
>

That's the main difference. And also smaller shards provide better brick
utilization and distribution of IO in distributed-replicated volumes as
opposed to larger shards.
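
As a rough illustration of the distribution point, the Python sketch below counts how many shards a single image is split into at the two sizes mentioned in the question. The 100 GiB image size is a made-up example, not a figure from this thread.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def shard_count(image_bytes: int, shard_bytes: int) -> int:
    """Number of shards needed to hold image_bytes (rounded up)."""
    return (image_bytes + shard_bytes - 1) // shard_bytes


# Hypothetical 100 GiB VM image at the two shard sizes from the question.
for shard_mib in (64, 512):
    count = shard_count(100 * GIB, shard_mib * MIB)
    print(f"{shard_mib}MB shards: {count} pieces")
```

More pieces means more units that can land on different replica sets, and a smaller amount of data to copy when one piece needs healing.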

>
> I also usually see "cluster.data-self-heal-algorithm: full" is generally
> recommended in these cases. Why not "diff"? Is it simply to reduce CPU load
> when there's plenty of excess network capacity?
>

That's correct. diff heal requires rolling checksum to be computed for
every 128KB chunk of the file on both source and sink bricks, which is CPU
intensive, potentially affecting IO traffic.
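
To give a feel for that CPU cost, this small Python sketch counts the 128KB chunks that would need a checksum for one file, doubled because both source and sink compute them. It assumes each shard is healed as a separate file. Only the 128KB figure comes from the explanation above.

```python
CHUNK_BYTES = 128 * 1024  # chunk size quoted above
MIB = 1024 ** 2


def checksum_chunks(file_bytes: int) -> int:
    """Number of 128KB chunks in a file of file_bytes (rounded up)."""
    return (file_bytes + CHUNK_BYTES - 1) // CHUNK_BYTES


def checksums_for_diff_heal(file_bytes: int) -> int:
    """Checksums computed across source and sink bricks for one file."""
    return 2 * checksum_chunks(file_bytes)


for shard_mib in (64, 512):
    total = checksums_for_diff_heal(shard_mib * MIB)
    print(f"{shard_mib}MB shard: {total} checksums")
```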

-Krutika

>
> Thanks in advance,
> Doug
>

Re: [Gluster-users] Error while setting cluster.granular-entry-heal

2017-02-16 Thread Krutika Dhananjay

Could you please attach the "glfsheal-.log" logfile?

-Krutika

On Thu, Feb 16, 2017 at 12:05 AM, Andrea Fogazzi  wrote:

> Hello,
>
> I have a gluster 3.8.8 cluster with multiple volumes, each
> distributed/replicated across 5 servers (2 replicas + 1 quorum); each volume
> is accessed both through gluster clients (RW, clients have 3.8.8 or 3.8.5)
> and through Ganesha FS (RO).
>
>
> On one of the  volumes we have:
>
> - cluster.data-self-heal-algorithm: full
> - cluster.locking-scheme: granular
>
> When we try to set
>
> - cluster.granular-entry-heal on
>
> (command I use is "gluster volume set vol-cor-homes
> cluster.granular-entry-heal on")
>
> I receive
>
> *volume set: failed:  'gluster volume set 
> cluster.granular-entry-heal {enable, disable}' is not supported. Use
> 'gluster volume heal  granular-entry-heal {enable, disable}'
> instead.*
>
>
> The answer is not clear to me. I also tried the command suggested in the
> response, but it does not work (I get "Enable granular entry heal on
> volume vol-cor-homes has been unsuccessful on bricks that are down. Please
> check if all brick processes are running.", although I am sure all bricks
> are online).
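
For reference, the form suggested by the error message can be assembled as follows. This Python sketch only builds the argument list. The position of the volume name is inferred from context, because the placeholder is missing from the archived message, and the function name is my own.

```python
from typing import List


def granular_entry_heal_command(volume: str, enable: bool = True) -> List[str]:
    """Build the 'gluster volume heal ... granular-entry-heal' argv list."""
    if not volume or volume.startswith("-"):
        raise ValueError("volume must be a non-empty volume name")
    action = "enable" if enable else "disable"
    return ["gluster", "volume", "heal", volume, "granular-entry-heal", action]
```

For the volume in this report that gives `gluster volume heal vol-cor-homes granular-entry-heal enable`, which is the command that returned the "bricks that are down" message quoted above.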
>
>
> Do you have any suggestion on what I am doing wrong, or how to debug the
> issue?
>
>
>
> Thanks in advance.
>
>
> Best regards
>
> andrea
>
>
>
>
>
> --
> Andrea Fogazzi
> fo...@fogazzi.com
>
