Re: [Gluster-devel] [ovirt-users] oVirt Survey 2019 results

2019-04-02 Thread Sahina Bose
On Tue, Apr 2, 2019 at 12:07 PM Sandro Bonazzola 
wrote:

> Thanks to the 143 participants in oVirt Survey 2019!
> The survey is now closed and the results are publicly available at
> https://bit.ly/2JYlI7U
> We'll analyze the collected data in order to improve oVirt, thanks to your
> feedback.
>
> As a first step after reading the results I'd like to invite the 30
> people who replied that they're willing to contribute code to send an email
> to de...@ovirt.org introducing themselves: we'll be more than happy to
> welcome them and help them get started.
>
> I would also like to invite the 17 people who replied that they'd like to
> help organize oVirt events in their area to either get in touch with me or
> introduce themselves to us...@ovirt.org so we can discuss event
> organization.
>
> Last but not least I'd like to invite the 38 people willing to contribute
> documentation and the one willing to contribute localization to introduce
> themselves to de...@ovirt.org.
>

Thank you all for the feedback.
I was looking at the feedback specific to Gluster. While it's disheartening
to see "Gluster weakest link in oVirt", I can understand where the feedback
and frustration are coming from.

Over the past month and in this survey, the common themes that have come up
are:

- Ensure smoother upgrades for hyperconverged deployments with GlusterFS.
The oVirt 4.3 release, with its upgrade to gluster 5.3, caused disruption
for many users, and we want to ensure this does not happen again. To this
end, we are working on adding upgrade tests to OST-based CI. Contributions
are welcome.
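For illustration, a minimal sketch of the kind of pre-upgrade sanity check
such tests could automate, run on a hyperconverged node with the standard
gluster CLI (the volume name "data" is an example):

  # Confirm all bricks and self-heal daemons of the volume are up
  gluster volume status data

  # Make sure there are no pending heals before brick processes are restarted
  gluster volume heal data info

  # Compare the cluster op-version with the maximum the installed bits support
  gluster volume get all cluster.op-version
  gluster volume get all cluster.max-op-version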

- Improve performance on the gluster storage domain. While we have seen
promising results with the gluster 6 release, this is an ongoing effort.
Please help by sharing the specific workloads and use cases that you run,
gathering data and running tests.
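One lightweight way to gather such data is gluster's built-in profiling; a
sketch, with "data" as an example volume name:

  # Enable I/O profiling on the volume backing the storage domain
  gluster volume profile data start

  # ...run a representative VM workload for a while...

  # Dump per-FOP latency and block-size statistics
  gluster volume profile data info

  # Turn profiling off again, as it adds a small overhead
  gluster volume profile data stop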

- Deployment issues. We have worked to improve the deployment flow in 4.3
by adding pre-checks and moving to gluster-ansible role based deployment.
We would love to hear the specific issues you're facing around this -
please raise bugs if you haven't already
(https://bugzilla.redhat.com/enter_bug.cgi?product=cockpit-ovirt)



> Thanks!
> --
>
> SANDRO BONAZZOLA
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA 
>
> sbona...@redhat.com
> 
> ___
> Users mailing list -- us...@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/us...@ovirt.org/message/4N5DYCXY2S6ZAUI7BWD4DEKZ6JL6MSGN/
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] gluster-gnfs missing in CentOS repos

2018-09-27 Thread Sahina Bose
On Mon, Sep 24, 2018 at 3:04 PM Sahina Bose  wrote:

> Hi all,
>
> gluster-gnfs rpms are missing in 4.0/4.1 repos in CentOS storage. Is this
> intended?
>

Rephrasing my question - are there plans to push gluster-gnfs rpms to the
CentOS repos as well?


> thanks
> sahina
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] gluster-gnfs missing in CentOS repos

2018-09-24 Thread Sahina Bose
Hi all,

gluster-gnfs rpms are missing in 4.0/4.1 repos in CentOS storage. Is this
intended?

thanks
sahina
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [ovirt-users] Next Gluster Updates?

2018-08-28 Thread Sahina Bose
On Mon, Aug 27, 2018 at 5:51 PM, Robert O'Kane  wrote:

> I had a bug request in Bugzilla for Gluster being killed due to a memory
> leak. The Gluster People say it is fixed in gluster-3.12.13
>
> When will oVirt have this update? I am getting tired of having to restart
> my hypervisors every week or so...
>
> I currently have ovirt-release42-4.2.5.1-1.el7.noarch and yum
> check-update shows me no new gluster versions (still 3.12.11).
>

oVirt will pick it up as soon as the gluster release is pushed to the CentOS
storage repo -
http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/

Niels, Shyam - any ETA for gluster-3.12.13 in CentOS?
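In the meantime, a quick way to check whether the build has landed in the
Storage SIG repo (a sketch; the repo id "centos-gluster312" is what the
centos-release-gluster312 package sets up and may differ on your system):

  # List the glusterfs packages currently offered by the Storage SIG repo
  yum --disablerepo='*' --enablerepo=centos-gluster312 list available 'glusterfs*'

  # Compare with what is installed on the hypervisor
  rpm -q glusterfs-server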


> Cheers,
>
> Robert O'Kane
>
> --
> Robert O'Kane
> Systems Administrator
> Kunsthochschule für Medien Köln
> Peter-Welter-Platz 2
> 50676 Köln
>
> fon: +49(221)20189-223
> fax: +49(221)20189-49223
> ___
> Users mailing list -- us...@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/communit
> y/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archiv
> es/list/us...@ovirt.org/message/L7ZTIQA3TAM7IR4LCTWMXXCSGCLWUJJN/
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Monitoring and acting on LVM thin-pool consumption

2018-04-10 Thread Sahina Bose
On Tue, Apr 10, 2018 at 3:08 PM, Niels de Vos  wrote:

> Recently I have been implementing "volume clone" support in Heketi. This
> uses the snapshot+clone functionality from Gluster. In order to create
> snapshots and clone them, it is required to use LVM thin-pools on the
> bricks. This is where my current problem originates.
>
> When there are cloned volumes, the bricks of these volumes use the same
> thin-pool as the original bricks. This makes sense, and allows cloning
> to be really fast! There is no need to copy data from one brick to a new
> one, the thin-pool provides copy-on-write semantics.
>
> Unfortunately it can be rather difficult to estimate how large the
> thin-pool should be when the initial Gluster Volume is created.
> Over-allocation is likely needed, but by how much? It may not be clear
> how many clones will be made, nor what % of data will change
> on each of the clones.
>
> A wrong estimate can easily cause the thin-pool to become full. When
> that happens, the filesystem on the bricks will go readonly. Mounting
> the filesystem read-writable may not be possible at all. I've even seen
> /dev entries for the LV getting removed. This makes for a horrible
> Gluster experience, and it can be tricky to recover from it.
>
> In order to make thin-provisioning more stable in Gluster, I would like
> to see integrated monitoring of (thin) LVs and some form of acting on
> crucial events. One idea would be to make the Gluster Volume read-only
> when it detects that a brick is almost out-of-space. This is close to
> what local filesystems do when their block-device is having issues.
>
> The 'dmeventd' process already monitors LVM, and by default writes to
> 'dmesg'. Checking dmesg for warnings is not really a nice solution, so
> maybe we should write a plugin for dmeventd. Possibly something already
> exists that we can use, or take inspiration from.
>
> Please provide ideas, thoughts and any other comments. Thanks!
>

For the oVirt-Gluster integration, where gluster volumes are managed and
consumed as a VM image store by oVirt, a feature was added to monitor and
report the guaranteed capacity of bricks, as opposed to the reported size,
when they are created on thin-provisioned LVs/VDO devices. The feature page
provides some details -
https://ovirt.org/develop/release-management/features/gluster/gluster-multiple-bricks-per-storage/.
Also adding Denis, the feature owner.
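On the plain-LVM side, a sketch of what such monitoring can already look at
with standard lvm2 tooling (the autoextend values shown are examples, not
recommendations):

  # Report how full each thin pool is (data and metadata), across all VGs
  lvs -a -o vg_name,lv_name,lv_attr,data_percent,metadata_percent

  # lvm2/dmeventd can also auto-extend thin pools when they cross a
  # threshold; these settings live in the activation section of
  # /etc/lvm/lvm.conf:
  #   thin_pool_autoextend_threshold = 70
  #   thin_pool_autoextend_percent = 20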


> Niels
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] BoF - Gluster for VM store use case

2017-10-31 Thread Sahina Bose
During Gluster Summit, we discussed gluster volumes as storage for VM
images - feedback on the use case and upcoming features that may benefit
it.

Some of the points discussed:

* Need to ensure there are no issues when expanding a gluster volume when
sharding is turned on.
* A throttling feature for the self-heal and rebalance processes could be
useful for this use case.
* Erasure-coded volumes with sharding - seen as a good fit for VM disk
storage.
* Performance related:
  ** accessing qemu images using the gfapi driver does not perform as well
as fuse access. Need to understand why.
  ** using ZFS with cache, or lvmcache for XFS filesystems, is seen to
improve performance.
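For reference, a minimal sketch of the volume options commonly applied for
the VM store use case ("vmstore" is an example volume name):

  # Apply the predefined "virt" option group tuned for VM image workloads
  gluster volume set vmstore group virt

  # Enable sharding so large images are split into smaller pieces,
  # keeping self-heal granular
  gluster volume set vmstore features.shard on
  gluster volume set vmstore features.shard-block-size 64MB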

If you have any further inputs on this topic, please add to the thread.

thanks!
sahina
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] drop-in management/control panel

2017-04-26 Thread Sahina Bose
On Wed, Apr 26, 2017 at 9:38 PM, Nux! <n...@li.nux.ro> wrote:

> Thanks Sahina,
>
> Do you know which bit I need? Is it part of ovirt-engine, is that what I
> need to install? (looks heavy, 1GB install)
>

Yes, the ovirt-engine. It's a Java web-based app running on JBoss - yes, a
bit heavy.
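If it helps, a rough sketch of getting it up in gluster-only mode (the
release RPM URL is for the 4.1 series and is given as an example;
engine-setup asks for the application mode interactively):

  # Install the oVirt release package and the engine
  yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release41.rpm
  yum install ovirt-engine

  # Run setup; when prompted for the application mode, choose "Gluster"
  # to skip the virtualization pieces
  engine-setup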


>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>
> ----- Original Message -----
> > From: "Sahina Bose" <sab...@redhat.com>
> > To: "Gluster Devel" <gluster-devel@gluster.org>, "Nux!" <n...@li.nux.ro>
> > Sent: Wednesday, 26 April, 2017 16:45:52
> > Subject: Re: [Gluster-devel] drop-in management/control panel
>
> > oVirt (oVirt.org) has a web interface to help manage your gluster
> > setup. It can be installed in a gluster-only mode if you don't want the
> > virtualization features.
> >
> >
> > On Tue, 25 Apr 2017 at 11:03 PM, Nux! <n...@li.nux.ro> wrote:
> >
> >> Hi,
> >>
> >> Anyone knows of any solutions I can just drop in my current gluster
> setup
> >> and help me with administrative tasks (create, delete, quota, acl etc)
> from
> >> a web ui?
> >>
> >> Thanks,
> >> Lucian
> >>
> >> --
> >> Sent from the Delta quadrant using Borg technology!
> >>
> >> Nux!
> >> www.nux.ro
> >> ___
> >> Gluster-devel mailing list
> >> Gluster-devel@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-devel
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] drop-in management/control panel

2017-04-26 Thread Sahina Bose
oVirt (oVirt.org) has a web interface to help manage your gluster setup. It
can be installed in a gluster-only mode if you don't want the
virtualization features.


On Tue, 25 Apr 2017 at 11:03 PM, Nux!  wrote:

> Hi,
>
> Does anyone know of any solutions I can just drop into my current gluster
> setup to help me with administrative tasks (create, delete, quota, acl etc)
> from a web UI?
>
> Thanks,
> Lucian
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [GEDI] Release 3.10 feature proposal:: Statedump for libgfapi

2017-01-10 Thread Sahina Bose
On Tue, Jan 10, 2017 at 11:17 AM, Poornima Gurusiddaiah  wrote:

>
> - Original Message -
> > From: "Niels de Vos" 
> > To: "Shyam" 
> > Cc: "Rajesh Joseph" , "Gluster Devel" <
> gluster-devel@gluster.org>, integrat...@gluster.org,
> > "Poornima G" 
> > Sent: Monday, January 9, 2017 5:05:14 PM
> > Subject: Re: [Gluster-devel] Release 3.10 feature proposal:: Statedump
> for libgfapi
> >
> > On Mon, Jan 09, 2017 at 10:27:03AM +0530, Shyam wrote:
> > > On 01/05/2017 07:10 PM, Niels de Vos wrote:
> > ...
> > > > Because we would really like this in 3.10 to allow applications to
> > > > integrate better with Gluster, I propose to split the functionality
> over
> > > > several changes:
> > > >
> > > > 1. ground work and API exposed for applications (and testing)
> > >
> > > Poornima is working on this as a part of the patch posted at [0].
> Poornima
> > > do you want to add more details here?
> >
> > Yes, I'm waiting for a reply from Poornima as well. I'd like a discussion
> > about an extendible interface that is not limited to doing statedumps. I
> > do have patches for this based on her work and I want to share those in
> > the discussion.
> >
> > > > 2. enablement through a simple interface, similar to
> /proc/sysrq-trigger
> > > > 3. enablement through gluster-cli command
> > >
> > > The initial implementation of triggering a statedump via the CLI
> already
> > > exists as a part of the patch [0].
> >
> > Yes, and I am aware of that. But I also like patches to be modular and
> > have split for each single functionality. That makes it easier for
> > testing and reviewing. The current approach is a large chunk that I
> > would like to see split. Again, waiting for Poornima to join the
> > discussion.
> >
> > > > These options should be explained in the feature page, with the plan
> to
> > > > provide the three options for 3.10. I'm happy to massage the patch
> from
> > > > Poornima [0] and split it in 1 and 3. Additional improvements for 3
> > > > might be needed, and we'll have to see who does that work. Point 2 is
> > > > something I'll take on as well.
> > > >
>
> Of the methods mentioned, 1 and 3 are there as part of the single patch.
> As you mentioned, the API is not extendable and glusterd requires some
> improvements, and the patch also needs to be split. Since you already have
> the patches ready, please go ahead; I can abandon this patch. It would be
> very useful if we can get either approach 2 or 3 into 3.10.
>

FWIW, 3 is an acceptable solution when gluster is running hyperconverged
for the VM use case (for instance, with oVirt).
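For context, these are the triggers that already exist for non-gfapi
processes, which option 3 would extend to libgfapi consumers via the CLI
(a sketch; "gvol" is an example volume name):

  # Trigger a statedump of the brick processes of a volume via the CLI
  gluster volume statedump gvol

  # Client-side (fuse mount) processes dump state on SIGUSR1; the dumps
  # land under /var/run/gluster by default
  kill -USR1 $(pidof glusterfs)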


> Thanks,
> Poornima
>
> > > > What do others think about this?
> > >
> > > My question thus is, where are we drawing a line for this in 3.10
> > > considering we have about a *week* left for *branching*?
> > >   - Is 1, and 3 enough as it exists (i.e with the intention of
> exposing the
> > > API as in 1 additionally)?
> >
> > The API does not exist (well, it was added this morning). But the API
> > needs discussion because it is not extendible. This discussion needs to
> > be had, and with the new feature page we can actually do that somewhere.
> >
> > >   - Is 2 mandatory or can come in later (i.e 3.11)?
> >
> > It can come later, but the feature would be less useful if this does not
> > exist. Statedumps are helpful to diagnose network/communication
> > problems; relying on the network to trigger them is probably not helpful
> > in many situations.
> >
> > >   - Is additions to 3 (i.e improvements to the gluster cli) mandatory
> or
> > >   can
> > > come in later (i.e 3.11)?
> >
> > I see 1 as mandatory. The other interfaces would be welcome, but need
> > discussion and approval from different component maintainers and the
> > target users.
> >
> > HTH,
> > Niels
> >
> >
> > >
> > > >
> > > > Thanks,
> > > > Niels
> > > >
> > > > [0] http://review.gluster.org/9228
> > > [1] http://review.gluster.org/16357
> >
> ___
> integration mailing list
> integrat...@gluster.org
> http://lists.gluster.org/mailman/listinfo/integration
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [Review] oVirt and Gluster - Integrated solution for Disaster Recovery

2016-09-28 Thread Sahina Bose
[Forwarding to a wider audience]

The feature page outlining the proposed solution is at
http://www.ovirt.org/develop/release-management/features/gluster/gluster-dr/
Please review and provide feedback.

thanks,
sahina

-- Forwarded message --
From: Sahina Bose <sab...@redhat.com>
Date: Wed, Sep 14, 2016 at 5:51 PM
Subject: Integrating oVirt and Gluster geo-replication to provide a DR
solution
To: devel <de...@ovirt.org>


Hi all,

Though there are many solutions that integrate with oVirt to provide
disaster recovery for guest images, these solutions either rely on backup
agents running on the guests or on third-party software, and are
complicated to set up.

Since oVirt already integrates with glusterfs, we can leverage gluster's
geo-replication feature to periodically mirror contents to a
remote/secondary site for disaster recovery, without the need for
additional software.
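For reference, the gluster side of such a setup boils down to a
geo-replication session between the primary and secondary volumes; a
minimal sketch (host and volume names are examples):

  # Create the common pem keys on the primary cluster and set up the session
  # ("mastervol" on the primary; "slavevol" on secondary host "remote1")
  gluster system:: execute gsec_create
  gluster volume geo-replication mastervol remote1::slavevol create push-pem

  # Start replication and check progress
  gluster volume geo-replication mastervol remote1::slavevol start
  gluster volume geo-replication mastervol remote1::slavevol status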

Please review the PR[1] for the feature page outlining the solution and
integration in oVirt.
Comments and feedback welcome.

[1] https://github.com/oVirt/ovirt-site/pull/453

thanks,
sahina
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] presentation slides used at devconf

2016-02-11 Thread Sahina Bose



On 02/11/2016 01:20 AM, Paul Cuzner wrote:

Great presentation Prasanna.

There are a couple of things that are difficult to make out solely
from the slides - could you shed some light on the following:

- libgfapi vs fuse chart
  - what was the I/O profile you were measuring?
  - for the georep results, is this saying that with the i/o load active
and georep active, this is the io latency as observed at the vm?
  - for georep, what was the change delta?


The I/O profile used was the 80:20 random read:write fio workload. There
were a total of 6 VMs running, 2 on each host, on a 3-node setup
(replica 3 gluster volume).

For the geo-rep results, yes - when geo-replication was active, this is
the IO latency observed at the VM. The change delta is the ongoing data
created by the fio jobs (10g per VM).
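For anyone who wants to reproduce this, a rough sketch of an fio job
matching that profile (only the 80:20 mix and the 10g size come from the
tests above; block size, iodepth, runtime and the file path are
assumptions):

  fio --name=vmstore-8020 --rw=randrw --rwmixread=80 \
      --bs=4k --ioengine=libaio --direct=1 --iodepth=16 \
      --size=10g --runtime=600 --time_based \
      --filename=/path/inside/vm/fio.test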





thanks,

PC


On Wed, Feb 10, 2016 at 9:42 AM, Prasanna Kumar Kalever wrote:


On Wednesday, February 10, 2016 5:07:27 PM, Amye Scavarda wrote:

On Wed, Feb 10, 2016 at 12:28 PM, Samikshan Bairagya wrote:



On 02/10/2016 04:40 PM, Michael Scherer wrote:

On Wednesday, 10 February 2016 at 12:11 +0530, Atin
Mukherjee wrote:

It'd be better if you can send a PR to glusterdocs
with the odp.


*grmbl* top post *grmlb*

I am not sure it's a good idea to add all kinds of binary files to
the git repo. Since git will store them in the repo on clone, that
might make it grow bigger with time, thus making it more and more
annoying for people cloning the repo from scratch with a bad internet
connection.

I think it's a better option to have a document instead that either
contains a link to a page with all the presentations, or a list of
URLs for the respective presentations.


Regards,

Samikshan

Great!
That can be something we can add to gluster.org through a PR to the
gluster.org github repo, but that has nothing to do with this current
round of presentations being discussed.

Can we go back to the task at hand of helping to publicize these
beyond the mailing list? I'd love to be able to get these out on the
planet and the gluster.org blog.
Prasanna, would you be willing to do a post about this?
 -amye

Amye, I am happy to make a post on this talk.
But please let me know which would be better: a video, or just a
textual post (blog) explaining the presentation.

Thanks,
-Prasanna

___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel




-- 
Amye Scavarda | a...@redhat.com  |

Gluster Community Lead



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-infra] Add gluster-nagios* to github

2016-02-04 Thread Sahina Bose



On 02/04/2016 07:36 PM, Niels de Vos wrote:

On Thu, Feb 04, 2016 at 07:17:25PM +0530, Kaushal M wrote:

Hi Sahina,

The gluster-nagios project currently lives in 3 repositories on
review.gluster.org. Finding these projects on gerrit is not easy.

To be more visible, it would be better if it were added to github under
the gluster organization. We can help set gerrit up to replicate to
the github repo. This will help a lot with the visibility of the project.



Will add to github, and would appreciate help to replicate.
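For reference, a rough sketch of what the gerrit side of that replication
could look like with the standard replication plugin (the file path and the
project-name mapping below are assumptions for illustration):

  # etc/replication.config in the gerrit site directory;
  # "${name}" is expanded by the plugin to the project name
  [remote "github"]
      url = git@github.com:gluster/${name}.git
      push = +refs/heads/*:refs/heads/*
      push = +refs/tags/*:refs/tags/*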


We also would like the plugins packaged in Fedora, EPEL and other
distributions. Who would be willing to get that done?



Ramesh should be able to get this going; we will reach out to this group
for help, if required.




Is there a plan to get the plugins to become part of the upstream Nagios
repository? If that is not acceptable, is there a central location or
registry where Nagios users check for available plugins?


Thanks for the prod. The plugins can be added to Nagios exchange 
(https://exchange.nagios.org/directory/Plugins/) once the github repo is 
ready.



Many thanks,
Niels


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-03 Thread Sahina Bose



On 09/03/2015 12:13 PM, Krutika Dhananjay wrote:





*From: *"Shyam" 
*To: *"Krutika Dhananjay" 
*Cc: *"Aravinda" , "Gluster Devel"

*Sent: *Wednesday, September 2, 2015 11:13:55 PM
*Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication

On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
>
>
>

>
> *From: *"Shyam" 
> *To: *"Aravinda" , "Gluster Devel"
> 
> *Sent: *Wednesday, September 2, 2015 8:09:55 PM
> *Subject: *Re: [Gluster-devel] Gluster Sharding and
Geo-replication
>
> On 09/02/2015 03:12 AM, Aravinda wrote:
>  > Geo-replication and Sharding Team today discussed about
the approach
>  > to make Sharding aware Geo-replication. Details are as below
>  >
>  > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja,
Vijay Bellur
>  >
>  > - Both Master and Slave Volumes should be Sharded Volumes
with same
>  >configurations.
>
> If I am not mistaken, geo-rep supports replicating to a
non-gluster
> local FS at the slave end. Is this correct? If so, would this
> limitation
> not make that problematic?
>
> When you state *same configuration*, I assume you mean the
sharding
> configuration, not the volume graph, right?
>
> That is correct. The only requirement is for the slave to have shard
> translator (for, someone needs to present aggregated view of the
file to
> the READers on the slave).
> Also the shard-block-size needs to be kept same between master and
> slave. Rest of the configuration (like the number of subvols of
DHT/AFR)
> can vary across master and slave.

Do we need to have the sharded block size the same? As I assume
the file
carries an xattr that contains the size it is sharded with
(trusted.glusterfs.shard.block-size), so if this is synced across, it
would do. If this is true, what it would mean is that "a sharded
volume
needs a shard-supported slave to geo-rep to".

Yep. Even I feel it should probably not be necessary to enforce 
same-shard-size-everywhere as long as shard translator on the slave 
takes care not to further "shard" the individual shards gsyncD would 
write to, on the slave volume.
This is especially true if different files/images/vdisks on the master 
volume are associated with different block sizes.
This logic has to be built into the shard translator based on 
parameters (client-pid, parent directory of the file being written to).
What this means is that shard-block-size attribute on the slave would 
essentially be a don't-care parameter. I need to give all this some 
more thought though.



I think this may help with coping with changes to the shard block size
configuration on the master. Otherwise, once the user changes the shard
block size on the master, the slave will be affected.
Are there any other shard volume options that, if changed on the master,
would affect the slave? How do we ensure master and slave are in sync
w.r.t. the shard configuration?
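As a point of reference, the pieces that would need to be compared can be
read with standard tooling; a sketch (volume names and the brick path are
examples, and the xattr is the one mentioned above):

  # Compare the shard settings on the master and slave volumes
  gluster volume get mastervol features.shard
  gluster volume get mastervol features.shard-block-size

  # Per file, the block size it was sharded with is recorded in an xattr
  # on the brick
  getfattr -n trusted.glusterfs.shard.block-size -e hex /gfs/b1/vm1.img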




-Krutika

>
> -Krutika
>
>
>
>  > - In Changelog record changes related to Sharded files
also. Just
> like
>  >any regular files.
>  > - Sharding should allow Geo-rep to list/read/write
Sharding internal
>  >Xattrs if Client PID is gsyncd(-1)
>  > - Sharding should allow read/write of Sharded files(that
is in
> .shards
>  >directory) if Client PID is GSYNCD
>  > - Sharding should return actual file instead of returning the
>  >aggregated content when the Main file is
requested(Client PID
>  >GSYNCD)
>  >
>  > For example, a file f1 is created with GFID G1.
>  >
>  > When the file grows it gets sharded into chunks(say 5
chunks).
>  >
>  >  f1   G1
>  >  .shards/G1.1   G2
>  >  .shards/G1.2   G3
>  >  .shards/G1.3   G4
>  >  .shards/G1.4   G5
>  >
>  > In Changelog, this is recorded as 5 different files as below
>  >
>  >  CREATE G1 f1
>  >  DATA G1
>  >  META G1
>  >  CREATE G2 PGS/G1.1
>  >  DATA G2
>  >  META G1
>  >  CREATE G3 PGS/G1.2
>  >  DATA G3
>  >  META G1
>  >  CREATE G4 PGS/G1.3
>  >  DATA G4
>  >  META G1
>  >  CREATE G5 PGS/G1.4
>  >  DATA G5
>  >  META G1
>  >
   

Re: [Gluster-devel] Improving Geo-replication Status and Checkpoints

2015-04-01 Thread Sahina Bose
/Delete/MKDIR/RENAME etc
DATA - Data operations
METADATA - SETATTR, SETXATTR etc

Let me know your suggestions.

--
regards
Aravinda


On 02/02/2015 04:51 PM, Aravinda wrote:

Thanks Sahina, replied inline.

--
regards
Aravinda

On 02/02/2015 12:55 PM, Sahina Bose wrote:


On 01/28/2015 04:07 PM, Aravinda wrote:

Background
--
We have `status` and `status detail` commands for GlusterFS
geo-replication. This mail is to fix the existing issues in these
command outputs. Let us know if we need any other columns which
would help users get a meaningful status.


Existing output
---
Status command output
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE - Slave host and Volume name(HOST::VOL format)
STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
CHECKPOINT STATUS - Details about Checkpoint completion
CRAWL STATUS - Hybrid/History/Changelog

Status detail -
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE - Slave host and Volume name(HOST::VOL format)
STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
CHECKPOINT STATUS - Details about Checkpoint completion
CRAWL STATUS - Hybrid/History/Changelog
FILES SYNCD - Number of Files Synced
FILES PENDING - Number of Files Pending
BYTES PENDING - Bytes pending
DELETES PENDING - Number of Deletes Pending
FILES SKIPPED - Number of Files skipped


Issues with existing status and status detail:
--

1. Active/Passive and Stable/Faulty status are mixed up - the same
column is used to show both the active/passive status and the
Stable/Faulty status. If an active node goes faulty, then by looking
at the status it is difficult to tell whether the active node or the
passive one is faulty.
2. No info about the last synced time; unless we set a checkpoint it is
difficult to know up to what time data has been synced to the slave.
For example, if an admin wants to know whether all the files created
15 mins ago have been synced, it is not possible without setting a
checkpoint.

3. Wrong values in metrics.
4. When multiple bricks are present on the same node, status shows Faulty
when any one of the workers on that node is faulty.


Changes:

1. Active nodes will be prefixed with * to identify them as active
nodes. (In XML output an active tag will be introduced with values 0 or 1.)
2. A new column will show the last synced time, which minimizes the need
for the checkpoint feature. Checkpoint status will be shown only in
status detail.
3. Checkpoint Status is removed; a separate checkpoint command will
be added to the gluster cli. (We can introduce a multiple-checkpoint
feature with this change.)
4. Status values will be Not
Started/Initializing/Started/Faulty/Stopped. Stable is changed to
Started.
5. A Slave User column will be introduced to show the user as which the
geo-rep session is established. (Useful in non-root geo-rep.)
6. The Bytes Pending column will be removed. It is not possible to
identify the delta without simulating a sync. For example, we are
using rsync to sync data from master to slave; if we need to know
how much data is to be transferred, then we have to run the rsync
command with the --dry-run flag before running the actual command. With
tar-ssh we have to stat all the files identified to be
synced to calculate the total bytes to be synced. Both are costly
operations which degrade geo-rep performance. (In future we
can include these columns.)
7. Files Pending, Synced and Deletes Pending are only session
information of the worker; these numbers will not match the
number of files present in the filesystem. If a worker restarts, the
counters reset to zero. When a worker restarts, it logs the previous
session stats before resetting them.
8. Files Skipped is a persistent status across sessions and shows the
exact count of files skipped. (The list of skipped GFIDs can be obtained
from the log file.)

9. Can the Deletes Pending column be removed?


Is there any way to know if there are errors syncing any of the
files? Which column would that reflect in?

The Skipped column shows the number of files that failed to sync to the
slave.

Is the last synced time the least of the synced times across the
nodes?
Status output will have one entry for each brick, so we are planning
to display the last synced time for that brick.





Example output

MASTER NODE  MASTER VOL  MASTER BRICK  SLAVE USER  SLAVE           STATUS   LAST SYNCED          CRAWL
-------------------------------------------------------------------------------------------------------
* fedoravm1  gvm         /gfs/b1       root        fedoravm3::gvs  Started  2014-05-10 03:07 pm  Changelog
  fedoravm2  gvm         /gfs/b2       root        fedoravm4::gvs  Started  2014-05-10 03:07 pm  Changelog
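For completeness, these are the commands whose output is being reworked
here (the session names are the examples from the table above; --xml is the
standard gluster CLI flag for machine-readable output):

  # Summary and detailed status of a geo-rep session
  gluster volume geo-replication gvm fedoravm3::gvs status
  gluster volume geo-replication gvm fedoravm3::gvs status detail

  # XML output, where the proposed active tag would appear
  gluster --xml volume geo-replication gvm fedoravm3::gvs status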


New Status columns

ACTIVE_PASSIVE - * if Active else none.
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name