Re: [Gluster-devel] Kluster for Kubernetes (was Announcing Gluster for Container Storage)

2018-08-24 Thread Joe Julian

On 8/24/18 8:24 AM, Michael Adam wrote:

On 2018-08-23 at 13:54 -0700, Joe Julian wrote:

Personally, I'd like to see the glusterd service replaced by a k8s native controller 
(named "kluster").

If you are exclusively interested in gluster for kubernetes
storage, this might seem the right approach.  But I think
this is much too narrow. The standalone, non-k8s deployments
still are important and will be for some time.

So what we've always tried to achieve (this is my personal
very firm credo, and I think several of the other gluster
developers are on the same page) is to keep any business
logic of *how* to manage bricks, create volumes, do a
mount, grow and shrink volumes and clusters, etc. close to
the core gluster project, so that these features are
usable irrespective of whether gluster is used in
kubernetes or not.

The kubernetes components just need to make use of these,
and so they can stay nicely small, too:

* The provisioners and CSI drivers mainly do API translation
   between k8s and gluster (heketi in the old style) and are
   rather trivial.

* The operator would implement the logic of "when" and "why"
   to invoke the gluster operations, but should IMHO not
   bother about the "how".

What cannot be implemented with that nice separation
of responsibilities?


Thinking about this a bit more, I do actually feel
more and more that it would be wrong to put all of
gluster into k8s even if we were only interested
in k8s. And I'm really curious how you want to do
that: I think you would have to rewrite more parts
of how gluster actually works. Currently glusterd
manages (spawns) other gluster processes. Clients
for mounting first connect to glusterd to get the
volfile and maintain a connection to glusterd
throughout the whole lifetime of the mount, etc...

Really interested to hear your thoughts about the above!


Cheers - Michael

To be clear, I'm not saying throw away glusterd and only do gluster for 
Kubernetes and nothing else. That would be silly.


On k8s, a native controller would still need to use some of what
glusterd2 does as libraries; however, things like spawning processes
would be delegated to the scheduler. Glusterfsd, glustershd, gsyncd,
etc. would just be pods in the cluster (probably with affinities set for
storage localization). This allows better resource and fault management,
better logging, and better monitoring.
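
To make the "pods with affinities" idea concrete, here is a minimal Go
sketch (using the standard k8s.io/api types) of how a controller might
build a glusterfsd pod pinned to the node that holds its brick. The
naming scheme, labels, and image are hypothetical illustrations, not an
agreed design.

package kluster

import (
    "fmt"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// brickPod sketches how a controller could pin a glusterfsd pod to the
// node that holds its brick, so the brick process runs next to its data.
func brickPod(volume, brick, node string) *corev1.Pod {
    return &corev1.Pod{
        ObjectMeta: metav1.ObjectMeta{
            Name:   fmt.Sprintf("glusterfsd-%s-%s", volume, brick),
            Labels: map[string]string{"app": "glusterfsd", "volume": volume},
        },
        Spec: corev1.PodSpec{
            Containers: []corev1.Container{{
                Name:  "glusterfsd",
                Image: "gluster/glusterfsd:latest", // placeholder image
            }},
            Affinity: &corev1.Affinity{
                NodeAffinity: &corev1.NodeAffinity{
                    RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
                        NodeSelectorTerms: []corev1.NodeSelectorTerm{{
                            MatchExpressions: []corev1.NodeSelectorRequirement{{
                                Key:      "kubernetes.io/hostname",
                                Operator: corev1.NodeSelectorOpIn,
                                Values:   []string{node},
                            }},
                        }},
                    },
                },
            },
        },
    }
}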


Through Kubernetes custom resource definitions (CRDs), volumes would be
declarative and the controller would be responsible for converging the
actual state with the declared state. I admit this runs counter to the
strong feelings of some developers in the gluster community, but the
industry has been moving away from human-managed resources and toward
declarative state engines for good reason. It scales, is less prone to
error, and allows for simpler interfaces.
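
As a rough illustration of the declarative pattern, here is a
hypothetical Go sketch of a GlusterVolume custom resource and the shape
of its converge step. None of the field names or the reconcile logic are
settled; they only show the declared-vs-observed loop a controller would
run.

package kluster

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// GlusterVolume is a hypothetical custom resource: the user declares the
// desired volume and the controller converges the cluster toward it.
type GlusterVolume struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   GlusterVolumeSpec   `json:"spec"`
    Status GlusterVolumeStatus `json:"status,omitempty"`
}

// GlusterVolumeSpec is the declared (desired) state.
type GlusterVolumeSpec struct {
    Type     string `json:"type"`     // e.g. "replicate" or "disperse"
    Replicas int    `json:"replicas"` // replica count for replicated volumes
    Size     string `json:"size"`     // requested capacity, e.g. "10Gi"
}

// GlusterVolumeStatus is the observed state the controller maintains.
type GlusterVolumeStatus struct {
    Phase  string   `json:"phase"`  // e.g. "Pending", "Ready", "Degraded"
    Bricks []string `json:"bricks"` // bricks provisioned so far
}

// reconcile is the converge step: compare declared and observed state and
// return the gluster operations needed to close the gap. A real controller
// would execute these (via glusterd2 libraries) and update Status.
func reconcile(v GlusterVolume) []string {
    var ops []string
    for i := len(v.Status.Bricks); i < v.Spec.Replicas; i++ {
        ops = append(ops, "add-brick") // placeholder operation
    }
    if len(ops) == 0 && v.Status.Phase != "Ready" {
        ops = append(ops, "mark-ready")
    }
    return ops
}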


Volume definitions (volfiles, not the CRD) could be stored in ConfigMaps
or Secrets. The client (both glusterfsd and glusterfs) could be made
k8s-aware and retrieve these directly; or, as an easier first step, the
ConfigMap/Secret could be mounted into the pod and the client could load
its volfile from a file (the client would need to be altered to reload
the graph when the file changes).
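
A minimal sketch of the "reload on change" piece, assuming the widely
used github.com/fsnotify/fsnotify package; the reload callback stands in
for a graph-reload hook the client does not have today. Kubernetes
refreshes projected ConfigMap/Secret files through an atomic symlink
swap, so the watch is on the directory rather than the file.

package kluster

import (
    "log"
    "path/filepath"

    "github.com/fsnotify/fsnotify"
)

// watchVolfile watches a volfile mounted from a ConfigMap/Secret and calls
// reload() whenever the projected contents are refreshed. reload() is a
// placeholder for the client-side graph reload mentioned above.
func watchVolfile(volfilePath string, reload func() error) error {
    w, err := fsnotify.NewWatcher()
    if err != nil {
        return err
    }
    defer w.Close()

    // Watch the directory: ConfigMap/Secret updates show up as create/rename
    // events on the ..data symlink, not as writes to the file itself.
    if err := w.Add(filepath.Dir(volfilePath)); err != nil {
        return err
    }

    for {
        select {
        case ev := <-w.Events:
            if ev.Op&(fsnotify.Create|fsnotify.Write|fsnotify.Rename) != 0 {
                if err := reload(); err != nil {
                    log.Printf("volfile reload failed: %v", err)
                }
            }
        case err := <-w.Errors:
            log.Printf("volfile watch error: %v", err)
        }
    }
}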


As an aside, the "maintained" connection to glusterd is only true as 
long as glusterd always lives at the same IP address. There's a 
long-standing bug where the client will never try to find another 
glusterd if the one it first connected to ever goes away.


There are still a lot of questions that I don't have answers to. I think
this could be done in a way that is complementary to glusterd and does
not create a lot of duplicated work. Most importantly, I think this is
something that could get community buy-in and would fill a need in
Kubernetes that is not well served at this time.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Unplanned Jenkins Restart

2018-08-24 Thread Nigel Babu
Oops, big note: CentOS regression jobs may have ended up canceled. Please
retry them.

On Fri, Aug 24, 2018 at 9:31 PM Nigel Babu  wrote:

> Hello,
>
> We've had to do an unplanned Jenkins restart. Jenkins was overloaded and
> not responding to any requests. There was a backlog of over 100 jobs as
> well. The restart seems to have fixed things up.
>
> More details in bug: https://bugzilla.redhat.com/show_bug.cgi?id=1622173
>
> --
> nigelb
>


-- 
nigelb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Unplanned Jenkins Restart

2018-08-24 Thread Nigel Babu
Hello,

We've had to do an unplanned Jenkins restart. Jenkins was overloaded and
not responding to any requests. There was a backlog of over 100 jobs as
well. The restart seems to have fixed things up.

More details in bug: https://bugzilla.redhat.com/show_bug.cgi?id=1622173

-- 
nigelb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Announcing Gluster for Container Storage (GCS)

2018-08-24 Thread Michael Adam
On 2018-08-23 at 13:54 -0700, Joe Julian wrote:
> Personally, I'd like to see the glusterd service replaced by a k8s native 
> controller (named "kluster").

If you are exclusively interested in gluster for kubernetes
storage, this might seem the right approach.  But I think
this is much too narrow. The standalone, non-k8s deployments
still are important and will be for some time.

So what we've always tried to achieve (this is my personal
very firm credo, and I think several of the other gluster
developers are on the same page) is to keep any business
logic of *how* to manage bricks, create volumes, do a
mount, grow and shrink volumes and clusters, etc. close to
the core gluster project, so that these features are
usable irrespective of whether gluster is used in
kubernetes or not.

The kubernetes components just need to make use of these,
and so they can stay nicely small, too:

* The provisioners and CSI drivers mainly do API translation
  between k8s and gluster (heketi in the old style) and are
  rather trivial (a hypothetical sketch follows this list).

* The operator would implement the logic of "when" and "why"
  to invoke the gluster operations, but should IMHO not
  bother about the "how".
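
To illustrate how thin that translation layer is, here is a hypothetical
Go sketch of the create-volume path in a provisioner or CSI driver. The
VolumeAPI interface stands in for whatever the gluster side exposes
(heketi today, glusterd2's API going forward); the method names are made
up for illustration, not the real client.

package driver

import (
    "context"
    "fmt"
)

// VolumeAPI stands in for the gluster-side management API (heketi today,
// glusterd2 going forward). The method signature is illustrative only.
type VolumeAPI interface {
    CreateVolume(ctx context.Context, name string, sizeBytes int64, replicas int) (id string, err error)
}

// ProvisionVolume shows the kind of translation a provisioner/CSI driver
// does: map a k8s-side request onto a single gluster-side call and return
// an identifier for the created volume. Idempotency and cleanup are omitted.
func ProvisionVolume(ctx context.Context, api VolumeAPI, name string, sizeBytes int64) (string, error) {
    id, err := api.CreateVolume(ctx, name, sizeBytes, 3) // replica 3 as an example default
    if err != nil {
        return "", fmt.Errorf("creating gluster volume %q: %v", name, err)
    }
    return id, nil
}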

What cannot be implemented with that nice separation
of responsibilities?


Thinking about this a bit more, I do actually feel
more and more that it would be wrong to put all of
gluster into k8s even if we were only interested
in k8s. And I'm really curious how you want to do
that: I think you would have to rewrite more parts
of how gluster actually works. Currently glusterd
manages (spawns) other gluster processes. Clients
for mounting first connect to glusterd to get the
volfile and maintain a connection to glusterd
throughout the whole lifetime of the mount, etc...

Really interested to hear your thoughts about the above!


Cheers - Michael




> I'm hoping to use this vacation I'm currently on to write up a design doc.
> 
> On August 23, 2018 12:58:03 PM PDT, Michael Adam  wrote:
> >On 2018-07-25 at 06:38 -0700, Vijay Bellur wrote:
> >> Hi all,
> >
> >Hi Vijay,
> >
> >Thanks for announcing this to the public and making everyone
> >more aware of Gluster's focus on container storage!
> >
> >I would like to add an additional perspective to this,
> >giving some background about the history and origins:
> >
> >Integrating Gluster with kubernetes for providing
> >persistent storage for containerized applications is
> >not new. We have been working on this for more than
> >two years now, and it is used by many community users
> >and many customers (of Red Hat) in production.
> >
> >The original software stack used heketi
> >(https://github.com/heketi/heketi) as a high-level service
> >interface for gluster, to facilitate easy self-service
> >provisioning of volumes in kubernetes. Heketi implemented some
> >ideas that were originally part of the glusterd2 plans in a
> >separate, much more narrowly scoped project, to get us started
> >with these efforts in the first place, and it also went beyond
> >those original ideas.  These features are now being merged into
> >glusterd2, which will eventually replace heketi in the
> >container storage stack.
> >
> >We were also working on kubernetes itself, writing the
> >provisioners for various forms of gluster volumes in kubernetes
> >proper (https://github.com/kubernetes/kubernetes) and also in the
> >external-storage repo
> >(https://github.com/kubernetes-incubator/external-storage).
> >Those provisioners will eventually be replaced by the CSI
> >drivers mentioned above. The expertise from the original
> >kubernetes development is now flowing into the CSI drivers.
> >
> >The gluster-containers repository was already created and used
> >for this original container-storage effort.
> >
> >The https://github.com/gluster/gluster-kubernetes repository
> >mentioned above was not only the place for storing the deployment
> >artefacts and tools; it was actually intended to be the
> >upstream home of the gluster-container-storage project.
> >
> >In this view, I see the GCS project announced here
> >as a GCS version 2. The original version, even though
> >it was never officially announced this widely in a formal
> >introduction like this one, and never given a formal release
> >or version number (let me call it version one), was the
> >software stack described above, homed at the
> >gluster-kubernetes repository. If you look at this project
> >(and heketi), you can see that it has a nice level of popularity.
> >
> >I think we should make use of this traction instead of
> >ignoring the legacy, and turn gluster-kubernetes into the
> >home of GCS (v2). In my view, GCS (v2) will be about:
> >
> >* replacing some of the components with newer ones, i.e.:
> >  - glusterd2 instead of the heketi and glusterd1 combo
> >  - CSI drivers (the new standard) instead of native
> >    kubernetes plugins
> >* adding the operator feature
> >  (even though we are currently also working on an operator
> >  for the current stack 

[Gluster-devel] Coverity covscan for 2018-08-24-3cb5b63a (master branch)

2018-08-24 Thread staticanalysis


GlusterFS Coverity covscan results for the master branch are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2018-08-24-3cb5b63a/

Coverity covscan results for other active branches are also available at
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] Test health report (week ending 19th Aug. 2018)

2018-08-24 Thread Nithya Balachandran
On 20 August 2018 at 23:06, Shyam Ranganathan  wrote:

> Although tests have stabilized quite a bit, and from the maintainers
> meeting we know that some tests have patches coming in, here is a
> readout of other tests that needed a retry. We need to reduce failures
> on retries as well, to be able to not have spurious or other failures in
> test runs.
>
> Tests being worked on (from the maintainers meeting notes):
> - bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
>
> For other retries and failures, I request component maintainers to look at
> the test cases and resulting failures, and to post back any findings to the
> lists so we can take things forward:
>
> https://build.gluster.org/job/line-coverage/481/console
> 20:10:01 1 test(s) needed retry
> 20:10:01 ./tests/basic/distribute/rebal-all-nodes-migrate.t
>
Sorry for the delay, but I have been busy all this week. I will take a look
at this sometime next week.

> https://build.gluster.org/job/line-coverage/483/console
> 18:42:01 2 test(s) needed retry
> 18:42:01 ./tests/basic/tier/fops-during-migration-pause.t
> 18:42:01
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-
> txn-on-quorum-failure.t
> (fix in progress)
>
> https://build.gluster.org/job/regression-test-burn-in/4067/console
> 18:27:21 1 test(s) generated core
> 18:27:21 ./tests/bugs/readdir-ahead/bug-1436090.t
>
> https://build.gluster.org/job/regression-test-with-multiplex/828/console
> 18:19:39 1 test(s) needed retry
> 18:19:39 ./tests/bugs/glusterd/validating-server-quorum.t
>
> https://build.gluster.org/job/regression-test-with-multiplex/829/console
> 18:24:14 2 test(s) needed retry
> 18:24:14 ./tests/00-geo-rep/georep-basic-dr-rsync.t
> 18:24:14 ./tests/bugs/glusterd/quorum-validation.t
>
> https://build.gluster.org/job/regression-test-with-multiplex/831/console
> 18:20:49 1 test(s) generated core
> 18:20:49 ./tests/basic/ec/ec-5-2.t
>
> Shyam
> ___
> maintainers mailing list
> maintain...@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Glusterfs v4.1.1 issue encountered while executing test case ./tests/features/trash.t

2018-08-24 Thread Vijay Bellur
On Mon, Aug 20, 2018 at 5:24 AM Abhay Singh  wrote:

> Hi Vijay,
>
> As per your previous reply, I tried running the test cases with the
> endianness check via the command  lscpu | grep "Big endian". Thankfully,
> the namespace.t test case passed successfully.


This is good to hear. Would you mind submitting a patch for this through
Gerrit?


> However, the trash.t test case is still failing with the following errors:
>
> =
> TEST 37 (line 190): 0 rebalance_completed
> ok 37, LINENUM:190
> RESULT 37: 0
> ls: cannot access '/d/backends/patchy3/rebal*': No such file or directory
> basename: missing operand
> Try 'basename --help' for more information.
> =
> TEST 38 (line 196): Y wildcard_exists /d/backends/patchy3/1
> /d/backends/patchy3/a
> ok 38, LINENUM:196
> RESULT 38: 0
> =
> TEST 39 (line 197): Y wildcard_exists
> /d/backends/patchy1/.trashcan/internal_op/*
> not ok 39 Got "N" instead of "Y", LINENUM:197
> RESULT 39: 1
> =
>
> =
> TEST 63 (line 247): Y wildcard_exists
> /d/backends/patchy1/abc/internal_op/rebal*
> not ok 63 Got "N" instead of "Y", LINENUM:247
> RESULT 63: 1
> rm: cannot remove '/mnt/glusterfs/0/abc/internal_op': Operation not
> permitted
> =
>
> Failed 2/66 subtests
>
> Test Summary Report
> ---
> ./tests/features/trash.t (Wstat: 0 Tests: 66 Failed: 2)
>   Failed tests:  39, 63
> Files=1, Tests=66
> Result: FAIL
>
> However, running these test cases on an XFS backend is still pending.
> Please let me know how I should proceed further or if anything more needs
> to be done.
>
>

As per our IRC chat, it looks like running on an XFS backend does not make
any difference. I will attempt to recreate this problem on a big-endian
machine.

Jiffin - is there anything else that you would like to suggest for the
failing tests?

Thanks,
Vijay
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel