Re: [Gluster-devel] Brick multiplexing approaches

2016-06-14 Thread Jeff Darcy
> There is *currently* no graph switch on the bricks (as I understand it). > Configuration changes, yes, but no graph switch as the xlator pipeline > is fixed; if that changes, the bricks need to be restarted. Others can > correct me if I am wrong. > > Noting the above here, as it may not be that

[Gluster-devel] Brick multiplexing approaches

2016-06-13 Thread Jeff Darcy
"Brick multiplexing" is a new feature, tentatively part of 4.0, that allows multiple bricks to be served from a single glusterfsd process. This promises to give us many benefits over the current "process per brick" approach. * Lower total memory use, by having only one copy of various global

Re: [Gluster-devel] io-stats: Fix overwriting of client profile by the bricks

2016-05-26 Thread Jeff Darcy
> >> I think we need to reconsider the above change. The bug is real and > >> needs a fix, but maybe we append the xlator name to the end of the > >> provided filename and dump the stats into that, rather than unwind from the > >> first instance of io-stats. > > > > I assume you mean the first instance

Re: [Gluster-devel] io-stats: Fix overwriting of client profile by the bricks

2016-05-26 Thread Jeff Darcy
> I think we need to reconsider the above change. The bug is real and > needs a fix, but maybe we append the xlator name to the end of the > provided filename and dump the stats into that, rather than unwind from the > first instance of io-stats. I assume you mean the first instance of io-stats that

Re: [Gluster-devel] XDR RPC Spec

2016-05-20 Thread Jeff Darcy
> As a community > if we would like to increase adoption of Gluster I think it would be > good if other languages could produce bindings in their native language > without needing to use the C libraries. If I'm writing code in Java or > Rust or (name your language here) and I see a Gluster

Re: [Gluster-devel] Usage of xdata to send anything on wire

2016-05-13 Thread Jeff Darcy
> > > There were concerns raised around this for the below reasons: > > > 1. word-size and endianness > > > 2. security issues > > > 3. Backward compatibility issues i.e. old/new client and server > > > combination. > > > > Can you please elaborate on the above 3 points? A bit more verbose > >
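
For readers new to the first concern quoted above: raw in-memory data cannot safely cross the wire because word sizes and byte order differ between hosts. Below is a minimal sketch of the standard remedy, fixed-width integer types serialized in network byte order; all names are invented for illustration, and this is not GlusterFS code.

```c
/* Fixed-width types in network byte order: a minimal sketch of the
 * standard answer to the word-size/endianness concern.  All names
 * here are invented for illustration; this is not GlusterFS code. */
#include <arpa/inet.h>   /* htonl(), ntohl() */
#include <stdint.h>
#include <string.h>

struct wire_hdr {
    uint32_t op;    /* uint32_t is the same size on every host */
    uint32_t len;
};

static size_t encode_hdr(const struct wire_hdr *h, unsigned char *buf)
{
    uint32_t op  = htonl(h->op);    /* host order to big-endian */
    uint32_t len = htonl(h->len);
    memcpy(buf, &op, 4);
    memcpy(buf + 4, &len, 4);
    return 8;
}

static void decode_hdr(const unsigned char *buf, struct wire_hdr *h)
{
    uint32_t v;
    memcpy(&v, buf, 4);      h->op  = ntohl(v);   /* big-endian to host */
    memcpy(&v, buf + 4, 4);  h->len = ntohl(v);
}
```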

Re: [Gluster-devel] Usage of xdata to send anything on wire

2016-05-13 Thread Jeff Darcy
> In the recent past, have seen multiple instances where we needed to > send some data along with the fop on wire. And we have been using the > xdata for the same. Eg: > > 1. Add lease-id, transaction id to every fop. The original purpose of xdata was to convey extra information without having

Re: [Gluster-devel] [Gluster-infra] Slow regression tests, likely due to lvm archives

2016-05-12 Thread Jeff Darcy
> I also remember that Jeff Darcy posted on "we need to change a value in > lvm.conf (LVM allocation settings, mail from 6th of april), so anyone > has more information about that ? (or if someone can ping Jeff for me) Yes, I found that the default value for activation.reserved_mem

Re: [Gluster-devel] State of the 4.0 World

2016-05-03 Thread Jeff Darcy
> Great summary Jeff - thanks! You're welcome. Also, it seems like I somehow managed to leave out the piece that I'm personally most involved in - Journal Based Replication (formerly New Style Replication). I might as well correct that now. What we (Avra and I) have so far is a working I/O

[Gluster-devel] State of the 4.0 World

2016-05-03 Thread Jeff Darcy
One of my recurring action items at community meetings is to report to the list on how 4.0 is going. So, here we go. The executive summary is that 4.0 is on life support. Many features were proposed - some quite ambitious. Many of those *never* had anyone available to work on them. Of those

Re: [Gluster-devel] Improve EXPECT/EXPECT_WITHIN result check in tests

2016-05-02 Thread Jeff Darcy
> Would it be ok to change all regular expression checks to something like > this in include.rc ? > > if [[ "$a" =~ ^$e$ ]]; then > > This will allow using regular expressions in EXPECT and EXPECT_WITHIN, > but will enforce full answer match in all cases, avoiding some possible > side

Re: [Gluster-devel] Regression-test-burn-in crash in EC test

2016-04-29 Thread Jeff Darcy
> The test is doing renames where source and target directories are > different. At the same time a new ec-set is added and rebalance started. > Rebalance will cause dht to also move files between bricks. Maybe this > is causing some race in dht ? > > I'll try to continue investigating when I

Re: [Gluster-devel] Possible bug in the communications layer ?

2016-04-28 Thread Jeff Darcy
> This happens with Gluster 3.7.11 accessed through Ganesha and gfapi. The > volume is a distributed-disperse 4*(4+2). > > I'm able to reproduce the problem easily doing the following test: > > iozone -t2 -s10g -r1024k -i0 -w -F /iozone{1..2}.dat > echo 3 >/proc/sys/vm/drop_caches > iozone -t2

Re: [Gluster-devel] Regression-test-burn-in crash in EC test

2016-04-28 Thread Jeff Darcy
> Where can we find the core dump? If you follow the console-log link at the bottom of the previous email, that log contains links to the other pieces. Core dumps are at: http://slave24.cloud.gluster.org/archived_builds/build-install-20160427:15:53:06.tar.bz2 There's a pretty specific gdb

[Gluster-devel] Regression-test-burn-in crash in EC test

2016-04-27 Thread Jeff Darcy
One of the "rewards" of reviewing and merging people's patches is getting email if the next regression-test-burn-in should fail - even if it fails for a completely unrelated reason. Today I got one that's not among the usual suspects. The failure was a core dump in

Re: [Gluster-devel] Should it be possible to disable own-thread for encrypted RPC?

2016-04-21 Thread Jeff Darcy
> I've recently become aware of another problem with own-threads. The > threads launched are not reaped, pthread_joined, after a TLS > connection disconnects. This is especially problematic with GlusterD > as it launches a lot of threads to handle generally short lived > connections (volfile

Re: [Gluster-devel] [Gluster-infra] freebsd-smoke failures

2016-04-19 Thread Jeff Darcy
> So can a workable solution be pushed to git, because I plan to force the > checkout to be like git, and it will break again (and this time, no > workaround will be possible). > It has been pushed to git, but AFAICT pull requests for that repo go into a black hole.

Re: [Gluster-devel] Should it be possible to disable own-thread for encrypted RPC?

2016-04-15 Thread Jeff Darcy
> I've been testing release-3.7 in the lead up to tagging 3.7.11, and > found that the fix I did to allow daemons to start when management > encryption is enabled, doesn't work always. The daemons fail to start > because they can't connect to glusterd to fetch the volfiles, and the > connection

Re: [Gluster-devel] tests/features/nuke.t fails on NetBSD

2016-04-14 Thread Jeff Darcy
> Following test fails on NetBSD. Can you help? > > == > > [07:24:54] Running tests in file ./tests/features/nuke.t > tar: Failed open to read/write on > /build/install/var/log/glusterfs/nuke.tar (No such file or directory) > tar: Unexpected EOF on archive file >

Re: [Gluster-devel] Which test generates the core?

2016-04-14 Thread Jeff Darcy
> On Thu, Apr 14, 2016 at 10:13 AM, Atin Mukherjee < amukh...@redhat.com > > wrote: > > I am pretty sure that earlier we used to log the culprit test generating > > > the core. But now I don't see that same behavior in these runs [1] [2] > > > [1] > > >

Re: [Gluster-devel] Test-case "./tests/basic/tier/tier-file-create.t" hung

2016-04-12 Thread Jeff Darcy
> tier can lead to parallel lookups in two different epoll threads on > hot/cold tiers. The race-window to hit the common-dictionary in lookup > use-after-free is too low without dict_copy_with_ref() in either ec/afr. > In either afr/ec side one thread should be executing dict_serialization > in

Re: [Gluster-devel] Test-case "./tests/basic/tier/tier-file-create.t" hung

2016-04-12 Thread Jeff Darcy
> This is a memory corruption issue which is already reported and there is a > patch by Pranith in 3.7 [1] waiting to get reviews. Patch [1] will solve the > issue . > [1] : http://review.gluster.org/#/c/13574/ That patch seems to be about making and modifying a copy of xattr_req, instead of
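
For context, the copy-before-modify approach under discussion can be sketched roughly as follows. This assumes the libglusterfs dict API (dict_copy_with_ref, dict_set_int32, dict_unref) as it existed around 3.7; the signatures and the key name below are from memory and should be treated as assumptions.

```c
/* Hedged sketch of the copy-before-modify idea: mutate a private copy
 * of xattr_req rather than the dict shared across parallel lookups.
 * Assumes the libglusterfs dict API ("dict.h" in-tree) circa 3.7;
 * the key name is hypothetical and the signatures are from memory,
 * so treat both as assumptions. */
#include "dict.h"

static int
lookup_prepare_xdata (dict_t *xattr_req)
{
        /* A NULL destination asks dict_copy_with_ref to allocate
         * the new dict itself (assumption about the 3.7-era API). */
        dict_t *copy = dict_copy_with_ref (xattr_req, NULL);
        if (!copy)
                return -1;

        /* Modify only the private copy; concurrent lookups still
         * holding the original xattr_req are unaffected. */
        dict_set_int32 (copy, "some-xattr-key", 1);

        /* ... STACK_WIND the lookup with `copy` instead of
         * `xattr_req`, then dict_unref (copy) when done ... */

        dict_unref (copy);
        return 0;
}
```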

Re: [Gluster-devel] Need help diagnosing regression-test crashes

2016-04-08 Thread Jeff Darcy
Upon further investigation, I've been able to determine that the problem lies in this line of our generic cleanup routine. type cleanup_lvm &>/dev/null && cleanup_lvm || true; This works great if we're at the end of a test that included snapshot.rc (which defines

[Gluster-devel] Need help diagnosing regression-test crashes

2016-04-08 Thread Jeff Darcy
I've been trying to figure out what's causing CentOS regression tests to fail with core dumps, quite clearly unrelated to the patches that are being affected. I've managed to find a couple of clues, so it seems that maybe someone else will recognize something and zero in on the problem faster

Re: [Gluster-devel] Slow regression tests

2016-04-07 Thread Jeff Darcy
> Are the slower jobs running on a particular set of slave VMs? I > rebooted/reset some of VMs that were offline in Jenkins for quite some > time. Could be that these are running slowly. The ones I looked at - both slow and fast - were all on slave20 and slave21.

Re: [Gluster-devel] [Gluster-infra] freebsd-smoke failures

2016-04-04 Thread Jeff Darcy
> Once this was done, I pushed to use the regular upstream change, > something that was not done before since the local change broke > automation to deploy test suite on Freebsd. It looks like we have two options here: (1) Fix configure so that it accurately detects whether the system can/should

Re: [Gluster-devel] !! operator

2016-04-02 Thread Jeff Darcy
> Thanks for the explanation. I was able to practice C for 18 years in > numerous projects without having the opportunity to see it. I will go > to bed less ignorant tonight. :-) Sorry if my comment came off as dismissive. I was only trying to explain why I'd assume it's intentional. I'll

Re: [Gluster-devel] !! operator

2016-04-02 Thread Jeff Darcy
> I found a !! in glusterfs sources. Is it a C syntax I do not know, a > bug, or just a weird syntax? > > xlators/cluster/afr/src/afr-inode-write.c: > local->stable_write = !!((fd->flags|flags)&(O_SYNC|O_DSYNC)); It's a common idiom in the Linux kernel/coreutils community. I thought it was
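
A standalone demonstration of the idiom, for anyone seeing it for the first time:

```c
/* Standalone demo of the `!!` idiom: `!!x` maps any non-zero value to
 * exactly 1 and zero to 0, so the result of a bitmask test can be
 * stored in a boolean-like field without carrying stray high bits. */
#include <fcntl.h>   /* O_SYNC, O_DSYNC */
#include <stdio.h>

int main(void)
{
    int flags  = O_DSYNC;
    int masked = flags & (O_SYNC | O_DSYNC);  /* some non-zero bit */

    printf("masked = %#x, !!masked = %d\n", masked, !!masked);
    /* masked might be e.g. 0x1000, but !!masked is always 0 or 1 */
    return 0;
}
```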

Re: [Gluster-devel] [Gluster-infra] freebsd-smoke failures

2016-04-02 Thread Jeff Darcy
> Please make sure that this change also gets included in the repository: > > https://github.com/gluster/glusterfs-patch-acceptance-tests Looks like we're getting a bit of a queue there. Who can merge some of these?

Re: [Gluster-devel] [Gluster-infra] freebsd-smoke failures

2016-04-02 Thread Jeff Darcy
- Original Message - > On Sat, Apr 02, 2016 at 07:53:32AM -0400, Jeff Darcy wrote: > > > IIRC, this happens because in the build job use "--enable-bd-xlator" > > > option while configure > > > > I came to the same conclusion, and set --enable

Re: [Gluster-devel] [Gluster-infra] freebsd-smoke failures

2016-04-02 Thread Jeff Darcy
> IIRC, this happens because in the build job use "--enable-bd-xlator" > option while configure I came to the same conclusion, and set --enable-bd-xlator=no on the slave. I also had to remove -Werror because that was also causing failures. FreeBSD smoke is now succeeding.

[Gluster-devel] Reminder: adding source files

2016-04-01 Thread Jeff Darcy
If you add a file to the project, please remember to add it to the appropriate Makefile.am as well. Failure to do so *will not show up* in our standard smoke/regression tests because those do "make install" but it will prevent RPMs (and probably equivalents on other distros/platforms) from

Re: [Gluster-devel] [IMPORTANT] Release planning for GlusterFS 3.8, schedule, deadlines and all

2016-03-24 Thread Jeff Darcy
> The disabling and possible removing of features would only be done in > the release-3.8 branch. There is no intention to remove the features > that are not finished/stable from the master branch. > > I also prefer to see changes merged early on, and encourage all of the > contributors to post

Re: [Gluster-devel] [IMPORTANT] Release planning for GlusterFS 3.8, schedule, deadlines and all

2016-03-22 Thread Jeff Darcy
> For most of the Gluster 4.0 features, I do not expect to see them as > part of glusterfs-3.8, not even as experimental. For being in the > experimental category, a feature needs to have a pretty complete > implementation, documentation and tests. I'm OK with that being true in a release branch,

Re: [Gluster-devel] Flooding of client logs with JSON fop statistics under DEBUG log-level

2016-03-18 Thread Jeff Darcy
> Since we have a volume set option(diagnostics.stats-dump-interval) to > increase/decrease the dump interval can't we make its default value to 0 > which will disable dumping statistics at first place? I don't have a particularly strong opinion on the matter. My *personal* preference is to

Re: [Gluster-devel] Running Vagrant tests on the CentOS CI (WAS: Re: 3.7.9 update)

2016-03-15 Thread Jeff Darcy
> If it is in CentOS CI, then why do we need vagrant? I'm not sure how vagrant > would make things more simple. > > We can use duffy to provision the machines, we can use gdeploy to install > glusterfs and use distaf to run the tests. In the nightly job I created, it > is using the same (minus the

Re: [Gluster-devel] races in dict_foreach() causing crashes in tier-file-creat.t

2016-03-11 Thread Jeff Darcy
> Tier does send lookups serially, which fail on the hashed subvolumes of > dhts. Both of them trigger lookup_everywhere which is executed in epoll > threads, thus the they are executed in parallel. According to your earlier description, items are being deleted by EC (i.e. the cold tier) while

Re: [Gluster-devel] Regression: Core generated by fdl-overflow.t

2016-03-10 Thread Jeff Darcy
> In few of the regression runs, found that ./tests/features/fdl-overflow.t is > causing a core dump. > > Here is the bt: > /build/install/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf2)[0x7f4905e5e3f6] > /build/install/lib/libglusterfs.so.0(gf_print_trace+0x22b)[0x7f4905e64381] >

Re: [Gluster-devel] Fuse Subdirectory mounts, access-control

2016-03-09 Thread Jeff Darcy
> When this command is executed, volfile is requested with volfile-id > '/vol/subdir1' > Glusterd on seeing this volfile-id will generate the client xlator with > remote-subvolume appending '/subdir1' I don't think GlusterD needs to be involved here at all. See below. > When graph

Re: [Gluster-devel] Default quorum for 2 way replication

2016-03-04 Thread Jeff Darcy
> I like the default to be 'none'. Reason: If we have 'auto' as quorum for > 2-way replication and first brick dies, there is no HA. If users are > fine with it, it is better to use plain distribute volume "Availability" is a tricky word. Does it mean access to data now, or later despite

Re: [Gluster-devel] Is anyone paying attention to centos regressions on jenkins?

2016-03-04 Thread Jeff Darcy
- Original Message - > There are a handful of centos regressions that have been running for over > eight hours. > > I don't know if that's contributing to the short backlog of centos > regressions waiting to run. I'm going to kill these in a moment, but here's more specific info in

Re: [Gluster-devel] tests/basic/tier/tier.t failure in NetBSD

2016-02-28 Thread Jeff Darcy
> Well I will not agree on the part of "The whole ship floats or sinks > together", > as in the past we have seen tests(not only with tiering but other features) > that were failing and were addressed and fixed. I'm not sure what you're trying to say here. If a test is failing because of one

Re: [Gluster-devel] Regarding default_forget/releasedir/release() fops

2016-02-23 Thread Jeff Darcy
> Recently while doing some tests (which involved lots of inode_forget()), > I have noticed that my log file got flooded with below messages - > > [2016-02-22 08:57:44.025565] W [defaults.c:2889:default_forget] (--> > /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x231)[0x7fd00f63c15d] >

Re: [Gluster-devel] FreeBSD smoke failure

2016-02-21 Thread Jeff Darcy
> This is how I fixed that exact same problem for NetBSD. Then I'm all in favor. Thanks!

Re: [Gluster-devel] FreeBSD smoke failure

2016-02-20 Thread Jeff Darcy
> On NetBSD Jenkins slaves slave VM, /opt/qa/build.sh contains this: > > PYDIR=`$PYTHONBIN -c 'from distutils.sysconfig import get_python_lib; > print(get_python_lib())'` > su -m root -c "/usr/bin/install -d -o jenkins -m 755 $PYDIR/gluster" OK, so are you proposing that we add the same thing on

Re: [Gluster-devel] FreeBSD smoke failure

2016-02-20 Thread Jeff Darcy
> The problem is that glusterfs install target copies glupy python module > outside of glusterfs install directory. The permissions must be > properly set up for the unprivileged build process to complete the > copy. What solution do you suggest? If these files were installed elsewhere, we'd

Re: [Gluster-devel] FreeBSD smoke failure

2016-02-19 Thread Jeff Darcy
> I've been seeing FreeBSD smoke failures quite a few time now and here is > one of them [1] > /usr/home/jenkins/root/workspace/freebsd-smoke/install-sh -c -d > '/usr/local/lib/python2.7/site-packages/gluster/glupy' > mkdir: /usr/local/lib/python2.7/site-packages/gluster: Permission denied >

[Gluster-devel] Speedier-regression-testing idea

2016-02-11 Thread Jeff Darcy
As I've watched regression tests grind away all day, I realized there was a pattern we could exploit. Not counting the spurious timing-related failures we've talked about many times, the most likely failures - by far - are the ones *that are part of the current patch*. It shouldn't be hard for

Re: [Gluster-devel] regarding GF_CONTENT_KEY and dht2 - perf with small files

2016-02-04 Thread Jeff Darcy
> Even with compound fops It will still require two sequential network > operations from dht2. One to MDC and one to DC So I don't think it helps. There are still two hops, but making it a compound op keeps the server-to-server communication in the compounding translator (which should already be

Re: [Gluster-devel] regarding GF_CONTENT_KEY and dht2 - perf with small files

2016-02-03 Thread Jeff Darcy
> Problem is with workloads which know the files that need to be read > without readdir, like hyperlinks (webserver), swift objects etc. These > are two I know of which will have this problem, which can't be improved > because we don't have metadata, data co-located. I have been trying to > think

Re: [Gluster-devel] regarding GF_CONTENT_KEY and dht2 - perf with small files

2016-02-02 Thread Jeff Darcy
> Background: Quick-read + open-behind xlators are developed to help > in small file workload reads like apache webserver, tar etc to get the > data of the file in lookup FOP itself. What happens is, when a lookup > FOP is executed, GF_CONTENT_KEY is added in xdata with max-length and >

Re: [Gluster-devel] Throttling xlator on the bricks

2016-01-28 Thread Jeff Darcy
> TBF isn't complicated at all - it's widely used for traffic shaping, cgroups, > UML to rate limit disk I/O. It's not complicated and it's widely used, but that doesn't mean it's the right fit for our needs. Token buckets are good to create a *ceiling* on resource utilization, but what if you
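
For readers unfamiliar with TBF, here is a minimal token-bucket sketch (illustration only, not the proposed throttling xlator). It shows exactly the ceiling behavior being questioned: the bucket caps throughput but does nothing to guarantee anyone a minimum share.

```c
/* Minimal token-bucket sketch (illustration only, not the proposed
 * xlator): tokens accrue at `rate` per second up to `burst`; an
 * operation proceeds only if it can spend a token.  The caller seeds
 * `last` with clock_gettime(CLOCK_MONOTONIC, ...) before first use. */
#include <stdbool.h>
#include <time.h>

struct tbf {
    double tokens;          /* tokens currently in the bucket */
    double rate;            /* tokens added per second */
    double burst;           /* bucket capacity (max tokens) */
    struct timespec last;   /* time of the previous refill */
};

static bool tbf_take(struct tbf *b)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);

    /* Refill in proportion to elapsed time, clamped to the cap. */
    double elapsed = (now.tv_sec - b->last.tv_sec) +
                     (now.tv_nsec - b->last.tv_nsec) / 1e9;
    b->last = now;
    b->tokens += elapsed * b->rate;
    if (b->tokens > b->burst)
        b->tokens = b->burst;

    if (b->tokens >= 1.0) {
        b->tokens -= 1.0;   /* spend one token for this request */
        return true;
    }
    return false;           /* bucket empty: caller queues or delays */
}
```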

Re: [Gluster-devel] distributed files/directories and [cm]time updates

2016-01-26 Thread Jeff Darcy
> If the time is set on a file by the client, this increases the critical > complexity to include the clients whereas before it was only critical to > have the servers time synced, now the clients should be as well. With any kind of server-side replication, the times could be generated by the

Re: [Gluster-devel] Tips and Tricks for Gluster Developer

2016-01-25 Thread Jeff Darcy
Oh boy, here we go. ;) I second Richard's suggestion to use cscope or some equivalent. It's a good idea in general, but especially with a codebase as large and complex as Gluster's. I literally wouldn't be able to do my job without it. I also have a set of bash/zsh aliases that will

Re: [Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Jeff Darcy
> on Saturday the 30th of January I am scheduled to give a presentation > titled "Gluster roadmap, recent improvements and upcoming features": > > https://fosdem.org/2016/schedule/event/gluster_roadmap/ > > I would like to ask from all feature owners/developers to reply to this > email with a

Re: [Gluster-devel] NetBSD tests not running to completion.

2016-01-08 Thread Jeff Darcy
> I am a bit disturbed by the fact that people raise the > "NetBSD regression ruins my life" issue without doing the work of > listing the actual issues encountered. That's because it's not a simple list of persistent issues. As with spurious regression-test failures on Linux, it's an ever

Re: [Gluster-devel] NetBSD tests not running to completion.

2016-01-08 Thread Jeff Darcy
- Original Message - > On Fri, Jan 08, 2016 at 05:11:22AM -0500, Jeff Darcy wrote: > > [08:45:57] ./tests/basic/afr/arbiter-statfs.t .. > > [08:43:03] ./tests/basic/afr/arbiter-statfs.t .. > > [08:40:06] ./tests/basic/afr/arbiter-statfs.t .. > > [08:08:51

Re: [Gluster-devel] NetBSD tests not running to completion.

2016-01-08 Thread Jeff Darcy
> I think we just need to come up with rules for considering a > platform to have voting ability before merging the patch. I totally agree, except for the "just" part. ;) IMO a platform is much like a feature in terms of requiring commitment/accountability, community agreement on

Re: [Gluster-devel] NetBSD tests not running to completion.

2016-01-07 Thread Jeff Darcy
> How are you going to make a serious issue a blocker? We can turn off the "merge" button at any time, by either technical or social means. The "how" is easy; it's the "when" that's fraught with controversy. > If we go that way, we need to run a regression for each merged patch, > which will be

Re: [Gluster-devel] compound fop design first cut

2016-01-06 Thread Jeff Darcy
> 1) fops will be compounded per inode, meaning 2 fops on different > inodes can't be compounded (Not because of the design, Just reducing > scope of the problem). > > 2) Each xlator that wants a compound fop packs the arguments by > itself. Packed how? Are we talking about XDR here, or

Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Jeff Darcy
On December 9, 2015 at 7:07:06 AM, Ira Cooper (i...@redhat.com) wrote: > A simple "abort on failure" and let the higher levels clean it up is > probably right for the type of compounding I propose. It is what SMB2 > does. So, if you get an error return value, cancel the rest of the > request,

Re: [Gluster-devel] libgfapi compound operations - multiple writes

2015-12-09 Thread Jeff Darcy
On December 9, 2015 at 10:31:03 AM, Raghavendra Gowdappa (rgowd...@redhat.com) wrote: > forking off since it muddles the original conversation. I've some questions: > > 1. Why do multiple writes need to be compounded together? > 2. If the reason is aggregation, cant we tune write-behind to

Re: [Gluster-devel] compound fop design first cut

2015-12-08 Thread Jeff Darcy
On December 8, 2015 at 12:53:04 PM, Ira Cooper (i...@redhat.com) wrote: > Raghavendra Gowdappa writes: > I propose that we define a "compound op" that contains ops. > > Within each op, there are fields that can be "inherited" from the > previous op, via use of a sentinel value. > > Sentinel
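
A hypothetical encoding of the sentinel proposal (invented names, not an actual GlusterFS structure):

```c
/* Hypothetical encoding of the sentinel idea (names invented; not an
 * actual GlusterFS structure).  An fd of COMPOUND_INHERIT_FD means
 * "reuse the fd produced by the previous op", letting the client chain
 * OPEN+WRITE+CLOSE before any fd exists.  Per the SMB2-style semantics
 * discussed in this thread, the server runs ops in order and aborts
 * the rest on the first failure, returning the results so far. */
#include <stdint.h>

#define COMPOUND_INHERIT_FD (-1)   /* sentinel: take fd from prior op */

enum comp_op_type { COMP_OPEN, COMP_WRITE, COMP_CLOSE };

struct comp_op {
    enum comp_op_type type;
    int64_t  fd;       /* a real fd, or COMPOUND_INHERIT_FD */
    uint64_t offset;   /* used by COMP_WRITE */
    uint32_t len;      /* used by COMP_WRITE */
};

struct comp_req {
    uint32_t       n_ops;
    struct comp_op ops[4];   /* e.g. open, write, close */
};
```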

Re: [Gluster-devel] libgfapi changes to add lk_owner and lease ID

2015-12-04 Thread Jeff Darcy
On December 4, 2015 at 8:25:10 AM, Niels de Vos (nde...@redhat.com) wrote: > Okay, so you meant to say that client_t "is a horror show" for this > particular use-case (lease_id). It indeed does not sound suitable to use > client_t here. > > I'm not much of a fan for using Thread Local Storage

[Gluster-devel] NSR (semi-)monthly status

2015-11-25 Thread Jeff Darcy
I’m not going to follow the detailed format that I’ve seen some of the other teams using, because I’m lazy, but here’s a brief snapshot of where NSR development is now and where it’s going. *** Design Info Avra has pulled together pieces from various other documents into a spec here: 

[Gluster-devel] Broken code alert - translator fini functions

2015-11-25 Thread Jeff Darcy
In the process of debugging a test that relies on my translator’s “fini” function being called, I discovered that these functions are not being called for translators in glusterfsd.  The offending code seems to be at glusterfsd.c:1274 (cleanup_and_exit). 1274         if (ctx->process_mode ==
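
A simplified illustration of the behavior that is missing (invented stand-in types, not the real xlator_t):

```c
/* Simplified illustration of the missing behavior (invented stand-in
 * types, not the real xlator_t): on clean shutdown, walk the loaded
 * translators and call each one's fini so it can flush and release
 * its resources.  cleanup_and_exit() reportedly skips this for brick
 * (non-glusterd) processes. */
#include <stddef.h>

struct xlator {
    const char *name;
    void (*fini)(struct xlator *this);
    struct xlator *next;   /* flat list view of the graph */
};

static void graph_fini_all(struct xlator *top)
{
    for (struct xlator *xl = top; xl != NULL; xl = xl->next) {
        if (xl->fini)
            xl->fini(xl);   /* per-translator teardown */
    }
}
```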

Re: [Gluster-devel] Caching support in glusterfs

2015-11-24 Thread Jeff Darcy
On November 24, 2015 at 6:47:59 AM, Avik Sil (avik@hgst.com) wrote: > While searching for caching support in glusterfs I stumbled upon this > link: > http://www.gluster.org/community/documentation/index.php/Features/caching > > But I didn't get much info from it. What is plan ahead?

Re: [Gluster-devel] Replacing loopback with Unix Domain Sockets for I/O

2015-11-18 Thread Jeff Darcy
On November 18, 2015 at 2:44:39 PM, Prasanna Kumar Kalever (pkale...@redhat.com) wrote: > As expected, I can see there are good numbers in the performance by using > UDS (Unix Domain Socket), please checkout the results (extracted using > Iozone benchmark tool attached above) Those look like

Re: [Gluster-devel] Replacing loopback with Unix Domain Sockets for I/O

2015-11-06 Thread Jeff Darcy
On November 6, 2015 at 3:13:01 AM, Prasanna Kumar Kalever (pkale...@redhat.com) wrote: > Humble, I am sure the patches above refer to using Unix Domain sockets > for volfile transmission.  My proposal is for I/O between processes on > the same hypervisor, especially for the hyper-convergence scenario

Re: [Gluster-devel] FreeBSD and NetBSD builds failing on master

2015-11-02 Thread Jeff Darcy
On November 2, 2015 at 12:09:45 PM, Emmanuel Dreyfus (m...@netbsd.org) wrote: > On Mon, Nov 02, 2015 at 11:50:25AM -0500, Jeff Darcy wrote: > > I should have caught this problem (mea culpa) but “break the world” is a > > bit hyperbolic. “The world” is broken much of the tim

Re: [Gluster-devel] FreeBSD and NetBSD builds failing on master

2015-11-02 Thread Jeff Darcy
> Generally speaking, I am in favor of backing otu changes that > break the world. I should have caught this problem (mea culpa) but “break the world” is a bit hyperbolic.  “The world” is broken much of the time, because very few of the developers on this project know how to fix *BSD portability

Re: [Gluster-devel] pluggability of some aspects in afr/nsr/ec

2015-10-29 Thread Jeff Darcy
> >> I want to understand if there is a possibility of exposing these as > >> different modules that we can mix and match, using options. It’s not only possible, but it’s easier than you might think.  If an option is set (cluster.nsr IIRC) then we replace cluster/afr with cluster/nsr-client and

Re: [Gluster-devel] RFC: Gluster.Next: Where and how DHT2 work/code would be hosted

2015-10-29 Thread Jeff Darcy
On October 29, 2015 at 9:12:46 AM, Shyam (srang...@redhat.com) wrote: > Will code that NSR puts up in master be ready to ship when 3.8 is > branched? Do we know when 3.8 will be branched? > I ask the above, as I think we need a *process*, and not an open ended > "put it where you want option",

Re: [Gluster-devel] RFC: Gluster.Next: Where and how DHT2 work/code would be hosted

2015-10-29 Thread Jeff Darcy
On October 29, 2015 at 8:42:50 PM, Shyam (srang...@redhat.com) wrote: > I assume this is about infra changes (as the first 2 points are for > some reason squashed in my reader). I think what you state is infra > (or other non-experimental) code impact due to changes by > experimental/inprogress

Re: [Gluster-devel] Journal translator spec

2015-10-16 Thread Jeff Darcy
October 14 2015 11:02 PM, "Atin Mukherjee" wrote: > Could you push the design document along with the journal spec to > gluster.readthedocs.org as PRs? It's not clear where in the hierarchy these should go. I would have guessed "Developer-guide" but that's back in the main

Re: [Gluster-devel] NSR design document

2015-10-14 Thread Jeff Darcy
October 14 2015 3:11 PM, "Manoj Pillai" wrote: > E.g. 3x number of bricks could be a problem if workload has > operations that don't scale well with brick count. Fortunately we have DHT2 to address that. > Plus the brick > configuration guidelines would not exactly be

Re: [Gluster-devel] NSR design document

2015-10-14 Thread Jeff Darcy
> "The reads will also be sent to, and processed by the current > leader." > > So, at any given time, only one brick in the replica group is > handling read requests? For a read-only workload-phase, > all except one will be idle in any given term? By default and in theory, yes. The question is:

Re: [Gluster-devel] RFC: Gluster.Next: Where and how DHT2 work/code would be hosted

2015-10-09 Thread Jeff Darcy
> I wonder if glusterd2 could also be a different directory in > experimental/. We could add a new configure option, say something like > --enable-glusterd2, that compiles & installs glusterd2 instead of the > existing glusterd. Thoughts? It might be a bit painful. Firstly, anything that

Re: [Gluster-devel] TEST FAILED ./tests/basic/mount-nfs-auth.t

2015-10-09 Thread Jeff Darcy
October 9 2015 2:56 AM, "Milind Changire" wrote: > https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/10776/consoleFull > > says > [06:18:00] ./tests/basic/mount-nfs-auth.t .. > not ok 62 Got "N" instead of "Y" > not ok 64 Got "N" instead of "Y" > not

Re: [Gluster-devel] RFC: Gluster.Next: Where and how DHT2 work/code would be hosted

2015-10-09 Thread Jeff Darcy
My position is that we should maximize visibility for other developers by doing all work on review.gluster.org. If it doesn't affect existing tests, it should even go on master. This includes: * Minor changes (e.g. list.h or syncop.* in http://review.gluster.org/#/c/8913/) * Refactoring that

Re: [Gluster-devel] Managing etcd (4.0)

2015-09-10 Thread Jeff Darcy
> I've a follow up question here. Could you elaborate the difference > between *use* & *join*? Venky's interpretation was the same as what I had intended. "Join" means that the new node becomes both a server and client for the existing etcd cluster. "Use" means that it becomes a client only,

Re: [Gluster-devel] FOP ratelimit?

2015-09-10 Thread Jeff Darcy
> Have we given thought to other IO scheduling algorithms like the mclock > algorithm [1], used by vmware for their QOS solution. > Plus another point to keep in mind here is the distributed nature of the > solution. It's easier to think of a brick > controlling the throughput for a client or a

[Gluster-devel] Managing etcd (4.0)

2015-09-09 Thread Jeff Darcy
Better get comfortable, everyone, because I might ramble on for a bit. Over the last few days, I've been looking into the issue of how to manage our own instances of etcd (or something similar) as part of our 4.0 configuration store. This is highly relevant for GlusterD 2.0, which would be

Re: [Gluster-devel] GlusterD 2.0 status updates

2015-09-07 Thread Jeff Darcy
> Agree here. I think we can use Go as the other non-C language for > components like glusterd, geo-replication etc. and retire the python > implementation of gsyncd that exists in 3.x over time. That sounds like a pretty major effort. Are you suggesting that it (or even part of it) should be

Re: [Gluster-devel] GlusterD 2.0 status updates

2015-09-07 Thread Jeff Darcy
> - Support for SSL transports - though we wouldn't be using this > initially, we would require it later on. Control-plane security is important enough that I think it has to be a Day One requirement.

Re: [Gluster-devel] GlusterD 2.0 status updates

2015-09-07 Thread Jeff Darcy
> Every day we hear about vulnerabilities and exploits. Have we not yet > reached the point where we just start with security enabled as a matter > of routine? > > Why aren't we using TLS (the real name, FWIW) from day one? I think making it default for 3.8+ would be a good idea. The only issue

Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Jeff Darcy
> Do you have any ideas here on QoS? Can it be provided as a use-case for > multi-tenancy you were working on earlier? My interpretation of QoS would include rate limiting, but more per *activity* (e.g. self-heal, rebalance, user I/O) or per *tenant* rather than per *client*. Also, it's easier

Re: [Gluster-devel] GlusterD 2.0 status updates

2015-09-01 Thread Jeff Darcy
> 1. The skeleton of GlusterD 2.0 codebase is now available @ [1] and the > same is integrated with gerrithub. > > 2. Rest end points for basic commands like volume > create/start/stop/delete/info/list have been implemented. Needs a little > more polishing to strictly follow the heketi APIs

Re: [Gluster-devel] GlusterFS (network) messaging layer - possible next steps

2015-08-31 Thread Jeff Darcy
> nanomsg wouldn't be an easy drop in replacement to our existing messaging > infrastructure. We need to understand how our code would be structured if we > decide to use nanomsg. I am considering the Go implementation of nanomsg > (protocol) called mangos[3] for inter-GlusterD communication in

Re: [Gluster-devel] GlusterFS (network) messaging layer - possible next steps

2015-08-31 Thread Jeff Darcy
> It's not packaged in Fedora. There have been two attempts, in 2013 and > 2014; https://bugzilla.redhat.com/show_bug.cgi?id=1012392 and > https://bugzilla.redhat.com/show_bug.cgi?id=1123511 respectively. From looking at those, it didn't look like there were any truly intractable issues - e.g.

Re: [Gluster-devel] Removing dict from the RPC protocol

2015-08-19 Thread Jeff Darcy
This is probably a controversial topic to bring up, but I think the dict that's in the RPC protocol needs to be removed. It generates a lot of confusion in the code because the dict is opaque. The XDR protocol has great support for serializing structs. That's great when the set of things
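
To make the contrast concrete, a hedged sketch with invented types:

```c
/* Illustrative types only.  An opaque dict goes on the wire as
 * undifferentiated key/value bytes the protocol layer cannot validate
 * or version: */
struct wire_dict {
    unsigned int count;   /* number of pairs; then for each pair:
                             keylen, key bytes, vallen, val bytes */
};

/* An XDR-described struct instead gives every field a declared,
 * fixed-width type, so both ends agree on layout and rpcgen can
 * generate the encode/decode code.  (Written as C here; the real
 * definition would live in a .x file.) */
#include <stdint.h>

struct gfs_setattr_req {
    char     gfid[16];    /* fixed-size opaque identifier */
    uint32_t valid;       /* bitmask: which attributes are set */
    uint32_t mode;
    uint64_t atime_sec;
    uint64_t mtime_sec;
};
```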

Re: [Gluster-devel] [Gluster-users] gluster small file performance

2015-08-18 Thread Jeff Darcy
Note: The log files attached have the "No data available" messages parsed out to reduce the file size. There was an enormous number of these. One of my colleagues submitted something to the message board about these errors in 3.7.3. [2015-08-17 17:03:37.270219] W

Re: [Gluster-devel] [Gluster-users] gluster small file performance

2015-08-18 Thread Jeff Darcy
I changed the logging to error to get rid of these messages as I was wondering if this was part of the problem. It didn't change the performance. Also, I get these same errors both before and after the reboot. I only see the slowdown after the reboot. I have SELinux disabled. Not sure about

Re: [Gluster-devel] semi-sync replication

2015-08-17 Thread Jeff Darcy
Do we have plans to support semi-synchronous type replication in the future? By semi-sync I mean writing to one leg of the replica, securing the write on faster stable storage (capacitor-backed SSD or NVRAM), and then acknowledging the client. The write on the other replica leg may happen at a later

Re: [Gluster-devel] Modifying GlusterFS to cope with C99 inline semantics

2015-07-27 Thread Jeff Darcy
My opinion: don't try to second guess the compiler/optimizer; stop using inline. Full Stop. To this end, I've submitted a patch which removes all instances in .c files. http://review.gluster.org/#/c/11769/ For removing the ones in .h files, I propose that we create a new file

Re: [Gluster-devel] Modifying GlusterFS to cope with C99 inline semantics

2015-07-27 Thread Jeff Darcy
My opinion: don't try to second guess the compiler/optimizer; stop using inline. Full Stop. To this end, I've submitted a patch which removes all instances in .c files. http://review.gluster.org/#/c/11769/ Regression tests blew up in snapshot code. It's really hard to see how

Re: [Gluster-devel] Spurious failures again

2015-07-09 Thread Jeff Darcy
Sad but true. More tests are failing than passing, and the failures are often *clearly* unrelated to the patches they're supposedly testing. Let's revive the Etherpad, and use it to track progress as we clean this up. ___ Gluster-devel mailing list

Re: [Gluster-devel] [Cross-posted] Re: Gentle Reminder.. (Was: GlusterFS Documentation Improvements - An Update)

2015-07-08 Thread Jeff Darcy
My suggestions: * Differentiate between user doc(installation, administration, feature summary, tools, FAQ/troubleshooting etc) and developer doc(Design doc, developer workflow, coding guidelines etc). I think there's a third category: feature/release planning and tracking. That stuff

[Gluster-devel] Macros and small files (was Re: Gluster and GCC 5.1)

2015-07-06 Thread Jeff Darcy
And in the past, if not now, are contributing factors to small file performance issues. I'm not quite seeing the connection here. Which macros are you thinking of, and how does the fact that they're macros instead of functions make them bad for small-file performance? AFAIK the problem with

Re: [Gluster-devel] Gluster and GCC 5.1

2015-07-02 Thread Jeff Darcy
Or perhaps we could just get everyone to stop using 'inline'. I agree that it would be a good thing to reduce/modify our use of 'inline' significantly. Any advantage gained from avoiding normal function-call entry/exit has to be weighed against cache pollution from having the same code repeated
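
A simplified, invented example of the C99 inline hazard behind the GCC 5.1 thread above:

```c
/* Invented example of the hazard.  Under C99/C11 inline semantics
 * (GCC 5's new default), a plain `inline` definition in a header is an
 * inline definition only and emits no out-of-line symbol:
 *
 *     inline int iov_ok(int n) { return n > 0; }   // no symbol emitted
 *
 * Any call the compiler declines to inline (e.g. at -O0, or a taken
 * address) then fails to link with "undefined reference".  The
 * portable fixes are `static inline`, which gives each translation
 * unit its own copy, or dropping `inline` altogether as the patch
 * above does. */
#include <stddef.h>

static inline size_t clamp_len(size_t len, size_t max)
{
    return (len > max) ? max : len;   /* safe in any including TU */
}
```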
