Hi,

Karel Gardas <[email protected]> writes:
> I know that you do a lot for darcs these days, but honestly speaking
> this email looks a little bit rude to me.

Could be. I was being subtle for 6 months without getting anywhere. Maybe the
issue will finally get some attention now.

> First of all, for last month or so I also wondered why there are so many
> buildbots failing. Many just off-line and many on-line but failing on
> configure. Now I think I see the reason. I have tried to keep my
> buildbots (Solaris/OpenBSD) running well for last month or so and during
> that time I've been hit two or even three times by message that my bot
> fails due to configure error. Yes, I know you move darcs forward so you
> need to do some changes (usually updating hashed storage), but it would
> be really great if in case of failing bot due to your change you at
> least contact buildbot owners to update darcs required packages on their
> bots. IMHO this will make situation much more easier also for you and
> you would not need to setup completely parallel infrastructure to what's
> currently running.

Out of the existing 12 slaves:
- 6 are offline
- 3 are missing zlib and hashed-storage
- 1 is failing mysteriously on configure
- 2 are green

But the problem is not missing packages. That's an easy fix. The problem is the
overall unreliability of the service. We were unable to have a green buildbot
for six straight months despite asking buildslave owners several times to check
their slaves etc. I don't want to put specific blame on any single slave or
owner (and if anyone took offence: sorry, no offence was meant). The system
simply does not work.

And for the buildbot service as a whole to be useful for me as a developer, it
has to be green all the time, unless we break darcs. The only way I can
currently think of to get that is starting over. If you can achieve that some
other way, I will only be glad.

> It would be really good if you contact buildbot owners on their personal
> email addresses. I think that's the purpose on holding contact address
> on bots anyway. Also please do not do any such critical decision during
> the holiday season. Many people are out from time to time and do not
> care about their service to community that much like for example during
> the school year.

The slaves can be reconnected to the new master at a later time, if their
owners are currently not available.

> I keep my Solaris/OpenBSD buildbots running well, so I don't see any
> reason why they should not serve darcs community in the future. I also
> don't see any reason why you consider them to be "lost".

That's true for the Solaris one, which is one of the two green slaves. The
OpenBSD one, however, has never been quite reliable -- I can't find a successful
build in recent history. It seems that it is failing in tests with no output,
and I cannot debug that. This is something that you, as the owner, would need
to investigate and address if we are to rely on the slave. I am available to
help you, but without access to the machine, I cannot do anything myself.

> Conclusion: as a buildbot owner I see the only reason why so much bots
> are failing, i.e. frequent updates to darcs required packages. Now, I do
> have perhaps naive idea, but what about to make darcs buildable with
> head versions of required packages (mostly hashed-storage). i.e. bot
> will get hashed-storage, build it, test it and then get darcs, build it
> against the previously build hashed storage, test etc. I think this will
>  keep you a lot of bots running w/o a need for intervention and still
> once such support is written you will use it for all the subsequent
> hashed-storage versions automatically...

If we can agree that the buildbot can do runghc Setup copy and runghc Setup
register on hashed-storage (i.e. installing the hashed-storage package into
~/.cabal) that would be great. I can do that change on my new buildmaster. (It
could, in theory, be done on the existing one, but that's pretty tricky, since
every config change involves a roundtrip through Zooko.)
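
To make the proposed ordering concrete, here is a rough sketch of the command
sequence a slave would run, written in Python since that is what the buildmaster
configuration uses. This is not the actual master configuration: the repository
URLs are placeholders, the function names are made up for illustration, and
passing --user to configure is my assumption for getting the package into the
per-user location. Test steps are left out.

```python
# Sketch only: repository URLs below are placeholders, not the real locations.
HASHED_STORAGE_REPO = "http://example.org/hashed-storage"  # hypothetical
DARCS_REPO = "http://example.org/darcs"  # hypothetical


def cabal_steps(workdir, install):
    """Return the Setup commands for one package as (workdir, argv) pairs."""
    steps = [
        # --user is an assumption: configure for a per-user install.
        (workdir, ["runghc", "Setup", "configure", "--user"]),
        (workdir, ["runghc", "Setup", "build"]),
    ]
    if install:
        # copy + register make the freshly built package visible to later builds.
        steps += [
            (workdir, ["runghc", "Setup", "copy"]),
            (workdir, ["runghc", "Setup", "register"]),
        ]
    return steps


def build_plan():
    """Build and install hashed-storage first, then build darcs against it."""
    plan = [(".", ["darcs", "get", HASHED_STORAGE_REPO, "hashed-storage"])]
    plan += cabal_steps("hashed-storage", install=True)
    plan += [(".", ["darcs", "get", DARCS_REPO, "darcs"])]
    plan += cabal_steps("darcs", install=False)
    return plan


if __name__ == "__main__":
    # Print the plan as shell-like lines; nothing is executed here.
    for workdir, argv in build_plan():
        print(f"(cd {workdir} && {' '.join(argv)})")
```

The point of the ordering is that hashed-storage is registered before darcs is
configured, so darcs always picks up the version that was just built.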

But that alone does nothing to fix the offline slaves or the slaves failing for
mysterious reasons. I really think that the only way to get a reliable service
is to keep only the slaves that work reliably.

Yours,
   Petr.
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
