Re: build errors on Fedora 24 and CentOS 7

2016-10-13 Thread jg71
* dawuud  wrote:

> > If I can provide information for fixing
> >   allmydata.test.test_magic_folder.RealTest.test_move_tree
> > please let me know. CentOS, Fedora and Slackware seem to be affected¹.
> 
> I wonder if the failed versus pass buildbots have different
> versions of twisted or different Linux kernel versions which have
> slightly different inotify implementations.

FWIW, tox installs what's necessary anyway. I've run the buildslave
in an environment where there had been system-wide installs of tahoe
and its deps as well, and it worked.  This one hasn't any yet (which
will probably change soon, but shouldn't affect the buildslave at all).

The only fsck-foo that comes to mind is buildslave's dep of twisted,
but the twisted version is listed in the logs of each build.

The kernel for the Slackware buildslave is a recent 4.4.x.

HTH

-- 
Feel the magic of Tahoe-LAFS.
___
tahoe-dev mailing list
tahoe-dev@tahoe-lafs.org
https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev


Re: build errors on Fedora 24 and CentOS 7

2016-10-13 Thread Lukas Pirl
On 10/13/2016 11:09 AM, dawuud wrote as excerpted:
> I wonder if the failed versus pass buildbots have different versions
> of twisted or different Linux kernel versions which have slightly
> different inotify implementations.

For CentOS 7, Fedora 24 and Jessie you can find the kernel version in
the buildbot host information.

In fact, all the slaves I run share the same (instance of a running)
kernel (the kernel of the LXC host). It seems to work for Jessie but
not for the RHELs, so the failure is unlikely to be related to the
kernel version. Maybe you have another idea before you dig in there.

Good luck,

Lukas

-- 
+49 174 940 74 71
GPG: http://lukas-pirl.de/media/attachments/lukas_pirl.pub



Re: build errors on Fedora 24 and CentOS 7

2016-10-13 Thread dawuud

Hi.

About that test_move_tree unit test... I think this indicates a minor bug
in magic-folder which I have briefly documented in this ticket:

"""magic-folder: watched sub-dir stilled watched after being removed from 
magic-folder"""
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2836

> If I can provide information for fixing
>   allmydata.test.test_magic_folder.RealTest.test_move_tree
> please let me know. CentOS, Fedora and Slackware seem to be affected¹.

I wonder if the failed versus pass buildbots have different versions
of twisted or different Linux kernel versions which have slightly
different inotify implementations.

Anyway, we should use the recursive=True inotify watch mode when the
platform supports it... AND we must "unwatch" directories when they are
moved out of the magic-folder. The test_move_tree test should be fixed
so that it fails, instead of passing, when we receive an event from a
directory which was moved out of the magic-folder.
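
The "unwatch" bookkeeping described above could be sketched roughly as
follows. This is only an illustration, independent of any inotify
library: the class WatchSet and its methods are hypothetical names and
are not part of magic-folder.

```python
import os


class WatchSet:
    """Hypothetical record of which directories are currently watched."""

    def __init__(self, root):
        self.root = os.path.abspath(root)
        self.watched = {self.root}

    def watch(self, path):
        """Start tracking a directory below the root."""
        self.watched.add(os.path.abspath(path))

    def unwatch_tree(self, path):
        """Forget a directory and everything below it, e.g. after it
        has been moved out of the watched root."""
        path = os.path.abspath(path)
        prefix = path + os.sep
        self.watched = {
            w for w in self.watched
            if w != path and not w.startswith(prefix)
        }

    def is_watched(self, path):
        """True if events from this directory are still expected."""
        return os.path.abspath(path) in self.watched
```

A test in the spirit of test_move_tree could then treat any event whose
directory is no longer in the set as a failure.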

I'll try to work on this today.

Cheers,
David




Re: build errors on Fedora 24 and CentOS 7

2016-10-12 Thread Lukas Pirl
On 10/12/2016 04:20 PM, Brian Warner wrote as excerpted:
> Aha! I'd bet that previous RHEL systems still had ifconfig somewhere, so
> it didn't matter that the code was looking for 'ip' in the wrong place.

That is likely, yes.

If I can provide information for fixing
  allmydata.test.test_magic_folder.RealTest.test_move_tree
please let me know. CentOS, Fedora and Slackware seem to be affected¹.

Best,

Lukas

¹ https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/Centos%207/builds/1
  https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/Fedora%2024/builds/3
  https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/Markus%20slackware64%20stable/builds/282


Re: build errors on Fedora 24 and CentOS 7

2016-10-12 Thread Brian Warner
On 10/12/16 9:33 AM, Lukas Pirl wrote:

> Well, after a few minutes stepping around, pdb told me the reason for
> this failure. It was quite trivial in the end: the path to `ip` is just
> different on RHEL-based systems.

Aha! I'd bet that previous RHEL systems still had ifconfig somewhere, so
it didn't matter that the code was looking for 'ip' in the wrong place.

PR merged. Thanks!
 -Brian



Re: build errors on Fedora 24 and CentOS 7

2016-10-12 Thread Lukas Pirl
On 10/12/2016 12:49 AM, Brian Warner wrote as excerpted:
> Or maybe there's something funny with the regexp? Different newline
> convention on that box? Some unicode thing?
> 
> I guess the next step is to modify the tests to log the full output from
> /bin/ip to test.log, or maybe add a separate buildstep to do the same.

Well, after a few minutes stepping around, pdb told me the reason for
this failure. It was quite trivial in the end: the path to `ip` is just
different on RHEL-based systems.
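
For illustration only, one way to cope with a tool that lives in
different places on different distributions is to try several candidate
paths and then fall back to a PATH lookup. This is a sketch under
assumptions, not the content of the PR: the candidate list and the
function name find_ip_binary are hypothetical.

```python
import os
import shutil

# Assumed candidate locations; the real list depends on the distributions
# one wants to support.
CANDIDATE_IP_PATHS = ("/bin/ip", "/sbin/ip", "/usr/sbin/ip", "/usr/bin/ip")


def find_ip_binary(candidates=CANDIDATE_IP_PATHS):
    """Return the first existing, executable candidate path.

    Falls back to searching PATH; returns None if nothing is found.
    """
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return shutil.which("ip")
```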

I filed a PR¹ accordingly.

Nevertheless, I guess
  allmydata.test.test_magic_folder.RealTest.test_move_tree
will continue to fail. Let's see.

Best,

Lukas

¹ https://github.com/tahoe-lafs/tahoe-lafs/pull/364


Re: build errors on Fedora 24 and CentOS 7

2016-10-11 Thread Brian Warner
On 10/11/16 10:25 AM, Lukas Pirl wrote:

>> Hmm. What's the full output of "/bin/ip addr" on that box?
> 
> Please see attached (avoids line wraps).

Hmm.. that output matches the regexp when I compare it locally, and
shouldn't have produced the error that the buildbot saw:

allmydata.test.test_iputil.ListAddresses.test_list_async ...
Traceback (most recent call last):
  File "/home/tahoe/buildslave/Fedora_24/build/src/allmydata/test/test_iputil.py", line 105, in _check
    self.failUnlessIn("127.0.0.1", addresses)
  File "/home/tahoe/buildslave/Fedora_24/build/.tox/py27/lib/python2.7/site-packages/twisted/trial/_synctest.py", line 485, in assertIn
    % (containee, container))
twisted.trial.unittest.FailTest: '127.0.0.1' not in ['10.10.1.14']
[FAILURE]

Is it possible that the buildslave is running in a different environment
than the one where you ran "/bin/ip addr"? Like, maybe different userids
give different results? I don't know how LXC works well enough to know
if that even makes sense.

Or maybe there's something funny with the regexp? Different newline
convention on that box? Some unicode thing?

I guess the next step is to modify the tests to log the full output from
/bin/ip to test.log, or maybe add a separate buildstep to do the same.
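
A minimal sketch of such logging, assuming Python's standard logging
and subprocess modules (the helper name log_command_output and the
logger name are hypothetical). Logging the raw bytes with %r makes
newline conventions and non-ASCII characters visible, and a command
that cannot be started at all is recorded as well.

```python
import logging
import subprocess


def log_command_output(argv, logger=logging.getLogger("iputil-debug")):
    """Run argv and log its exit code and raw output.

    Returns the captured stdout bytes, or None if the command could not
    be started (for example because the executable does not exist).
    """
    try:
        proc = subprocess.run(
            argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
    except OSError as e:
        logger.warning("could not run %r: %s", argv, e)
        return None
    logger.info(
        "%r exited with %d, stdout=%r, stderr=%r",
        argv, proc.returncode, proc.stdout, proc.stderr,
    )
    return proc.stdout
```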

puzzled,
 -Brian



Re: build errors on Fedora 24 and CentOS 7

2016-10-11 Thread Lukas Pirl
(@Brian: now with the list in CC and a note regarding reaching 10.10.…)

Thanks for all the explanations.

On 10/10/2016 04:08 PM, Brian Warner wrote as excerpted:
> If ip/ifconfig is broken entirely, it might still be getting
> 10.10.1.14 from the UDP socket routine.

If that is the case, that would explain it. But wait, both hosts can
also loopback-connect to themselves using the 10.10.… address
(respectively).

For the record, on both hosts `ifconfig` is not available and
`localhost` resolves to 127.0.0.1 (and not 10.10.…).

> Hmm. What's the full output of "/bin/ip addr" on that box?

Please see attached (avoids line wraps).

Best,

Lukas

1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
6: eth0@if7:  mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 00:ff:aa:00:00:13 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.1.13/8 brd 10.255.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2ff:aaff:fe00:13/64 scope link
       valid_lft forever preferred_lft forever

1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
8: eth0@if9:  mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:ff:aa:00:00:14 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.1.14/8 brd 10.255.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2ff:aaff:fe00:14/64 scope link
       valid_lft forever preferred_lft forever


Re: build errors on Fedora 24 and CentOS 7

2016-10-10 Thread Brian Warner
On 10/10/16 2:17 AM, Lukas Pirl wrote:
> On 10/09/2016 11:15 PM, Brian Warner wrote as excerpted:
>> Thanks.. could you check that those two slaves have "127.0.0.1" bound to
>> the loopback interface? One of the failing tests suggests that they do
>> not, and I think the other tests depend upon being able to use 127.0.0.1
>> to connect to themselves.
> 
> They do
> 
> [tahoe@tahoe-buildbot-centos7 root]$ ip addr show lo
> 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> inet 127.0.0.1/8 scope host lo
>    valid_lft forever preferred_lft forever
> inet6 ::1/128 scope host

Hmm. What's the full output of "/bin/ip addr" on that box?

Tahoe tries a bunch of different commands to find the addresses
(starting with /bin/ip, then a bunch of ifconfig flavors). It applies a
regexp to the output of each (see src/allmydata/util/iputil.py line 164
"_unix_commands"). I'm wondering if there's something unusual about this
host's output, and the regexp isn't correctly matching it.
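
To illustrate the kind of matching described above, here is a minimal
sketch. The pattern INET_RE and the helper parse_ip_addr are
hypothetical and are not the code in iputil.py; the sample text is
taken from "ip addr" output posted elsewhere in this thread.

```python
import re

# Hypothetical pattern: matches "inet A.B.C.D/NN" lines, skips "inet6".
INET_RE = re.compile(
    r"^\s*inet\s+(\d{1,3}(?:\.\d{1,3}){3})/\d+", re.MULTILINE
)

SAMPLE = """\
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
8: eth0@if9:  mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:ff:aa:00:00:14 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.1.14/8 brd 10.255.255.255 scope global eth0
       valid_lft forever preferred_lft forever
"""


def parse_ip_addr(output):
    """Return the IPv4 addresses found in 'ip addr'-style text."""
    return INET_RE.findall(output)


print(parse_ip_addr(SAMPLE))
```

With this sample, both 127.0.0.1 and 10.10.1.14 are found, so a result
that lacks 127.0.0.1 would point at the command not producing output
rather than at the matching.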

allmydata.test.test_iputil.ListAddresses.test_list_async is the failure
I'm looking at (using the "fix the simplest thing first in the hopes
that it will help the complicated things" debugging philosophy). It's
reporting 10.10.1.14, but not 127.0.0.1. The test_list_async() function
it calls combines the "/bin/ip" / "ifconfig" output with a routine that
prepares a UDP socket for talking with one of the root DNS servers, then
asking what interface that socket would have used. If ip/ifconfig is
broken entirely, it might still be getting 10.10.1.14 from the UDP
socket routine.
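
The UDP socket technique mentioned above can be sketched as follows.
This is a minimal illustration, not Tahoe's routine: the function name
local_address_for is hypothetical, and the default target 198.41.0.4
(a.root-servers.net) is an assumption, since the thread only says "one
of the root DNS servers".

```python
import socket


def local_address_for(target_host="198.41.0.4", target_port=53):
    """Return the local IPv4 address the OS would use to reach the target.

    connect() on a UDP socket sends no packets; it only makes the kernel
    choose a route and therefore a source address. Returns None if no
    route is available.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((target_host, target_port))
        return sock.getsockname()[0]
    except OSError:
        return None
    finally:
        sock.close()
```

Because this routine only ever reports the outbound interface, it would
yield an address such as 10.10.1.14 but never 127.0.0.1, which is
consistent with the test failure discussed here.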

thanks,
 -Brian



Re: build errors on Fedora 24 and CentOS 7

2016-10-09 Thread Lukas Pirl
On 10/09/2016 11:15 PM, Brian Warner wrote as excerpted:
> Thanks.. could you check that those two slaves have "127.0.0.1" bound to
> the loopback interface? One of the failing tests suggests that they do
> not, and I think the other tests depend upon being able to use 127.0.0.1
> to connect to themselves.

They do

[tahoe@tahoe-buildbot-centos7 root]$ ip addr show lo
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever
inet6 ::1/128 scope host

[root@tahoe-buildbot-centos7 ~]# ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.098 ms

The iptables are completely empty.
As mentioned before, all slaves run as LXC containers, but that usually
does not affect connections to 127.0.0.1.

As the user running the buildslave, if I run `python -m
SimpleHTTPServer`, I can access it via 127.0.0.1.

Any more ideas?

Cheers,

Lukas


Re: build errors on Fedora 24 and CentOS 7

2016-10-09 Thread Brian Warner
On 10/9/16 3:41 PM, Lukas Pirl wrote:
> Hi Brian and thanks for getting me started with the Tahoe buildbot
> slaves.
> 
> All new slaves seem to be doing now what they are supposed to do.

Thanks so much for setting them up!

> However, Fedora 24 and CentOS 7 complete with failing tests.
> 
> If this is or could be related to the deployment, please let me know.

Thanks.. could you check that those two slaves have "127.0.0.1" bound to
the loopback interface? One of the failing tests suggests that they do
not, and I think the other tests depend upon being able to use 127.0.0.1
to connect to themselves.

> Also, feel free to put the email address
>   ta...@lukas-pirl.de
> in the buildbot slaves' "Admin" field.

That's actually controlled from the buildslave side: edit a file named
"info/admin" in the buildslave's directory. Also edit "info/host" to put
in a description of the machine (OS version, buildbot version, that sort
of thing), which gets displayed on one of the status pages.

thanks,
 -Brian
