MPP tests

2008-06-16 Thread Giannis Galanis
During my last week at 1cc, I ran several tests on Mesh Portal Point (MPP)
configurations.

It is probably useful for everyone to have some insight into what is possible,
and what isn't, with an MPP
(especially after tomorrow's network presentation!)

For those who don't know yet, an XO acts as an MPP when it manages to
share its own internet connectivity with other XOs in the mesh.

The MPPs tested successfully shared their connectivity through the
msh0 interface.
Their own connectivity came from:
1) a GSM/USB modem through a ppp0 interface (thanks to Ankur!!)
2) a simple Wifi AP
3) a School Wifi

The common MPP use cases so far were home scenarios, or "under-the-tree"
scenarios (where at least one XO has access to an internet connection).
In the GSM modem case specifically, probably only one XO will have the
device, due to cost etc.

But it was rarely discussed whether an MPP can be useful at the school (am I
wrong?)
Probably because Schools were initially configured with Meshes, where an MPP
is probably useless
(if the MPP, the XO and the School are all in the same mesh, then the XO can
reach the School directly).

However, now that Schools are mostly connected with Access Points, an MPP can
be useful at the School as well!

Other XOs in the neighborhood can join a School Wifi via an MPP,
combining the benefits of the AP in the dense environment and of the Mesh in
the sparse environment.

***Michali*, I have a question for you:
If some XOs are many hops away (>5 and <10) from the School,
can an XO halfway act as an MPP to effectively increase the TTL?
I can see that it would probably work if the MPP had two mshX interfaces
on different channels (the second one being an active antenna)..
Is this a good way to bridge two mesh clouds?


Results:

1) GSM modem / SimpleWifi

* All XOs (including the MPP) had access to the internet
* The client-XOs showed the msh0 address of the MPP-XO as their DNS server
(in resolv.conf)
* jabber: All XOs (including the MPP) could collaborate perfectly via a jabber
server (any publicly routable jabber server should work.. I used
schoolserver.laptop.org)
* salut: I disabled gabble on all XOs with "sugar-control-panel -s jabber
foo"
   a) All XOs (including the MPP) shared *Presence information*, i.e. new XO
arrivals, new activities, who joins an activity etc.
   b) Only the client-XOs could perform actual collaboration!!
Why is this happening? I found it rather strange..
I was under the impression that Presence data was very similar to Activity
sharing data..
*Dafydd*, can you explain this? (or anyone from Collabora)

2) School Wifi.. tested with the media lab 802.11 network, which is also
connected to schoolserver.laptop.org

All the above results also held here.

What makes this case more special than simple Wifi is that the client-XOs
would be expected to share the benefits of being at the school.
However,
* The client-XOs could *not* Register
* They also could *not* resolve "schoolserver" (this is the simplest way to
tell whether the School "services" are accessible)
(they could ping schoolserver.laptop.org, but this is publicly routable
anyway)
* They could ping 172.18.0.1

Basically, they could reach the school server machine, but didn't treat it as
a school!

*Wad*, can you explain this behavior?



yanni




___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: #7188 HIGH Never A: Both telepathies are permanently down

2008-06-04 Thread Giannis Galanis
Sorry for the multiple tickets.

I had a problem accessing trac.


Re: New update.1 build 706

2008-05-28 Thread Giannis Galanis
This is a list of some bugs discovered tonight on 706:

5848: The mesh circle in the main view had disappeared
7121: Chat would not load
7119: USB stick was too slow to mount (1min)
7118: Letters in all sugar activities became tiny!

**5848 probably isn't linked to the new libertas


On Tue, May 27, 2008 at 10:30 PM, Build Announcer v2 <[EMAIL PROTECTED]>
wrote:

> http://pilgrim.laptop.org/~pilgrim/olpc/streams/update.1/build706
>
> Changes in build 706 from build: 705
>
> Size delta: 0.13M
>
> -kernel 2.6.22-20080312.2.olpc.f3687aa7e09fd65
> +kernel 2.6.22-20080523.1.olpc.28f4cb6e780db07
> -libertas-usb8388-firmware 2:5.110.22.p1-1.fc7
> +libertas-usb8388-firmware 2:5.110.22.p14-1.fc7
> -xorg-x11-drv-evdev 1.2.0-2norel.olpc2
> +xorg-x11-drv-evdev 1.2.0-3norel.olpc2
> -xorg-x11-server-Xorg 1.4-8.olpc2
> +xorg-x11-server-Xorg 1.4-9.olpc2
>
> --- Changes for kernel 2.6.22-20080523.1.olpc.28f4cb6e780db07 from
> 2.6.22-20080312.2.olpc.f3687aa7e09fd65 ---
>  + Patch the kernel, xorg-x11-server, and xorg-x11-drv-evdev packages
>  + Kernel source code taken from the 'stable' branch of
>  + Thanks to Blake Setlow for writing all the code and for pushing me to
>
> --- Changes for xorg-x11-drv-evdev 1.2.0-3norel.olpc2 from
> 1.2.0-2norel.olpc2 ---
>  + Patch the kernel, xorg-x11-server, and xorg-x11-drv-evdev packages
>  + Kernel source code taken from the 'stable' branch of
>  + Thanks to Blake Setlow for writing all the code and for pushing me to
>
> --- Changes for xorg-x11-server-Xorg 1.4-9.olpc2 from 1.4-8.olpc2 ---
>  + Patch the kernel, xorg-x11-server, and xorg-x11-drv-evdev packages
>  + Kernel source code taken from the 'stable' branch of
>  + Thanks to Blake Setlow for writing all the code and for pushing me to
>
> --
> This mail was automatically generated
> See http://dev.laptop.org/~rwh/announcer/update.1-pkgs.html for
> aggregate logs
> See http://dev.laptop.org/~rwh/announcer/joyride_vs_update1.html for
> a comparison


Re: New network scripts/tools for testing

2008-05-12 Thread Giannis Galanis
On Mon, May 12, 2008 at 10:58 AM, Mikus Grinbergs <[EMAIL PROTECTED]> wrote:

> Some small discrepancies in the output of the new 'olpc-netstatus':
>
> 1) I have a wired connection.  (NO wireless.)  I do not understand why,
> but for some Joyride builds, the wired connection gets assigned to 'eth0',
> and for others it gets assigned to 'eth1'.  My current build (1932) assigns
> it to 'eth1'.  The result is 'olpc-connections' and 'olpc-netstatus' have
> NOTHING to report for 'eth0' (that interface is there, but does not have an
> IPv4 address).
>

olpc-netstatus should work either way. I see that it properly detected that
eth1 is your ethernet.
It should also work if it were the other way around.

It scans all eth*, and checks which one has an IP (if both have an IP, it
will only choose one).
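
The selection rule just described can be sketched roughly as follows. This is only an illustration in Python, not the actual olpc-netstatus code; the function name `pick_active_eth` and the input format (a mapping from interface name to IPv4 address, or None) are assumptions made for the example.

```python
def pick_active_eth(addresses):
    """Return the first eth* interface that has an IPv4 address, or None.

    `addresses` maps interface names to an IPv4 string, or to None when
    the interface exists but has no address (hypothetical input format).
    """
    for name in sorted(addresses):
        if name.startswith("eth") and addresses[name]:
            return name
    return None


if __name__ == "__main__":
    # eth0 is present but unconfigured, eth1 carries the wired connection.
    print(pick_active_eth({"eth0": None, "eth1": "192.168.1.20", "msh0": "169.254.3.4"}))
```

When both eth0 and eth1 have an address, this sketch simply returns the first one, which mirrors the "it will only choose one" limitation mentioned above.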

Oh, btw, *I think* eth1 shows up as the wireless when you upgrade the build
with the eth/usb adapter plugged in (not 100% sure).

As for olpc-connections, this is a bug.
It is not smart enough to determine whether eth0 or eth1 is active.
I will make sure this is fixed before I put it in the build.



>
> 2) My connection goes through a proxy.  The result is that
> 'olpc-connections' and 'olpc-netstatus' show the Proxy-system IP, where they
> claim to be showing the Jabber-system IP.
>

This I don't know how to fix.
Perhaps there is nothing I can do. I will have to ask at 1cc.


>
> 3) (For "nameserver"?) 'olpc-netstatus' refers to /root/test.  My system
> has no such file.
>

Oh, this is a terrible mistake!
I was testing with a sample resolv.conf file and I forgot about it.
I have now updated it properly on the wiki. Thanks a lot!


Thanks a lot for the feedback!


New network scripts/tools for testing

2008-05-09 Thread Giannis Galanis
For the past couple of weeks I have been developing several network
testing scripts
that make testing a more pleasant experience!

The scripts collect and display information about
the network configuration, the telepathies and their status, the neighbor XOs
and the forwarding tables.

For details have a look at
http://wiki.laptop.org/go/Network_Resources
(I created this page to group general Network info, including important Wiki
pages, scripts, bugs etc... but now the scripts take up 90% of the page!)

A short overview:

*olpc-connections*: Tracks any change in msh0/eth0, DNS, Telepathy status,
jabber, and the number of XOs in the neighborhood.

*olpc-xos* *-avahi*: Displays the XOs currently seen by Avahi. You may also
run it continuously with *-c*. It will then keep scanning for changes,
and display the list with a timestamp whenever a change is identified.

*olpc-xos -sugar*: The same as above, but for the sugar XOs. It works with
salut and gabble.
(Note that sometimes avahi and salut show different XOs)

** When running a test that involves collaboration, it is very useful to
have the above scripts run at boot and log to a file.
All changes are timestamped, so you can track down bugs much more easily.
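
To illustrate the kind of timestamped change tracking meant here, a minimal Python sketch follows. It is not the code of olpc-xos or olpc-connections; the function names and the "+"/"-" log line format are assumptions chosen for the example.

```python
import time


def diff_neighbors(previous, current):
    """Return (arrived, departed) as sorted lists of XO names."""
    prev, cur = set(previous), set(current)
    return sorted(cur - prev), sorted(prev - cur)


def format_changes(previous, current, now=None):
    """Build one timestamped log line per XO that arrived or departed.

    `now` is seconds since the epoch; None means the current time.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    arrived, departed = diff_neighbors(previous, current)
    lines = ["%s + %s" % (stamp, name) for name in arrived]
    lines += ["%s - %s" % (stamp, name) for name in departed]
    return lines


if __name__ == "__main__":
    # Two consecutive scans of the neighborhood (made-up nicknames).
    for line in format_changes(["xo-a", "xo-b"], ["xo-b", "xo-c"]):
        print(line)
```

Nothing is logged when two consecutive scans agree, so the log file only grows when the neighborhood actually changes.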

*olpc-mesh*: Collects data from the firmware ioctls, and displays the
complete forwarding table in a readable manner.
It can also replace each MAC address with the corresponding nickname if you
have a mac-nick table.
You can create this table from the neighbor XOs with *olpc-xos -mac*, which
was written for this purpose.
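
The MAC-to-nickname substitution could look roughly like the Python sketch below. The table format (one "MAC nickname" pair per line) is an assumption for illustration; the real output of olpc-xos -mac may differ, and `load_mac_nick_table` and `label` are hypothetical helper names.

```python
def load_mac_nick_table(text):
    """Parse lines of the form '<mac> <nick>' into a dict (assumed format)."""
    table = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            table[parts[0].lower()] = parts[1].strip()
    return table


def label(mac, table):
    """Return the nickname for a MAC address, or the MAC itself if unknown."""
    return table.get(mac.lower(), mac)


if __name__ == "__main__":
    # Made-up addresses and nicknames.
    table = load_mac_nick_table("00:17:C4:0C:F0:01 yanni\n00:17:c4:0c:f0:02 ricardo\n")
    for mac in ["00:17:c4:0c:f0:01", "00:17:c4:0c:f0:99"]:
        print(label(mac, table))
```

Unknown addresses are left untouched, so a partial table still gives a readable forwarding table.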

*sugar-xos*: (This was written by Guillaume in Python) Displays a list
of the sugar XOs.
It is separate so that it can be used as a library by olpc-xos,
olpc-connections and olpc-netstatus.
It is split from olpc-xos because the latter does more
processing (it tracks changes) and also works for Avahi,
*and* because the former is written in Python, which I know very little
of!
So I used the output of the first as input to the second.. (perhaps I will
clean this up in the future)

*sugar-telepathies*: (Thanks Daf for your help!) Lists the
presenceservice Telepathies, and is used as a library
by olpc-connections and olpc-netstatus.



Also, I updated olpc-netstatus and olpc-netlog:

*olpc-netstatus*: (Older versions have been in our builds for several months
now.)
This tool collects network info and other info such as build, libertas
etc.
It determines which configuration you are connected to (Simple Wifi, School
Wifi, Simple Mesh, School Mesh),
checks which Telepathy is currently active, and whether there is a connection
to Jabber.
New stuff:
* checks if a school is present
* reads the Telepathies from D-Bus (not the ps list) and displays their status
* shows uptime
* shows the number of XOs connected

*olpc-log*: (previously named olpc-netlog, and also present in
our builds for a long time now)
It gathers all possible logs (messages, activities, dmesg, etc.) and stores
them in a tarball named by S/N and timestamp.
It also collects several files and the output of network commands like
i[f|w]config, route, olpc-netstatus etc.
For the complete list of what is logged, check olpc-log --help (olpc-*net*log
--help for older versions)
New stuff:
* includes config file
* includes progress bar (it can sometimes take as long as 1min)

With the help of Michael, the scripts will shortly be available in the next
joyride.

I would also highly recommend that olpc-connections log by default
at startup (when debug logs are enabled).

Waiting for any recommendations/feedback

yanni


If you need to customize an XO automatically, you should use the Action usb stick

2008-04-29 Thread Giannis Galanis
The action USB key lets you customize an XO's NAND image
automatically.
It is also possible to set certain functions to run at every boot.

It is a very useful tool for quickly customizing many XOs, collecting
information onto the USB stick, transferring files to the XO etc.

The required files can be found here:
http://wiki.laptop.org/images/a/a9/Actionkey.tar.gz

To prepare the key:
1) Edit the "action" script with the commands to be executed once. This could
involve copying files or collecting data.
2) Edit the "rc.tweaks" script with the commands to be executed at every boot.
This could involve setting variables, running testing tools etc.
3) Copy all files (boot dir, bzimage, action, rc.tweaks) to the root, along
with any other files you need for your customization.

To install the key:
1) Turn the XO off
2) Boot the XO with the USB stick plugged in, without holding any game key
3) Wait until you see "no job control in the shell", and you are done!
4) To turn off, type "exit" or simply hold the power button. Remove the stick
before you boot again


Notes:
* The key only works on activated machines with developer keys
* The path to the USB stick is /mnt/usb
* Don't erase the export PATH=.. line from the action key
* If you copy files back to the USB stick, make sure you type sync or exit
before removing the stick
* If you want to install an rpm, you should first copy it locally with the
action script. Then set it to install with the rc.tweaks script, and clear
the instruction so it doesn't install again the next time.
* When booting with the action script, the wireless firmware is not loaded
* Thank mstone for creating this amazing tool!



Re: list of laptops connected to jabber

2008-04-22 Thread Giannis Galanis
Thanks,

that's exactly what I needed.

On Tue, Apr 22, 2008 at 5:56 AM, Guillaume Desmottes <
[EMAIL PROTECTED]> wrote:

> Le lundi 21 avril 2008 à 10:05 +0200, Guillaume Desmottes a écrit :
> > Le samedi 19 avril 2008 à 22:18 +0300, Giannis Galanis a écrit :
> > > In the testbed in peabody, the list of peers seen from the server is
> > > usually a superset of what we see from each individual mesh view.
> > >
> >
> > Could be related to: #6883 #6884 #6888
> >
> > > It would be very useful if we could get the list shown by the analyze
> > > activity, but from the console.
> > >
> > > I believe it would be easy to do that from the telepathy-gabble logs. For
> > > every new "arrival" or "departure" there must be a specific entry.
> > >
> >
> > Actually there are different levels for this:
> > - Contacts known as online by Gabble
> > - OLPC Buddy in the PS
> > - Buddy displayed by sugar in the mesh view
> >
> > And of course, bugs can occur in each level.
> > telepathy-gabble.log gives us enough information to track the first
> > level (but not easy to read as log can be a mess). presence-service.log
> > for the second. And currently the only way to check 3 is to manually
> > count buddies.
> >
> > I agree with you, more helper would be welcome to debug these kinds of
> > bugs.
>
>
> I wrote a simple script listing all the buddies known by the PS. See
> https://dev.laptop.org/ticket/6918
>
> Maybe we should ship it with images?
>
>
>G.
>
>


Re: list of laptops connected to jabber

2008-04-19 Thread Giannis Galanis
In the testbed in peabody, the list of peers seen from the server is
usually a superset of what we see in each individual mesh view.

It would be very useful if we could get the list shown by the Analyze
activity, but from the console.

I believe it would be easy to do that from the telepathy-gabble logs. For
every new "arrival" or "departure" there must be a specific entry.



On 4/19/08, Dafydd Harries <[EMAIL PROTECTED]> wrote:
> Ar 18/04/2008 am 23:25, ysgrifennodd Giannis Galanis:
> > When connecting to a jabber server, how can we check the list of XOs that
> > are seen in the mesh view, or the analyze activity?
> >
> > Is checking the gabble log the only way?
> >
> > What records in the log indicate arrival or departure?
> >
> > When testing with 50 or 100 XOs connected it is often impractical to
> detect
> > missing icons, and a commandline tool would be of more help.
>
> There is a presence service monitor tool; I think it's included in the
> Analyze
> activity.
>
> If you log into the Jabber server, you can just use ejabberdctl to list the
> connected users.
>
> --
> Dafydd


list of laptops connected to jabber

2008-04-18 Thread Giannis Galanis
When connecting to a jabber server, how can we check the list of XOs that
are seen in the mesh view, or the analyze activity?

Is checking the gabble log the only way?

What records in the log indicate arrival or departure?

When testing with 50 or 100 XOs connected it is often impractical to detect
missing icons, and a commandline tool would be of more help.


Re: Tickets for mesh problems

2008-04-18 Thread Giannis Galanis
We have these network UI bugs:

5904 - GUI problem updating buddies clustered around shared activity
5459 - second circle in sugar home view provides false information


> 5908 - Laptop unable to connect to schoolserver jabber server


Also 4193 - Two XOs were connected to an access point and were still running
salut
is a dup of 5908


Re: New update.1 build 699

2008-03-13 Thread Giannis Galanis
Ok.

Should we test automatic suspend by removing the file?
Or do we no longer consider it a priority?


On Fri, Mar 14, 2008 at 12:05 AM, Chris Ball <[EMAIL PROTECTED]> wrote:

> Hi,
>
>   > what does inhibit-idle-suspend do?
>
> It allows you to disable automatic idle suspend while keeping enabled
> the explicit suspend on power button press or lid close.  Previously,
> there was only the inhibit-suspend file that inhibits both of the above.
>
> - Chris.
> --
> Chris Ball   <[EMAIL PROTECTED]>
>


Re: New update.1 build 699

2008-03-13 Thread Giannis Galanis
Chris,

what does inhibit-idle-suspend do?

On Wed, Mar 12, 2008 at 11:45 PM, Build Announcer v2 <[EMAIL PROTECTED]>
wrote:

> http://pilgrim.laptop.org/~pilgrim/olpc/streams/update.1/build699
>
> Changes in build 699 from build: 698
>
> Size delta: -1.31M
>
> -kernel 2.6.22-20080304.1.olpc.914fce4d9a8baf3
> +kernel 2.6.22-20080312.2.olpc.f3687aa7e09fd65
> -ohm 0.1.1-6.10.20080119git.fc7
> +ohm 0.1.1-6.11.20080119git.fc7
> -sugar 0.75.13-1.olpc2
> +sugar 0.75.14-1.olpc2
> -sugar-presence-service 0.75.1-1.olpc2
> +sugar-presence-service 0.75.2-1.olpc2
> -Read 44
> -Chat 35
> -Web 86
> -Write 55
> -Record 53
> -Paint 19
>
> --- Changes for ohm 0.1.1-6.11.20080119git.fc7 from
> 0.1.1-6.10.20080119git.fc7 ---
>  + "touch /etc/ohm/inhibit-idle-suspend" to allow sleep without idle.
>
> --- Changes for sugar 0.75.14-1.olpc2 from 0.75.13-1.olpc2 ---
>  + Fix #6671 #5933 #6405
>
> --
> This mail was automatically generated
> See http://dev.laptop.org/~rwh/announcer/update.1-pkgs.html for
> aggregate logs
> See http://dev.laptop.org/~rwh/announcer/joyride_vs_update1.html for
> a comparison


Re: Preparing the XOs for next week's test

2008-02-24 Thread Giannis Galanis
Kim,

The suspend/resume problems will show up even with 2 XOs. This cannot be fixed
at the moment.

As Michalis mentioned in another email, testing S/R and mesh scalability
together will just break the test. We have to test them individually.

On Sun, Feb 24, 2008 at 10:35 PM, Kim Quirk <[EMAIL PROTECTED]> wrote:

> Right. But the suspend and resume problems we've seen with the mesh and
> sharing can be recreated on a relatively small number of laptops (<10). So
> we will either fix the problems, or turn off suspend in order to test for
> scaling issues above 50. We have >50 MPs for next weeks testing.
>
> So we should be ok.
>
> Kim
>
>
>
> On Sun, Feb 24, 2008 at 10:12 PM, John Gilmore <[EMAIL PROTECTED]> wrote:
>
> > > Ricardo, if you think there is anything else different with B4s in
> > regards
> > > to network performance, please tell us. I'm not aware of anything in
> > > hardware.
> >
> > They don't suspend.  So if MP's have networking trouble that happens
> > when
> > a laptop suspends, the trouble won't happen on a B4.
> >
> >John
> >
>
>


Re: Preparing the XOs for next week's test

2008-02-24 Thread Giannis Galanis
Kim,

You can see the diffs here:
http://dev.laptop.org/~rwh/announcer/joyride_vs_update1.html
You will see that there is plenty of new stuff in joyride beyond just a
telepathy-salut update.

One thing we can do is use 693 and install the specific package, or make a
new Update.1 build that includes it. But we should test it individually
before putting it in the update.1 build.

Another thing we can do is have half the XOs with 1721 on one channel,
and the other half with 693 on another channel.

Also, how about using B4s? Is there any effect on performance other than
suspend/resume?
I remember Ricardo saying there were hardware changes related to 4470.
Ricardo, can you confirm this?

If we decide on the build by tonight, I can have all the XOs updated and
ready by tomorrow.


On Sat, Feb 23, 2008 at 5:08 PM, Kim Quirk <[EMAIL PROTECTED]> wrote:

> Agreed that Read sharing is the highest priority application.
>
> My concern is if there are a lot of other things in joyride, then it will
> take us a long time to get a release out based on joyride.
>
> If we pull the fix for Read back into update.1, (and other things that we
> find next week), then we won't waste time on testing or finding bugs in
> joyride.
>
> Does anyone have a good feel for the differences between today's Update.1
> and joyride 1721 -- or can someone list the diffs so we can make a decision?
>
> Kim
>
>
>
> On Sat, Feb 23, 2008 at 8:13 AM, Walter Bender <[EMAIL PROTECTED]> wrote:
>
> > Read sharing is a critical feature. Please do test it.
> >
> > -walter
> >
> >
> > On 2/23/08, Morgan Collett <[EMAIL PROTECTED]> wrote:
> > > Giannis Galanis wrote:
> > >  > 2. I will try to update all of them with the build we will agree to
> > >  > initially test with. This would be 693/D13?
> > >  > There is a new version of telepathy-salut in 1721, which apparently
> > only
> > >  > fixes smth related to stream tube flush(which i dont know what it
> > is). I
> > >  > dont believe it important to our test. Other than that Update.1 i
> > think
> > >  > should be ok.
> > >
> > >
> > > As I said in reply to Chris's mail, the salut fix is for Read in
> > #6483.
> > >  If you are going to test sharing PDFs in Read, please use
> > Joyride-1721
> > >  otherwise there is a high chance it won't work at all under any
> > conditions.
> > >
> > >  Morgan
> > >
> >
> >
> > --
> > Walter Bender
> > One Laptop per Child
> > http://laptop.org
> >
>
>


Preparing the XOs for next week's test

2008-02-22 Thread Giannis Galanis
A couple of things for next week's test.

1. We have about 45 XOs in the conference room, and we can bring it up to 80
by collecting other XOs in the office. Do you think this is enough?

2. I will try to update all of them with the build we agree to
test with initially. Would this be 693/D13?
There is a new version of telepathy-salut in 1721, which apparently only
fixes something related to stream tube flush (I don't know what that is). I
don't believe it is important for our test. Other than that, I think Update.1
should be ok.

3. I can also disable suspend/resume on all of them in case we decide we
don't want to have it enabled. Doing it on Monday will save a lot of time.

If there is anything you think might be useful to prepare in advance, let me
know!


Re: Suspended time vs Resumed time in an idle XO

2008-02-22 Thread Giannis Galanis
It is possible indeed. We have to check.

I will try to test how it works with avahi.

On Fri, Feb 22, 2008 at 3:25 PM, Ricardo Carrano <[EMAIL PROTECTED]>
wrote:

> Yanni,
> But we should note that, not everything that expires, does so because of a
> timer.
> A cache entry may have an associated timestamp and expire in timestamp +
> ttl.
>
>
> I have noticed that an idle machine will resume for some time, and suspend
> > again, several times for no reason.
> > The result is that the suspended time extends the timeout. The timeout
> > does not expire relative to the absolute time, but the time the CPU is
> > alive.
> > So if a 10min timeout is interrupted by a 2min suspend, the timeout will
> > expire 12min after the point it was executed.
> > Scott, does this agree with what you expected?
> >
>
>


Re: Suspended time vs Resumed time in an idle XO

2008-02-22 Thread Giannis Galanis
In case you need it, I am resending the script because it was blocked.

On Fri, Feb 22, 2008 at 2:55 PM, Giannis Galanis <[EMAIL PROTECTED]> wrote:

> I have noticed that an idle machine will resume for some time, and suspend
> again, several times for no reason.
>
> I wrote a simple script that checks every 1sec whether the machine is
> suspended or not.
> It gives a timeline of Suspended times and Resumed times.
>
>
> I left an XO completely idle overnight for 12h.
>
> The results were:
>
> It resumed about 80 times
> It was resuming every 1min to 10min
> The total suspended time percentage was 90%
>
> Do these numbers seem normal?
>
> Chris mentioned the other day the additional power consumed when
> resuming the XO.
> I assume that resuming/suspending on a regular basis is not very power
> efficient.
>
>
> Also, this script made it easy to examine what happens to timeouts that
> are interrupted with suspends.
>
> The result is that the suspended time extends the timeout. The timeout
> does not expire relative to the absolute time, but the time the CPU is
> alive.
> So if a 10min timeout is interrupted by a 2min suspend, the timeout will
> expire 12min after the point it was executed.
> Scott, does this agree with what you expected?
>


suspendtime
Description: Binary data


Suspended time vs Resumed time in an idle XO

2008-02-22 Thread Giannis Galanis
I have noticed that an idle machine will resume for some time, and suspend
again, several times for no reason.

I wrote a simple script that checks every 1sec whether the machine is
suspended or not.
It gives a timeline of Suspended times and Resumed times.
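
One way such a timeline could be produced is sketched below in Python. This is an assumption, not the attached suspendtime.sh, which may work differently: the idea is to sample the wall clock about once a second and treat any large gap between consecutive samples as a suspended period. The names `suspended_intervals` and `suspended_fraction`, and the `slack` tolerance, are made up for the example.

```python
def suspended_intervals(samples, period=1.0, slack=2.0):
    """Return (start, end) pairs for gaps larger than period + slack.

    `samples` is a list of wall-clock timestamps (seconds) taken roughly
    once per `period` while the machine was awake.
    """
    gaps = []
    for earlier, later in zip(samples, samples[1:]):
        if later - earlier > period + slack:
            gaps.append((earlier, later))
    return gaps


def suspended_fraction(samples, period=1.0, slack=2.0):
    """Fraction of the observed time span that was spent suspended."""
    total = samples[-1] - samples[0]
    asleep = sum(end - start
                 for start, end in suspended_intervals(samples, period, slack))
    return asleep / total if total else 0.0


if __name__ == "__main__":
    # Awake for 2s, a gap of 60s, then awake again (made-up samples).
    samples = [0, 1, 2, 62, 63]
    print(suspended_intervals(samples))
    print(suspended_fraction(samples))
```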


I left an XO completely idle overnight for 12h.

The results were:

It resumed about 80 times
It was resuming every 1min to 10min
The total suspended time percentage was 90%

Do these numbers seem normal?

Chris mentioned the other day the additional power consumed when
resuming the XO.
I assume that resuming/suspending on a regular basis is not very power
efficient.


Also, this script made it easy to examine what happens to timeouts that are
interrupted with suspends.

The result is that the suspended time extends the timeout. The timeout does
not expire relative to absolute time, but relative to the time the CPU is alive.
So if a 10min timeout is interrupted by a 2min suspend, the timeout will
expire 12min after the point it was started.
Scott, does this agree with what you expected?
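
The arithmetic can be written down as a small Python sketch. It only models the behavior observed above (the timer counts awake time, not wall-clock time); `wall_clock_expiry` is a hypothetical helper, and it assumes the suspends do not overlap.

```python
def wall_clock_expiry(timeout, suspends):
    """Wall-clock minutes after which a timer counting awake time expires.

    timeout:  minutes of awake (CPU alive) time the timer needs.
    suspends: list of (start, duration) pairs in wall-clock minutes,
              measured from the moment the timer was started.
    """
    awake_left = timeout
    clock = 0.0
    for start, duration in sorted(suspends):
        awake_before_suspend = start - clock
        if awake_before_suspend >= awake_left:
            break  # the timer fires before this suspend begins
        awake_left -= awake_before_suspend
        clock = start + duration  # the timer makes no progress while suspended
    return clock + awake_left


if __name__ == "__main__":
    # The example above: a 10min timeout interrupted by one 2min suspend.
    print(wall_clock_expiry(10, [(4, 2)]))
```

A suspend that starts after the timer has already fired has no effect, and several suspends simply add up.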


suspendtime.sh
Description: Bourne shell script


Re: How XO's know XS

2008-02-20 Thread Giannis Galanis
When the XO connects to a mesh channel, it sends a specific request for a
School server.
If it receives a reply, it knows there is a School server somewhere in the
mesh. After it receives the reply, it will attempt to connect to it.

If no reply is received within a certain timeout, the XO will connect
to simple mesh.
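
As a rough illustration of that decision, here is a minimal Python sketch. It does not reflect the actual implementation; `choose_network`, its arguments and the numbers used are hypothetical, and the real timeout value is not given here.

```python
def choose_network(reply_delay, timeout):
    """Decide what an XO ends up on after probing a mesh channel.

    reply_delay: seconds until a School server reply arrived, or None if
                 no reply was ever received.
    timeout:     how long the XO waits for a reply, in seconds
                 (illustrative value only).
    """
    if reply_delay is not None and reply_delay <= timeout:
        return "school mesh"
    return "simple mesh"


if __name__ == "__main__":
    print(choose_network(0.5, 10))   # a reply arrived in time
    print(choose_network(None, 10))  # no reply at all
```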

On Wed, Feb 20, 2008 at 5:31 PM, Shikhar <[EMAIL PROTECTED]> wrote:

> I was wondering how an individual XO identifies a school server. On the
> wiki I see that '...When a laptop is activated, it is associated in some
> way (TBD) with a school server. "
> (http://wiki.laptop.org/go/XS_Server_Services#Security_and_Identity)
>
> Best
>
> Shikhar


Re: Salut and Suspend/Resume issues

2008-02-19 Thread Giannis Galanis
On Feb 19, 2008 4:10 PM, Ricardo Carrano <[EMAIL PROTECTED]> wrote:

> Yanni,
>
> >
> > Did I use it otherwise? Because of the effects of xmas tree, the timeout
> > for a failed XO until its icon is removed is 10-30min.
> >
>
> I am talking about the time it takes for an avahi entry to expire. From
> what you said, it is 10 minutes.
>

Oh ok. This is not 10min.

Avahi checks every 10min that its peers are alive.

An active entry will never expire.
A "failed" entry will naturally expire after an additional 20min (30 in total).
BUT, it can expire instantly due to the xmas tree bug (5501).


>
>
> > Ricardo, do you have answers to the questions I posted before? :
> >
>
> Let's see:
>
>
> > 1. When an XO resumes, does it send any notification via avahi that it
> > is back? Because if it doesn't, then other XOs that have cleared it from
> > their lists will never search for it.
> >
>
> I believe there is no "I am back" notification different than the normal
> way presence information is exchanged by the protocol.
>

If not, then we have a problem. The other XOs will never know it is there, so
they will never search for it. I think the "are you alive" request is
destination specific.

I will do some sniffing and find out.

>
> >
> > 2. Every XO scans the network every 10min, using multicast packets, to check
> > whether its avahi peers are alive. Do these packets include the address of the
> > peers/targets? I think they do, unless I am very confused. Couldn't we
> > wake/resume the target XO when it receives these specific packets?
> >
>
> That's the point. Mdns is multicast and the XOs, when suspended, don't
> listen to multicast frames.
>
>
The suspended XO can be set up to wake up on multicast packets. This is
technically possible, AFAIK.


Re: Salut and Suspend/Resume issues

2008-02-19 Thread Giannis Galanis
On Feb 19, 2008 2:48 PM, Ricardo Carrano <[EMAIL PROTECTED]> wrote:

> Yanni,
>
> Timeout is a value, not a range. The effects brought by the timeout may
> manifest in a period (a range).


Did I use it otherwise? Because of the effects of the xmas tree bug, the timeout
for a failed XO until its icon is removed is 10-30min.


> I believe everyone will agree that 30 minutes is a long time to wait (and
> like Polychronis added) defeat the whole idea of a presence service.
>
> But, what I want to stress is that we are dealing with different issues
> here.
>

> I don't believe this 30 minutes or the xmas tree effect is related to
> suspend/resume. Those seem like bugs somewhere in the stack of software that
> support presence, while the suspend/resume issues are clearly a side effect
> of  the multicast traffic not being "heard" by a suspended XO.
>

There are way too many issues. These bugs (30min/xmas tree) amplify the
effects of suspend/resume on the mesh.
I believe that since we have the big test week coming, everyone must be
aware of them, or else no one will interpret the results properly.

The direct suspend/resume bugs are:

1. Why the mesh view empties after a long suspend, and how this affects the
mesh view
2. Why the avahi cache is sometimes cleared after resume.


Ricardo, do you have answers to the questions I posted before?
1. When an XO resumes, does it send any notification via avahi that it is
back? Because if it doesn't, then other XOs that have cleared it from their
lists will never search for it.

2. Every XO scans the network every 10min, in multicast packets, to check
whether its avahi peers are alive. Do these packets include the address of the
peers/targets? I think they do, unless I am very confused. Couldn't we
wake/resume the target XO when it receives these specific packets?


On Feb 19, 2008 3:00 PM, Giannis Galanis <[EMAIL PROTECTED]> wrote:

> The list expires in 10min-30min.
>
> But we cant wait 30min before suspending, it is way too long.
>
>
> On Feb 19, 2008 11:37 AM, Ricardo Carrano <[EMAIL PROTECTED]>
> wrote:
>
> > Yanni,
> >
> > As I posted in the bug, I believe that you are observing the entries on
> > the avahi cache expiring.
> >
> > So, your first scenario would happen when the suspend time is longer
> > than the time it takes for all entries to expire.
> > The second scenario would happen when the suspend time is not long
> > enough to make all cached entries to go away.
>
>
> Oh i see that you mean. But, i think both cases are when the suspend time
> is longer than time to expire.
> The first is UI effect, and might have no relation to salut, but to mesh
> view in general
> The second is an avahi effect, that the avahi cache is chagned
> Both, are in long suspends
>
> >
> > And the third scenario seems related to previous reports you've made on
> > the Xmas tree effect, so not related to suspend/resume.
>
>
> The xmas tree effect appears when XOs leave connection, while others
> return.
> Suspend/resume enhances this effect dramatically, because in 1-2min
> everyone goes away, and they return at random time according to when they
> resume.
>
> In my suspend-salut tests , the xmas tree effect(although NOT related to
> suspend/resume), it affects salut alot more then the other 2 scenarios
>
> My point is that we must fix it anyway. But especially now!!
>
>
> >
> > What do you think?
>
>
> I have 2 questions that will help (me) understand alot about the
> situation:
>
> 1. When a XO resumes, does it send any notification via avahi, that it is
> back? Because if it doesnt, then other XOs that have cleared it from their
> lists, they will never search for it.
>
> 2. Every scans the network every 10min, to check whether its avahi peers
> are alive, in multicast packets. Do these packets include the address of the
> peers/targets? I think they do, unless i am very confused. Couldn't we
> awake/resume the target XO when it receives these specific packets?
>
> we need to do some sniffing
>
>
>
> >
> > On Feb 19, 2008 1:13 PM, Giannis Galanis <[EMAIL PROTECTED]> wrote:
> >
> > >
> > >
> > > On Feb 19, 2008 10:13 AM, Ricardo Carrano <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > >
> > > > I was asking whether it would help to have the wireless module wake
> > > > > us
> > > > > on multicast packets instead of only unicast.  Are you saying that
> > > > > it
> > > > > would?
> > > >
> > > >
> > > > It seems so, though it would, as John points out, make resum

Re: Salut and Suspend/Resume issues

2008-02-19 Thread Giannis Galanis
For the protocol to be healthy,

not only do you have to wake every 10min to send your request,
but you also have to be awake to receive the others' requests.

These also come every 10min, but at different offsets. That's why I believe the
only way would be to have 10off - 10on.

Still, due to bug 5501, if you miss a single request, you are prone to be
deleted right away.

So 10off-10on might not work either.

In fact, although the requests are every 10min, the icon will hold for
30min in total until it is deleted.
Bug 5501, however, will delete the entry if a new host arrives within that
timeframe.
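To make the off/on argument concrete, here is a minimal sketch that replays a suspend schedule against one peer's periodic requests. The model is an assumption, not measured behavior: the listener repeats `off` minutes suspended then `on` minutes awake, the peer sends a request every `period` minutes starting at `offset`, and a request is "heard" only if it arrives while the listener is awake. The function names are hypothetical.

```python
def heard_requests(period, off, on, offset, horizon):
    """Return (time, heard) for each peer request before `horizon` minutes.

    Assumed model: the listener starts suspended at t=0 and repeats
    `off` minutes suspended followed by `on` minutes awake.
    """
    cycle = off + on
    results = []
    t = offset
    while t < horizon:
        awake = (t % cycle) >= off  # suspended during the first `off` minutes
        results.append((t, awake))
        t += period
    return results


def max_consecutive_misses(results):
    """Longest run of requests that arrived while the listener was suspended."""
    longest = run = 0
    for _, heard in results:
        run = 0 if heard else run + 1
        longest = max(longest, run)
    return longest


if __name__ == "__main__":
    for off, on in [(10, 10), (9, 11)]:
        worst = max(
            max_consecutive_misses(heard_requests(10, off, on, offset, 240))
            for offset in range(10)
        )
        print(f"{off}off-{on}on: worst run of missed requests = {worst}")
```

Under this model neither schedule ever misses two requests in a row, but both can still miss single requests, which is exactly the case where bug 5501 could remove the entry.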



On Feb 19, 2008 1:19 PM, Benjamin M. Schwartz <[EMAIL PROTECTED]>
wrote:

> On Tue, 2008-02-19 at 13:11 -0500, Giannis Galanis wrote:
>
> >
> > The wakeup required is T minutes for every T minutes.
> > Actually you would need to be awake  for >T  minutes
> > and suspended for <T minutes.
> > So for T=10min, as in this case:
> > 9off, 11on, 9 off, 11on
> >
> > but this is not very effective in terms of suspend/resume
>
> I meant to imply that this would work only if the wireless hardware
> wakes up the system for every broadcast.
>
> --Ben
>
>
>


Re: Salut and Suspend/Resume issues

2008-02-19 Thread Giannis Galanis
On Feb 19, 2008 12:55 PM, Benjamin M. Schwartz <[EMAIL PROTECTED]>
wrote:

> On Tue, 2008-02-19 at 12:29 -0500, Giannis Galanis wrote:
> > The avahi works is that every several minutes(a predetermined timeout)
> > each host will send multicast request for all peers in its list.
> > Then all peers receiving this request will send a multicast reply.
> >
> > The packets are multicast because the mesh is mobile/dynamic so we
> > dont know where the target is, or which is the ideal route
>
> The problem is that with a timeout of T minutes and N laptops, there is
> a wakeup required every T/N minutes, on average?


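The wakeup required is T minutes for every T minutes.
Actually you would need to be awake for >T minutes
and suspended for <T minutes.
So for T=10min, as in this case:
9off, 11on, 9off, 11on

but this is not very effective in terms of suspend/resume

> Based on your
> description, it sounds as if this could be fixed by a small change in
> Avahi's timeout behavior.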
>
> If I reach the timeout, I send a broadcast saying "Everyone, what's your
> status?".  In reply, all users send a broadcast "My status is X".  All
> peers receive all of these broadcasts, and reset their timers to zero.
> In this way, all laptops wake up together once every T minutes.
>
> Surely the solution is not this simple...
>
The problem is that the others won't know YOUR status.
I think the confirmation of status is not "announced/beaconed", but
"requested" first.

But someone from Collabora must confirm this.
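For reference, the "everyone resets their timer" proposal quoted above can be sketched as a tiny simulation. This is only an illustration of the proposal, not of how avahi actually behaves: it assumes every laptop hears every broadcast (i.e. the wireless hardware wakes the system for it) and that status is announced rather than requested, which is the part that still needs confirming. The function name and inputs are hypothetical.

```python
def simulate_synced_wakeups(first_expiry, period, horizon):
    """Return the times (in minutes) at which the whole group wakes up.

    first_expiry: minutes until each laptop's timer first expires.
    Assumption: whenever any timer expires, all laptops exchange status
    broadcasts and every laptop resets its timer to a full `period`.
    """
    timers = list(first_expiry)
    now = 0
    wakeups = []
    while True:
        now += min(timers)  # the earliest timer triggers the exchange
        if now > horizon:
            break
        wakeups.append(now)
        timers = [period] * len(timers)  # everyone resets together
    return wakeups


if __name__ == "__main__":
    # Three laptops whose timers start out of phase, T = 10 minutes.
    print(simulate_synced_wakeups([3, 7, 9], period=10, horizon=35))
```

After the first exchange the timers stay in phase, so the group wakes once per period instead of once per laptop per period.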


Re: Salut and Suspend/Resume issues

2008-02-19 Thread Giannis Galanis
The list expires in 10min-30min.

But we can't wait 30min before suspending; it is way too long.


On Feb 19, 2008 11:37 AM, Ricardo Carrano <[EMAIL PROTECTED]>
wrote:

> Yanni,
>
> As I posted in the bug, I believe that you are observing the entries on
> the avahi cache expiring.
>
> So, your first scenario would happen when the suspend time is longer than
> the time it takes for all entries to expire.
> The second scenario would happen when the suspend time is not long enough
> to make all cached entries to go away.


Oh, I see what you mean. But I think both cases occur when the suspend time is
longer than the time to expire.
The first is a UI effect, and might have no relation to salut, but to the mesh
view in general.
The second is an avahi effect: the avahi cache is changed.
Both occur in long suspends.

>
> And the third scenario seems related to previous reports you've made on
> the Xmas tree effect, so not related to suspend/resume.


The xmas tree effect appears when XOs leave the connection while others return.
Suspend/resume amplifies this effect dramatically, because within 1-2min everyone
goes away, and they return at random times according to when they resume.

In my suspend-salut tests, the xmas tree effect (although NOT related to
suspend/resume) affects salut a lot more than the other 2 scenarios.

My point is that we must fix it anyway. But especially now!!


>
> What do you think?
>

I have 2 questions that will help (me) understand a lot about the situation:

1. When an XO resumes, does it send any notification via avahi that it is
back? Because if it doesn't, then other XOs that have cleared it from their
lists will never search for it.

2. Every XO scans the network every 10min, in multicast packets, to check
whether its avahi peers are alive. Do these packets include the address of the
peers/targets? I think they do, unless I am very confused. Couldn't we
wake/resume the target XO when it receives these specific packets?

We need to do some sniffing.



>
> On Feb 19, 2008 1:13 PM, Giannis Galanis <[EMAIL PROTECTED]> wrote:
>
> >
> >
> > On Feb 19, 2008 10:13 AM, Ricardo Carrano <[EMAIL PROTECTED]>
> > wrote:
> >
> > >
> > > I was asking whether it would help to have the wireless module wake us
> > > > on multicast packets instead of only unicast.  Are you saying that
> > > > it
> > > > would?
> > >
> > >
> > > It seems so, though it would, as John points out, make resumes far
> > > more constant. It seems we have to find a creative way out of this tough
> > > choice (automated suspend vs mesh) or face it.
> > >
> > >
> > > >
> > > >
> > > >   > Avahi entries will expire after some time. Suspend will prevent
> > > > it
> > > >   > to update its cache.
> > > >
> > > > Yani's bug report (#6467) suggests that Avahi entries often expire
> > > > immediately upon resume:
> > > >
> > > >   After the XO resumes (probably after beinng suspended for several
> > > >   minutes) all the icons in the mesh view vanish, except the mesh
> > > >   circles.
> > >
> > >
> > > I read this as the avahi-cache  expiring its entries.  Yanni  can  you
> > > put timeframes on this?
> > > Could check how long does it take to expiry an entry (TO) and then
> > > check if:
> > > Suspend time > TO -> all entries vanish
> > > Suspend time << TO -> no entries vanish
> > > Supens time ~ TO -> some entries vanish
> > >
> >
> > There as 2 cases where icons vanish due to suspend.
> >
> > 1st: The moment you resume(it generally happens after long suspends),
> > all icons vanish instantly(APs/XOs). This bug (#6467) suggests that sugar
> > has a problem with suspend resume.
> > The icons slowly reappear. I assume that if the avahi peer list is
> > intact that all XOs return.
> >
> > 2nd: The avahi list smtimes looses some or all of the peers at resume.
> > This is also under 6467, but it seems technicaly different. One possible
> > explanation could be that during suspend th XO resumes several times, but i
> > didnt notice it! And within this time frames it realized that the other
> > suspended XOs are gone, so it cleared its cache. Now when I resumed it
> > myself, I observed that the cache is clean!!
> >
> > Now, regarding the timeouts of avahi. This is a 3rd thing:
> > When an XO leaves the channel we have 4 states:
> >mm:ss
> > 1. 00:00  XO leave the channel(manually/or ti suspended)
> > 2. 10:00  Avahi notices teh XO left

Re: Salut and Suspend/Resume issues

2008-02-19 Thread Giannis Galanis
The way avahi works is that every several minutes (a predetermined timeout) each
host sends a multicast request for all peers in its list.
Then all peers receiving this request send a multicast reply.

The packets are multicast because the mesh is mobile/dynamic, so we don't know
where the target is, or which is the ideal route.

On Feb 19, 2008 12:11 PM, Benjamin M. Schwartz <[EMAIL PROTECTED]>
wrote:

> On Tue, 2008-02-19 at 10:02 -0500, John Watlington wrote:
> > We ALWAYS have multicast traffic.   Blindly waking on each
> > received multicast packet will ensure that we only sleep for
> > milliseconds.
>
> What is all this multicast traffic?
> If I am sitting idle on the network, why is there constant multicast
> traffic being sent to me?
>
> I would expect broadcasts for activity share notifications and "Hi, I'm
> new" announcements, but those should be infrequent.
>
> --Ben
>
> ___
> Devel mailing list
> Devel@lists.laptop.org
> http://lists.laptop.org/listinfo/devel
>


Re: Salut and Suspend/Resume issues

2008-02-19 Thread Giannis Galanis
On Feb 19, 2008 10:13 AM, Ricardo Carrano <[EMAIL PROTECTED]>
wrote:

>
> I was asking whether it would help to have the wireless module wake us
> > on multicast packets instead of only unicast.  Are you saying that it
> > would?
>
>
> It seems so, though it would, as John points out, make resumes far more
> constant. It seems we have to find a creative way out of this tough choice
> (automated suspend vs mesh) or face it.
>
>
> >
> >
> >   > Avahi entries will expire after some time. Suspend will prevent it
> >   > to update its cache.
> >
> > Yani's bug report (#6467) suggests that Avahi entries often expire
> > immediately upon resume:
> >
> >   After the XO resumes (probably after beinng suspended for several
> >   minutes) all the icons in the mesh view vanish, except the mesh
> >   circles.
>
>
> I read this as the avahi-cache  expiring its entries.  Yanni  can  you put
> timeframes on this?
> Could check how long does it take to expiry an entry (TO) and then check
> if:
> Suspend time > TO -> all entries vanish
> Suspend time << TO -> no entries vanish
> Supens time ~ TO -> some entries vanish
>

There are 2 cases where icons vanish due to suspend.

1st: The moment you resume (it generally happens after long suspends), all
icons (APs/XOs) vanish instantly. This bug (#6467) suggests that sugar has a
problem with suspend/resume.
The icons slowly reappear. I assume that if the avahi peer list is intact,
all XOs return.

2nd: The avahi list sometimes loses some or all of the peers at resume. This
is also under #6467, but it seems technically different. One possible
explanation could be that during suspend the XO resumed several times, but I
didn't notice it! And within these time frames it realized that the other
suspended XOs were gone, so it cleared its cache. Then, when I resumed it
myself, I observed that the cache was empty!!

Now, regarding the timeouts of avahi. This is a 3rd thing:
When an XO leaves the channel we have 4 states:
   mm:ss
1. 00:00  XO leaves the channel (manually, or it is suspended)
2. 10:00  Avahi notices the XO left, and reports it as "failed"
3. 30:00  Icon disappears from the mesh view
4. 60:00  Avahi cache is cleared
Additionally there is a bug (#5501) according to which, if a NEW XO arrives
between states 2 and 3, then ALL "failed" avahi peers are instantly cleared
and the corresponding icons vanish.

So, the 3rd case is the following:

Assume a mesh has e.g. 20 XOs, and I use my XO so it doesn't suspend, but the
other 19 are suspended.
If a new XO arrives after >10min, then all 19 XOs instantly vanish from
the mesh.

So the TO time is between 10 and 30min, but closer to 10min if many XOs
suspend/resume.
So if resume time << 10min everything is fine!!
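The timeline above can be written down as a small model, which makes the 19-XO example easy to replay. This is only a sketch of the behavior as I described it (10min to "failed", 30min to icon removal, and bug #5501 removing "failed" peers as soon as a new XO arrives); the constants and the function name are illustrative, not taken from avahi or sugar code.

```python
FAILED_AFTER = 10     # minutes until avahi reports the peer as "failed"
ICON_GONE_AFTER = 30  # minutes until the icon leaves the mesh view


def icon_removal_time(left_at, new_peer_arrivals=()):
    """Minute at which the icon of a peer that left at `left_at` vanishes.

    Models bug #5501 as described: a new XO arriving while the peer is
    "failed" (between states 2 and 3) removes the icon immediately.
    """
    failed_at = left_at + FAILED_AFTER
    natural = left_at + ICON_GONE_AFTER
    early = [t for t in new_peer_arrivals if failed_at <= t < natural]
    return min(early) if early else natural


if __name__ == "__main__":
    # 19 XOs suspend at minute 0; a new XO joins at minute 12.
    removals = [icon_removal_time(0, new_peer_arrivals=[12]) for _ in range(19)]
    print(removals)
```

With no new arrivals the same call falls back to the natural 30min removal, so the effective timeout ranges from 10 to 30min depending on how busy the mesh is.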



What I don't know is whether an XO, when it resumes, sends any avahi packet to
notify its presence/return. Because if it doesn't, then the XO won't exist in
the others' cache lists, so the others won't search for it.
Sjoerd, can you answer this?

This would explain why after resume some XOs take too long to see each
other again.
If you combine this with the "2nd" case, you will see that in the natural
case where users resume their XOs at random points in time, they will
all clear their caches, unless they resume concurrently.
So in the end, all will have empty caches!!




>
> >
> > Thanks,
> >
> > - Chris.
> > --
> > Chris Ball   <[EMAIL PROTECTED]>
> >
>
>


Salut and Suspend/Resume issues

2008-02-16 Thread Giannis Galanis
There are a couple of important issues/bugs regarding Salut and
Suspend/Resume.

FIRST, there is a "sugar issue" (or at least it seems so).
When an XO resumes after a long suspend, all icons (APs, XOs, but not the
meshes) instantly vanish *(#6467)*. Then they slowly reappear. Although with
the APs the situation is pretty straightforward, with the XOs we have
several cases:

   - all XOs in the mesh return almost instantly
   - all or some XOs return slowly one by one
   - nothing returns, and the avahi peer list is empty *(#6498)*

It seems that although suspend should keep the previous situation frozen, in
fact the avahi peer list is affected.


SECOND, we have a network issue, which suggests a "war" between
suspend/resume and avahi/salut.
Suspend is interrupted only by unicast packets, but salut/avahi rely
on multicast packets.

The result is that when an XO that appears in the mesh view is suspended,
avahi will treat it just as if it has left the mesh.


   - When an XO is being used (not suspended), all other suspended XOs in
   the mesh will start failing one by one
   - About 10-30min after an XO is suspended, its icon will
   vanish. *(#6282)*
   - If new XOs join the mesh within this time, then the icon will vanish
   instantly!! *(#5501)*
   - If several removed XOs gradually start to resume, their icons will
   start returning

*As you can see, the XOs have very little chance to even see each
other*

RESULT:
A mesh of several XOs will avoid icons flashing here and there ONLY if no
XO has been idle for more than 10min, which is rather unlikely.

Considering the effects of the FIRST issue, you would practically have to
restart sugar or switch channels back and forth to return to your original
status.

Salut/avahi are very sluggish in handling failed connections, and
suspend/resume amplifies this effect.


Re: New joyride build 1687

2008-02-13 Thread Giannis Galanis
Perhaps it is not important, but the changes shown from olpc-utils 0.67 to
0.68 are wrong.

They include stuff that was added in previous versions ages ago.

On Feb 13, 2008 2:24 AM, Build Announcer v2 <[EMAIL PROTECTED]> wrote:

> http://xs-dev.laptop.org/~cscott/olpc/streams/joyride/build1687
>
> Changes in build 1687 from build: 1686
>
> Size delta: 0.00M
>
> -rainbow 0.7.9-1.olpc2
> +rainbow 0.7.10-1.olpc2
> -xterm 231-1.fc7
> +xterm 232-1.fc7
> -krb5-libs 1.6.1-4.fc7
> +krb5-libs 1.6.1-6.fc7
> -olpc-utils 0.67-1.olpc2
> +olpc-utils 0.68-1.olpc2
> -openldap 2.3.34-6.fc7
> +openldap 2.3.34-7.fc7
> -Terminal 8
> +Terminal 9
> -olpcsudo 1.3-0
>
> --- Changes for rainbow 0.7.10-1.olpc2 from 0.7.9-1.olpc2 ---
>  + Symlink ~/{.macromedia,.adobe} -> ~/.instance to ease
>
> --- Changes for xterm 232-1.fc7 from 231-1.fc7 ---
>  + update to 232
>
> --- Changes for olpc-utils 0.68-1.olpc2 from 0.67-1.olpc2 ---
>  + Import olpc-netstatus 0.4 from Yanni
>  + dlo#5746: Do not try to rename msh0.
>  + dlo#5153: Fix sysfs path to rtap
>  + Use GPLv2+ license tag as nothing in this package is GPLv2-only.
>  + Make preview cleaner robust in the case of a missing datastore
>  + Do not bother running journal cleaner on fresh installations (saves
> time on first boot)
>  + Add a silly TODO list
>  + Bump revision to 0.65
>  + Import olpc-netlog-0.3 and olpc-netstatus-0.3
>  + Add 'clean-previews' and incorporate it into olpc-configure.
>  + 'become_root' script merged upstream.
>  + Update License field to GPLv2 in order to match the COPYING file.
>  + Install a simple 'become_root' script to ease dlo#5537.
>  + Rename RPMDIST to DISTVER and DISTVAR to DIST
>  + dlo#5626: Fix permissions in /home/bernie.
>  + Insert extra spacing at the top for cosmetic reasons
>  + Spacing fixes
>  + Add missing cron job for olpc-pwr-prof
>  + Power profile scripts
>  + Construct Rainbow's spool dir if it doesn't exist - #5033
>  + Ensure /security has reasonable permissions.
>  + Depend on /usr/bin/find
>  + Remove files in $OLPC_HOME before creating them.
>  + Add missing dependencies.
>  + Use /ofw/openprom/model instead of olpc-bios-sig
>  + Add more missing dependencies
>  + Remove stray reference to olpc-bios-sig.c.
>  + Pass absolute paths to rpmbuild
>  + Add back sbin dirs to unprivileged users PATH
>  + Invoke rainbow-replay-spool
>  + Remove stupid 'exit 0' in zzz_olpc.sh that makes bash *exit* rather
> than skip the scriptlet
>  + Depend on tcpdump for olpc-netcapture.
>  + Fix version replacement in spec file
>  + Merge olpc-netstatus 0.2
>  + Merge olpc-netlog 0.2
>  + Really bump revision
>  + Add a couple of new languages
>  + Add missing files
>  + Ensure correct keyboard is loaded even on first boot
>  + Don't create /root/.i18n as it makes us loose the boot time
> optimization
>  + Add code to help us improve boot time
>  + Add VMware configuration.
>  + Fix http://dev.laptop.org/ticket/5320
>  + Display motd in profile, not through /bin/login
>  + Simplyfy setxkb invocation
>  + Add ASCII art for motd (need more translations)
>  + More languages for the motd
>  + Replace fake input driver hack with proper config option.
>  + Fix http://dev.laptop.org/ticket/5114
>  + Simplify test for Geode
>  + Reindent with TABs to match other init scripts
>  + Remove check for A-test boards (the following code is harmelss)
>  + Be a little more verbose on progress.
>  + Fix https://dev.laptop.org/ticket/5217: Update library index
>  + Only run checks on start
>  + Use $OLPC_HOME consistently
>  + Only run hardware configuration on startup.
>  + Fix numeric test on empty flag file.
>  + Bump revision
>  + Add olpc-netcapture to %files
>  + Fix olpc#5195: Console font too small when using pretty boot.
>  + Bump revision
>  + Add autoconf check for PAM
>  + Update spec file
>  + Merge branch 'master' of
> ssh://[EMAIL PROTECTED]/git/projects/olpc-utils
>  + Automatically push to origin on bumprev
>  + Fix bumprev rule
>  + Bump revision
>  + Reorganize variables
>  + Fix http://dev.laptop.org/ticket/4928
>  + Fix permissions on /home/olpc
>  + Bump revision
>  + Pacify automake's portability warnings
>  + Update spec file
>  + Even more aggressive packaging automation
>  + Add script to import srpms in Fedora.
>  + Merge commit 'cscott/master'
>  + Explicitly strip NUL from mfg tags
>  + Add cvs-import.sh to EXTRADIST
>  + Fix https://dev.laptop.org/ticket/4762
>  + Bump revision
>  + Separate out configuration done to /home and /.
>  + Create /home/devkey.html, which can be used to request a developer key.
>  + Automate the release process a bit more.
>  + Approximate XOs DPI on emulators.
>  + ReTAB.
>  + Automate specfile generation some more
>  + Ignore a few more generated files.
>  + Set i18n settings from the new manufacturing data tags
>  + Go back to starting sugar with /usr/bin/sugar.
>  + Bump revision
>  + Add bumprev rule
>  + Merge branch 'master' of
> ssh:/

Re: [Server-devel] Mesh Portal Question

2008-02-12 Thread Giannis Galanis
I believe that in the blinding table of XO-1 you have to include the anycast
address C027C027C027.
I think the last digits of the address are custom, but I am not sure.

Still, in your case, you should only blind XO-2 to XO-1, and don't forget to
invert the blinding table.

On Feb 11, 2008 11:41 AM, John Watlington <[EMAIL PROTECTED]> wrote:

>
> Waqas,
>Are you explicitly blinding the laptops to force that network
> configuration ?
>
> Can XO-2 talk to XO-1 fine ?   Can XO-1 talk to the server ?
>
> We do this regularly --- it has been tested and works.
>
> John
>
> On Feb 9, 2008, at 6:45 AM, Waqas Toor wrote:
>
> > Hello All,
> >
> > He is my scenario,
> > XS < XO-1 <--- XO-2
> >
> > I am unable to access school server from XO-2 via XO-1 route, I have
> > 656 build on my XOs and server build 150 on my server with 1 active
> > antennae
> >
> > what could be the problem, how to access XS from different hops of XOs
> > as the automatic configuration didn't create route to the server
> >
> > Regards
> >
> > --
> > Waqas Toor
> > member olpc Pakistan team
> > ___
> > Server-devel mailing list
> > [EMAIL PROTECTED]
> > http://lists.laptop.org/listinfo/server-devel
>
> ___
> Devel mailing list
> Devel@lists.laptop.org
> http://lists.laptop.org/listinfo/devel
>


Re: jabber for non-wireless XO ?

2008-01-31 Thread Giannis Galanis
Can you please specify:
which jabber server you tried to connect to?
which build you are running?

On Jan 20, 2008 7:42 PM, Mikus Grinbergs <[EMAIL PROTECTED]> wrote:

> I don't have any wireless.  I do have a wired ethernet connection to
> a LAN (which in turn uses a proxy to reach the internet).
>
> Even when I specify in sugar-control-panel the name of a real
> server, my XO is not accessing jabber (the field in olpc-netstatus
> is shown blank).  I believe my proxy can correctly pass requests for
> ports 5222-5223.  Does telepathy work with a wired ethernet?  Does
> it have a problem if the connection is through a proxy?
>
> mikus
>
> ___
> Devel mailing list
> Devel@lists.laptop.org
> http://lists.laptop.org/listinfo/devel
>


Re: Salut/avahi/meshview issues

2008-01-31 Thread Giannis Galanis
On Jan 31, 2008 10:54 AM, Ricardo Carrano <[EMAIL PROTECTED]>
wrote:

>
> I believe our current salut/avahi issues are described in the following
> > points:
> >
> > 1. I was under the impression that when a peer switches channels it
> > sends a "goodbye signal". And in fact only anorthodoxically removed
> > peers(after crashes/poweroffs by pressing the button etc) would delay to
> > disappear from mesh views.  The 10min TTL is not unreasonable, but it should
> > only be used for a routine check. In fact peers that leave/arrive should
> > inform the mesh instantly. In that case the 10min TLL will only affect only
> > the mesh points with noisy links that their "goodbye" signals will get lost.
> > And these connections are less priority anyway. Also we could send 2/3
> > "goodbye" signals to "ensure" delivery.
>
>
> Mm, it seems that some dbus signal or the respective processing by the PS
> lacks. Is there a NM dbus signal when we change channels? This should be
> easy to determine.
>
> >
It must be very easy for the PS to detect a channel change, or in any case when
the XO leaves the channel. The point is whether avahi supports such
notifications, so the other peers can instantly remove the entry.


>
> > 2. We should definitely decrease the timeout window between a lost peer
> > being detected, and the actual disappearance from the mesh view. This used
> > to be 10min, now it is 20min, but really, to my experience, if a peer is for
> > more than 1-2min away he aint coming back.
>
>
> For what you describe this does not seem related to the protocol itself,
> right? I believe it is important to achieve our goals without making the
> protocol more chatty.
>
> >
This timeout is client-specific, and doesn't affect the protocol itself at
all. The reason this timeout exists (to my knowledge anyway) is that
sometimes a peer seems undiscoverable, but in fact this is just the effect of a
poor link, so the peer rejoins shortly after. The effect would be that XOs would
move around the mesh view. To solve this issue, we wait for several minutes
before actually removing the XO.
In my opinion, the more we hide from the user, the more she gets confused.
Keeping the icon in the mesh view while the connection is down just messes
things up.
I also remember that there was the idea of keeping the "lost" icon in the
mesh view, but notifying the user somehow, e.g. by changing its outline to a
dotted line or something. But this is a UI issue.
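As a rough illustration of the "lost icon" idea, the client-side decision could look like the sketch below. Everything in it is an assumption for discussion: the function name, the three state names, and the 2-minute default (taken from my remark that a peer away for more than 1-2min rarely comes back) are not from the sugar or presence-service code.

```python
def icon_state(seconds_unreachable, remove_after=120.0):
    """Decide how the mesh view could draw a peer's icon.

    seconds_unreachable: how long the peer has failed to answer (0 if alive).
    remove_after: hypothetical client-side grace period in seconds.
    """
    if seconds_unreachable <= 0:
        return "active"   # normal icon
    if seconds_unreachable < remove_after:
        return "lost"     # e.g. dotted outline, still shown
    return "removed"      # icon taken out of the mesh view


if __name__ == "__main__":
    for secs in (0, 45, 300):
        print(secs, icon_state(secs))
```

Because the grace period is just a parameter on the client, it could be tuned per XO without touching the protocol.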


>
> >
> > 3. Should we make the above TTL and timeout to be user specific, or
> > custom anyway?. Will there be a problem if two XOs have different TTL? I
> > would assume that it wont. The idea is that it is a waste of our resources
> > to try to calculate the ideal values of TTL and timeout by asking the
> > collabora team to fix, and fix again. Whereas we can make the test here in
> > 1cc, and find ourselves which suits as best. Is it easy to implement such a
> > patch?
>
>
> I believe it  is useful to have  some controls  in order to help  tuning
> things up.  But not all of them need to be translated in user friendly
> controls. I believe your question would be how we could change this setting
> ourselves. Did I get it right?
>
Exactly. By no means do we need to have these controls user-friendly. We only
need the ability to tune them dynamically ourselves for testing and
evaluation purposes.


Re: Salut/avahi/meshview issues

2008-01-31 Thread Giannis Galanis
> > 2. It takes up to 10min for avahi even to detect the inactivity of a
> peer.
> > i.e. If an XOs switches channels, for up to 10min avahi wont even
> know(it
> > used to be 1-2min).
>
> Is this with or without the patch from bug #6162 ? If without, then the
> time it
> takes avahi to discover it should still be 2 mintues. I'd like to how you
> test
> this. Oh and please file a bug, so we can actually track these issues.
>

The patch for 6162, as well as the patch for 5501, are included in the 689/690
builds that I am testing. So this indeed explains the 10 minutes (actually, I
just found out about this bug).


> > 3. It will take a total of about 30min for the XO to vanish from the mesh
> > view (this is too long!)
>
> Again, file a bug. Needed info here is if there is a time difference between
> when avahi marks something as removed, when salut sends out the removed
> signal and when it actually disappears from the mesh view.
>
>

This is now filed as 6282, with all dbusmonitor/avahibrowse logs to
compare.
This case is also an example of an avahi/mesh view inconsistency.
Icons disappear from the mesh view but remain for about 1h longer in the
avahi cache.
But these details should continue in trac anyway.


>
> > 4. Avahi/mesh view respond independently.
> > The situation used to be that when an entry disappeared in avahi, it
> > disappeared in mesh view, and the same when new peers arrived.
> > This relation was very consistent.
> > However, now we have the following cases:
> > a) an XO will vanish from the mesh view, but remain "indefinitely" in the
> > avahi cache as "failed to resolve"
> > b) sometimes avahi shows a lot fewer peers than the mesh view. The extra
> > peers in the mesh view are definitely active since they properly respond
> > to activity joining/sharing.
> > c) sometimes avahi includes more active peers than the mesh view.
> > Does anyone know why this is happening?
> > Is it a bug?
> > I have logs, if needed, that compare avahi-browse with timestamped
> > dbus-monitor logs, that indicate the inconsistencies.
>
> Well you all list them as undesired behaviour, so i would say they're bugs.
>
>


> > 5. An important improvement is that peers will not generally fail a lot on
> > their own.
> > So, if many XOs join a mesh channel, and no one goes away, they will not
> > start failing. This used to be a common effect after 4-5 XOs. However, i
> > noticed once in 1cc, 61 active XOs in the mesh view!
>
> When you say salut, you actually mean avahi. It would help if you could be
> clear on what you mean :) This improvement is probably caused by the fix in
> #5501.
>
>

I mean avahi indeed. In the past these two were very tightly coupled.
And I believe that the only direct way to examine salut is by checking the
buddy list in the Analyze activity.
I remember Ricardo had an interesting case where the buddy list included
plenty of XOs, which were also properly sharing in the mesh view, but the
avahi list was empty. Does this seem possible? (unfortunately no log at the
moment)


>
> Anyway for all the bugs you should have filed instead of sending this mail, i
> will need tcpdump logs, avahi logs, salut logs and if possible meshview logs
> indicating when contacts are removed from the mesh from a machine where you
> see the behaviour. Preferably with timestamps


I updated the trac with logs/tcpdumps/dbusmon/screenshots...enjoy!

The reason I sent this email first, before filing tons of bugs, is because I
thought it was necessary to describe the big picture, and the current status
of salut. And also to avoid duplicate bugs, or bugs that are in fact
intentional mods.

This conversation was unfortunately directed towards other issues (wireless
difficulties are a sensitive subject at olpc!), but in fact its purpose was
to determine some very specific bugs in salut, that have nothing to do "at
this point" with scalability or robustness of the protocol.  When these are
resolved, we can proceed with scalability, about which I am very confident.

I believe our current salut/avahi issues are described in the following
points:

1. I was under the impression that when a peer switches channels it sends a
"goodbye signal", and that only peers removed in an unorthodox way (after
crashes, poweroffs by pressing the button, etc.) would be slow to disappear
from mesh views.  The 10min TTL is not unreasonable, but it should only be
used for a routine check. Peers that leave/arrive should inform the mesh
instantly. In that case the 10min TTL will only affect the mesh points
with noisy links, whose "goodbye" signals get lost. And these
connections are lower priority anyway. Also we could send 2-3 "goodbye"
signals to "ensure" delivery.

2. We should definitely decrease the timeout window between a lost peer
being detected and its actual disappearance from the mesh view. This used
to be 10min, now it is 20min, but really, in my experience, if a peer is
away for more than 1-2min it isn't coming back.

3. Should we

Salut/avahi/meshview issues

2008-01-29 Thread Giannis Galanis
I understand that salut is not very popular lately since we are drifting
mostly towards infra mode.
Still, it is the preferable way for G1G1 laptops to talk to each other,
since there is no SS, and the public jabber is not guaranteed, or may in the
future be overcrowded.

I have conducted several tests with a group of 9 XOs blinded with each
other.
The most important issue is the response of the mesh view when an XO
leaves the mesh.

The results were:

1.  The xmas tree effect is still here.
i.e. XOs occasionally vanish/reappear in different positions.
This is because of the following:
When the avahi cache includes several inactive/departed (reported as failed)
peers,
and a new peer arrives,
then all the inactive peers vanish from the screen instantly. (#5501)
If their inactivity was temporary, then they will reappear shortly in a
different location.
If e.g. 3-4 XOs are (by user intervention) moved simultaneously from ch6
to ch11, and then back to ch6, the icons won't have the time to disappear.
BUT, the first to return to ch6 will cause the effect/bug to the others,
which will instantly vanish. Shortly after they will naturally all return
1by1 to ch6 and will reappear in different locations.
There was a patch for this issue (5501), which was included in 678+, but it
has no effect.

2. It takes up to 10min for avahi even to detect the inactivity of a peer.
i.e. if an XO switches channels, for up to 10min avahi won't even know (it
used to be 1-2min).

3. It will take a total of about 30min for the XO to vanish from the mesh
view (this is too long!)

4. Avahi/mesh view respond independently.
The situation used to be that when an entry disappeared in avahi, it
disappeared in mesh view, and the same when new peers arrived.
This relation was very consistent.
However, now we have the following cases:
a) an XO will vanish from the mesh view, but remain "indefinitely" in the
avahi cache as "failed to resolve"
b) sometimes avahi shows a lot fewer peers than the mesh view. The extra peers
in the mesh view are definitely active since they properly respond to
activity joining/sharing.
c) sometimes avahi includes more active peers than the mesh view.
Does anyone know why this is happening?
Is it a bug?
I have logs, if needed, that compare avahi-browse with timestamped
dbus-monitor logs, that indicate the inconsistencies.

5. An important improvement is that peers will not generally fail a lot on
their own.
So, if many XOs join a mesh channel, and no one goes away, they will not start
failing. This used to be a common effect after 4-5 XOs. However, I noticed
once in 1cc, 61 active XOs in the mesh view! This shows that salut is more
capable than we expected.
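
The three cases in point 4 amount to a set comparison between the two peer
lists. A minimal sketch in Python, assuming the peer names have already been
extracted from the avahi-browse output and from the mesh view (dbus-monitor)
logs; the function name and the sample names are hypothetical and are not
taken from the real logs:

```python
def compare_peer_lists(avahi_peers, meshview_peers):
    """Return the peers that appear in only one of the two lists."""
    avahi = set(avahi_peers)
    mesh = set(meshview_peers)
    return {
        "only_in_avahi": sorted(avahi - mesh),     # cases (a) and (c)
        "only_in_meshview": sorted(mesh - avahi),  # case (b)
        "in_both": sorted(avahi & mesh),
    }


# Hypothetical sample data, for illustration only.
report = compare_peer_lists(["xo1", "xo2", "xo3"], ["xo2", "xo3", "xo4"])
print(report)
```

Running this at each timestamp of the two logs would show when the lists
diverge and in which direction.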


Re: flash usb drives not on journal

2008-01-29 Thread Giannis Galanis
Try to remove the olpc.store from the drive.

I remember there used to be a bug about that, but I can't find it.


2008/1/29 Ricardo Carrano <[EMAIL PROTECTED]>:

> I apologize if this is intended and I missed  the news, but usb  sticks
> are not displaying in the journal anymore (joyride 1608).
>
>
>
>


Re: Testing the Wireless driver changes

2008-01-17 Thread Giannis Galanis
I see.
But I hope this is not done only because of the airline issue, and that
there are other reasons why it is useful to boot with the firmware unloaded.

Because as far as the airline issue is concerned, we should not take it too
seriously. As long as we have a working solution it is fine. Not many people
will use it anyway.
You see my point?

Also, if it is something simple we can quickly implement it for Update.1.


On Jan 17, 2008 7:36 PM, David Woodhouse <[EMAIL PROTECTED]> wrote:

>
> On Thu, 2008-01-17 at 19:16 -0500, Giannis Galanis wrote:
> > It must be noted that the important issue of this discussion is how to
> > have the radio blocked from BEFORE the XO boots, so as not to be
> > conflicting with the airline regulations.
>
> We should change the firmware so that it isn't active automatically as
> soon as it's loaded -- let the driver activate it when it's appropriate.
> Then the decision as to whether the radio is blocked can properly be
> handled in userspace, and the device can be left quiescent if
> appropriate.
>
> --
> dwmw2
>
>


Re: Testing the Wireless driver changes

2008-01-17 Thread Giannis Galanis
It is included in http://wiki.laptop.org/go/Wireless_Driver_README
and, if it helps, I believe it was removed around September.

It was definitely an active ioctl.

It must be noted that the important issue of this discussion is how to have
the radio blocked from BEFORE the XO boots, so as not to be conflicting with
the airline regulations.

Renaming usb8388.bin works fine, and as expected is kept even after the
reboot (just checked to be sure).
We just need to include it in sugar-control-panel or something.

When testing in the lab, it is not important which method is the more
appropriate since all cover our need of a turned off radio.

But, still this involves "only documenting" a method for silencing the
radio, so we are legally covered.
In fact, the FAA has a law only on operating in the 800 MHz band on an
airplane.
The airlines, based on that law, developed regulations that cover all
mobile phones/wireless devices for reasons of simplification.
What I want to say is that as far as we are concerned it is perfectly safe and
lawful to use an XO with mesh on, chat and any other stuff on a plane. And
if we want to play "extra safe" we simply turn the radio off after the reboot.




On Jan 17, 2008 6:26 PM, David Woodhouse <[EMAIL PROTECTED]> wrote:

>
> On Thu, 2008-01-17 at 12:32 -0500, Michail Bletsas wrote:
> > There is an "iwpriv eth0 radiooff/radioon" IOCTL hook in the firmware
> > which was meant to control the radio power directly - it was removed a few
> > months ago since it wasn't considered to do its thing in the "proper" linux
> > manner.
>
> I looked for it and I couldn't find it. Please could you point me at the
> commit in which it was removed? I'm not entirely sure it ever made it
> into our driver. Certainly it never made it into the upstream driver,
> and the upstream driver is all that really matters, in the long term.
>
> > ** I don't know how "iwconfig eth0 txpower off" is implemented, if it
> > uses the same IOCTL with "iwpriv eth0 radiooff", then it is doing the
> > right thing.
>
> It uses CMD_802_11_RADIO_CONTROL with the RADIO_OFF argument, which I
> believe is the correct thing to do.
>
> --
> dwmw2
>
>


Re: Testing the Wireless driver changes

2008-01-15 Thread Giannis Galanis
David,

There are a couple of issues i would like to address, mostly related to the
new wireless driver.

First, the netstat command:
About 50% of the time it becomes very slow(practically freezes) and spews a
"getnameinfo error".
The result from strace is:
---
.
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("172.18.0.1")}, 28) = 0
fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
gettimeofday({1200442106, 340565}, NULL) = 0
poll([{fd=4, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
send(4, "\270\227\1\0\0\1\0\0\0\0\0\0\1e\0017\1c\1e\1c\0010\1e\1f\1f\1f"..., 90, MSG_NOSIGNAL) = 90
poll( 

It seems (according to Bernie) that netstat makes queries to the DNS server
while it is temporarily down. If you execute the command a couple of
times it works again, but this is a very regular phenomenon. This should be a
network issue, and not a driver issue, but can you confirm that?
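
The bytes passed to send() in the trace above can be decoded by hand to see
what is being asked of the DNS server. A rough sketch in Python, assuming the
standard DNS wire format (a 12-byte header followed by length-prefixed labels)
and using only the bytes strace printed before it cut the buffer off with
"..."; the function name is made up for this example:

```python
def decode_dns_labels(packet):
    """Decode the question-name labels that follow the 12-byte DNS header.

    Stops at the terminating zero byte, or when the (truncated) data runs out.
    """
    labels = []
    pos = 12  # a DNS header is always 12 bytes long
    while pos < len(packet):
        length = packet[pos]
        if length == 0:  # a zero length byte ends the name
            break
        label = packet[pos + 1 : pos + 1 + length]
        labels.append(label.decode("ascii", errors="replace"))
        pos += 1 + length
    return labels


# The bytes strace printed, transcribed from octal escapes to hex escapes.
visible = (
    b"\xb8\x97\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00"
    b"\x01e\x017\x01c\x01e\x01c\x010\x01e\x01f\x01f\x01f"
)
labels = decode_dns_labels(visible)
print(".".join(labels))  # e.7.c.e.c.0.e.f.f.f
```

A run of single-character hex labels like this is how a reverse lookup of an
IPv6 address is written, so netstat is probably trying to turn an address into
a name, although the trace is truncated and this cannot be confirmed from it.
If that is the case, running netstat with -n (numeric output, no name
resolution) should avoid the wait.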

Also, the msh0 interface is now named msh0_rename. Is there a reason for
that? Will this change back to normal in the future? How will it be in
Update1? This inconsistency causes some issues in the olpc-netstatus
command utility.

Can you also please describe, from the user's perspective, what has
changed/improved in the new driver, so we know where to start testing
from?
For example,
what is the situation with mesh on or off
is the mesh-start file still in use
are improvements related to 4470

thanx,

yani



On Jan 15, 2008 6:40 PM, Kim Quirk <[EMAIL PROTECTED]> wrote:

> David,
> Yani is back from his time off and finished with his exams (at least for
> now). Before the new year break, he had been working on testing, documenting
> and debugging issues mostly associated with avahi and telepathy, but also
> with wireless. He and Ricardo have been our wireless test experts.
>
> Now that he is back, it would be great if you and Michail can provide some
> thoughts on the highest priority testing that we should do here or at
> Michail's house (for a little more controlled RF setting); so we can try to
> find bugs as quickly as possible.
>
> Also - Ricardo, you might be able to give us some indication of your
> availability for testing and how many laptops you have in Brazil, etc.
>
>
> Thanks,
> Kim
>


Re: The reason we see icons flashing here and there in the mesh view.. i.e. "xmas tree effect"

2007-12-14 Thread Giannis Galanis
The test showed that the effect is not a result of a network failure.
It occurs naturally, every time a new host arrives while at the same time
another host appears dead.
"Dead" can also mean a host that simply disconnected from the channel by user
intervention.

The best and simplest way to recreate the effect in any environment (noisy or
not) is to:
1. Connect successfully 3 XOs in the same mesh.
2. Move successfully XO1,XO2 to another channel, and verify they show as
"failed" when running "avahi-browse" in XO3
3. Reconnect at the same time XO1,XO2 to the initial channel.
4. While the XOs are trying to connect (30sec) check they still show as
"Failed" when running "avahi-browse" in XO3
5. Observe the screen in XO3: the icons of XO1,XO2 will jump almost at the
same time.

To my best understanding, the effect:
Is not related to a noisy environment
Does not require a large number of laptops
Can be recreated 100% of the times you try the above.
I believe that if the emulator you operate uses the proper timeouts, you
will see the effect

yani

On Dec 14, 2007 4:31 AM, Sjoerd Simons <[EMAIL PROTECTED]> wrote:

> On Thu, Dec 13, 2007 at 11:18:01PM -0500, Giannis Galanis wrote:
> > THE TEST:
> > 6 XOs connected to channel 11, with forwarding tables blinded only to
> > themselves, so no other element in the mesh can interfere.
> >
> > The cache list was scanned continuously on all XOs using a script
> >
> > If all XOs remained idle, they all showed reliably to each other mesh
> > view. Every 5-10 mins an XO showed as dead in some other XOs scns, but this
> > was shortly recovered, and there was no visual effect in the mesh view.
>
> Could you provide a packet trace of one of these XO's in this test? (Install
> tcpdump and run ``tcpdump -i msh0 -n -s 1500 -w ''.
>
> I'm surprised that with only 6 laptops you hit this case so often. Of course
> the RF environment in the OLPC is quite crowded, which could trigger this.
>
> Can you also run: http://people.collabora.co.uk/~sjoerd/mc-test.py
> Run it as ``python mc-test.py server'' on one machine and just ``python
> mc-test.py'' on the others. This should give you an indication of the amount
> of multicast packet loss, which can help me to recreate a comparable setting
> here by using netem.
>
>
> > If you switched an XO manually to another channel, again it showed "dead"
> > in all others. If you reconnected to channel 11, there is again no effect
> > in the mesh view.
> > If you never reconnected, in about 10-15 minutes the entry is deleted, and
> > the corresponding XO icon disappeared from the view.
> >
> > Therefore, it is common and expected for XOs to show as "dead" in the
> > Avahi cache for some time.
> >
> > THE BUG:
> > IF a new XO appears(a message is received through Avahi),
> > WHILE there are 1 or more XOs in the cache that are reported as "dead"
> > THEN Avahi "crashes" temporarily and the cache CLEARS.
> >
> > At this point ALL XOs that are listed as dead instantly disappear from the
> > mesh view.
>
> Interesting. Could you file an trac bug with this info, with me cc'd ?
>
>  Sjoerd
> --
> Everything should be made as simple as possible, but not simpler.
>-- Albert Einstein
>


The reason we see icons flashing here and there in the mesh view.. i.e. "xmas tree effect"

2007-12-13 Thread Giannis Galanis
I ran several tests related to the xmas tree effect we see in the mesh view.


The effect is that sometimes XOs disappear and reappear in the same or a
different position, or simply disappear. More usually it happens for many
XOs simultaneously.

The results I have clearly indicate that this is an issue in the Avahi
daemon, which is used by the Salut telepathy service. The sugar interface
displays the information it receives from salut very reliably. This means
that when a host disappears from avahi's host list, it vanishes instantly
from the mesh view, and the same when a new host arrives.

The Avahi daemon runs below Salut and keeps receiving information from other
hosts in the network which also run the Avahi daemon.
It keeps a local cache with the recent hosts.
At regular intervals (of 1-2 mins I think), it checks whether the hosts in
the cache are alive. If not, they are recorded as "failed".
The above check can be invoked by "avahi-browse -t -r _presence._tcp"
continuously (instead of waiting for 1-2mins).
After a certain timeout, a failed entry (dead host) will disappear from the
cache, and instantly it will disappear from the mesh view.

This timeout is pretty long (several minutes), so a host (XO) has the chance
to become alive again with no effect on the mesh view.
This can occur when:
a. the XO's avahi packets don't get through due to high mesh traffic. In this
case the other XOs might see it either as alive or as dead, according to the
conditions.
b. the XO is deliberately moved to another channel, or disconnected in some
other way. In that case, all other XOs will see it as dead.
From a client's point of view, the two cases are treated almost the same.

THE TEST:
6 XOs connected to channel 11, with forwarding tables blinded only to them
selves, so no other element in the mesh can interfere.

The cache list was scanned continuously on all XOs using a script
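
A script of this kind can be sketched roughly as follows. This is not the
script used in the test; it is a minimal Python illustration that assumes
avahi-browse's parsable output (-p), where fields are separated by ';', a
leading '+' marks a discovered service and a leading '=' marks one that was
resolved. The function names are made up for this example:

```python
import subprocess


def parse_browse_output(text):
    """Split parsable avahi-browse output into discovered and resolved names.

    Assumes ';'-separated lines where '+' marks a discovered service and
    '=' marks one that was resolved successfully.
    """
    discovered, resolved = set(), set()
    for line in text.splitlines():
        fields = line.split(";")
        if len(fields) < 6:
            continue  # not a record line
        event, name = fields[0], fields[3]
        if event == "+":
            discovered.add(name)
        elif event == "=":
            resolved.add(name)
    return discovered, resolved


def scan_once():
    """Run one scan; peers seen but not resolved are the likely 'failed' ones."""
    result = subprocess.run(
        ["avahi-browse", "-t", "-r", "-p", "_presence._tcp"],
        capture_output=True, text=True, timeout=60,
    )
    discovered, resolved = parse_browse_output(result.stdout)
    return sorted(resolved), sorted(discovered - resolved)
```

Calling scan_once() in a loop and logging the two lists with a timestamp
gives a record that can be compared against what the mesh view shows.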

If all XOs remained idle, they all showed reliably in each other's mesh view.
Every 5-10 mins an XO showed as dead in some other XO's scans, but this was
shortly recovered, and there was no visual effect in the mesh view.

If you switched an XO manually to another channel, again it showed "dead" in
all others. If you reconnected to channel 11, there was again no effect in
the mesh view.
If you never reconnected, in about 10-15 minutes the entry was deleted, and
the corresponding XO icon disappeared from the view.

Therefore, it is common and expected for XOs to show as "dead" in the Avahi
cache for some time.

THE BUG:
IF a new XO appears(a message is received through Avahi),
WHILE there are 1 or more XOs in the cache that are reported as "dead"
THEN Avahi "crashes" temporarily and the cache CLEARS.

At this point ALL XOs that are listed as dead instantly disappear from the
mesh view.
But, of course, some of the "dead" XOs are expected to re-appear shortly.
Especially those that are still in the same mesh channel, but merely failed
to transmit their avahi packets due to traffic load.

Note that if there is only 1 XO that looks dead, but returns, everything is
normal.
But, if there are 2,3.. XOs that look dead, when 1 returns, then:
a. all(the dead ones) disappear from the view
b. the 1 that returned will reappear right after in probably a different
position. i.e. it will "jump"
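
The behaviour described above can be written down as a small model. The
following Python sketch is only a toy reproduction of the observed rule (an
arrival while other dead entries exist purges all dead entries); it is not
avahi code and the class is hypothetical:

```python
class PeerCache:
    """Toy model of the presence cache as described above (not avahi code)."""

    def __init__(self):
        self.state = {}  # peer name -> "alive" or "dead"

    def mark_dead(self, name):
        if name in self.state:
            self.state[name] = "dead"

    def arrive(self, name):
        """A peer announces itself; returns the peers purged as a side effect."""
        other_dead = [p for p, s in self.state.items()
                      if s == "dead" and p != name]
        purged = []
        if other_dead:
            # THE BUG: an arrival while other dead entries exist clears
            # every dead entry, including the one that is returning.
            purged = [p for p, s in self.state.items() if s == "dead"]
            for p in purged:
                del self.state[p]
        self.state[name] = "alive"
        return sorted(purged)

    def visible(self):
        """Peers the mesh view would draw: everything still in the cache."""
        return sorted(self.state)


cache = PeerCache()
for name in ["xo1", "xo2", "xo3", "xo4", "xo5"]:
    cache.arrive(name)
for name in ["xo1", "xo2", "xo3"]:
    cache.mark_dead(name)

purged = cache.arrive("xo1")   # xo1 returns while xo2 and xo3 still look dead
print(purged)                  # ['xo1', 'xo2', 'xo3']
print(cache.visible())         # ['xo1', 'xo4', 'xo5']
```

In the model xo1 is removed and re-added, which corresponds to the icon
"jumping", while xo2 and xo3 vanish. With a single dead XO that returns,
nothing is purged, matching the normal case.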

The avahi-browse command scans the network in real time (i.e. sends requests
for all hosts in its cache list) and runs for several seconds. If the above
situation occurs, it freezes (this is what I meant by "crashes"). When it is
restarted the cache is cleared of previously dead hosts.

A typical situation in which the "xmas tree effect" occurs:
20 XOs are running salut in channel 1. This includes XOs connected to the
medialab AP, schoolserver, linklocal.
XOs leave the channel continuously.
Concurrently, some connected XOs appear dead for 1 minute or so, and
reappear after a short time.

Assume that at some point 5 XOs have either really left, or "seem dead"
anyway.

At some point 2 of these XOs are reconnected at the same time to the mesh
channel by someone in the office.
The 2 XOs will "jump" to a different position, whereas the other 3 will
simply vanish.

The way I see it, there is a very clear/narrow/specific bug in the handling
of the cache by the avahi daemon,
when new hosts + dead hosts coexist.

I hope the tests have cleared up the picture a lot

yani


Re: connection to jabber.laptop.org

2007-12-13 Thread Giannis Galanis
On Dec 13, 2007 5:08 PM, John Watlington <[EMAIL PROTECTED]> wrote:

>
> On Dec 13, 2007, at 1:05 PM, Giannis Galanis wrote:
>
> > I also installed the rpm in custom machine(not a school server) in
> > 1CC.
> >
> > I must note that the ejabberdctl-extra.diff patch in the wiki page
> > is for another version than 1.1.4.
> >
> > I used the config file ejabberd.cgf which I  got from
> > jabber.laptop.org
> > John, I didnt use the "jtest" account, but "omicron" which danny
> > created for me last week. I got the file from /home/wad
>
> The accounts are not registered in the ejabberd.cfg file.   They are
> kept in database (which can
> be dumped and reloaded using ejabberdctl).  Thus, omicron doesn't
> have an account on your new machine.
>
> > I couldnt register the admin account(is this necessary? because it
> > is not stated in the wiki)
>
> This is absolutely necessary, and was the sticking point for me last
> week on a schoolserver.
>
> > I tried:
> > ejabberdctl register localhost admin admin (according to wiki)
> > ejabberdctl ejabberd register admin localhost admin (according to
> > the previous email)
> > or
> > ejabberdctl register admin localhost admin
> >
> > Every time i received:
> > RPC failed on the node [EMAIL PROTECTED] : nodedown or similar
>
> Take a look at the command line parameters to ejabberdctl.   I
> wouldn't expect those commands to work.
>

The command line parameters of ejabberdctl are not easy to find.  Why do you
think the above commands would not work?
You said you used "ejabberdctl ejabberd register admin localhost admin" and
managed to register.

I also tried
ejabberdctl delete-older-users
ejabberdctl status
ejabberdctl --node localhost status

I received:
RPC failed on the node {1st [EMAIL PROTECTED] : nodedown




> Can anyone from collabora please specify the single correct way to
> configure this, because we will never get it right.

Yes, please.
>
> > Also i couldn't connect to http://yourserver:5280/admin/. Perhaps
> > this is expected since the admin account was not successfully created.
>
> Correct.   You were trying  http://18.85.46.175:5280/admin/, right ?
>
> > I could "telnet 18.85.46.175 5222" from an XO, or "telnet localhost
> > 5222" and successfully connected.
> > Note 18.85.46.175 is the servers IP.
> >
> > I tried to connect to the custom jabber server through an XO by
> > sugar-control-panel -s jabber 18.85.46.175
> > sugar reboot
> >
> > The gabble logs, which i attach, show an initial successful
> > connection, which failed later on.
> > Also on the server side, the following message popped up:
> > INFO REPORT:
> > [(<0.185.0>:ejabberd_listener:90):(#port<0.388>) Accepted connection
> > ({0,0,0,0,0,65535,46935,5098,56209}) ->
> > ({0,0,0,0,65535,46935,11951,5223})]
> > or similar.
> >
> > The XO was finally connected to salut. However, no other XO has
> > managed to connect to jabber.laptop.org successfully the past week.
> > Is there a reason for this?
>
> That is interesting.   The server is up and running, and thinks that
> 140 of the
> 7500+ registered users are currently using it.   You might try
> restarting it ?
>
> wad


There are 140 people connected to jabber.laptop.org?
With what command can you see this?

Noone at the office has connected recently.


Re: connection to jabber.laptop.org

2007-12-13 Thread Giannis Galanis
I also installed the rpm on a custom machine (not a school server) in 1CC.

I must note that the ejabberdctl-extra.diff patch in the wiki page is for
another version than 1.1.4.

I used the config file ejabberd.cfg which I got from jabber.laptop.org
John, I didn't use the "jtest" account, but "omicron" which danny created for
me last week. I got the file from /home/wad

I couldn't register the admin account (is this necessary? because it is not
stated in the wiki)

I tried:
ejabberdctl register localhost admin admin (according to wiki)
ejabberdctl ejabberd register admin localhost admin (according to the
previous email)
or
ejabberdctl register admin localhost admin

Every time I received:
RPC failed on the node [EMAIL PROTECTED] : nodedown or similar

Can anyone from collabora please specify the single correct way to configure
this, because we will never get it right.

Also I couldn't connect to http://yourserver:5280/admin/. Perhaps this is
expected since the admin account was not successfully created.

I could "telnet 18.85.46.175 5222" from an XO, or "telnet localhost 5222"
and successfully connected.
Note 18.85.46.175 is the servers IP.
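
The telnet check can also be scripted so it is easy to repeat from several
XOs. A minimal Python sketch, assuming a plain TCP connect is all that is
wanted; the function name is made up for this example:

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, port_open("18.85.46.175", 5222) would check the jabber client
port and port_open("18.85.46.175", 5280) the web admin port. Note that a
successful connect only shows that something is listening on the port; it
says nothing about whether ejabberdctl can reach the ejabberd node.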

I tried to connect to the custom jabber server through an XO by
sugar-control-panel -s jabber 18.85.46.175
sugar reboot

The gabble logs, which I attach, show an initial successful connection, which
failed later on.
Also on the server side, the following message popped up:
INFO REPORT:
[(<0.185.0>:ejabberd_listener:90):(#port<0.388>) Accepted
connection({0,0,0,0,0,65535,46935,5098,56209}) ->
({0,0,0,0,65535,46935,11951,5223})]
or similar.

The XO was finally connected to salut. However, no other XO has managed to
connect to jabber.laptop.org successfully in the past week. Is there a reason
for this?

Can you please try to connect to this server (18.85.46.175), and see what you
get?

The important matter is to have straightforward step-by-step instructions
to set up the jabber server.
Can anyone please provide that?

yianni






On Dec 13, 2007 8:22 AM, John Watlington <[EMAIL PROTECTED]> wrote:

>
> Robert,
>Can you provide any insight into the problem described below ?
> I'm going to try again to get this configured over the weekend.
>
> wad
>
> On Dec 10, 2007, at 1:34 AM, John Watlington wrote:
>
> > On Dec 6, 2007, at 6:39 PM, Robert McQueen wrote:
> >
> >>> Also, I couldnt connect to http://server:5280/admin/ as indicated
> >>> in the
> >>> wiki. Is this really necessary?
> >>
> >> Yes, at the moment you need to log in on the web interface to set
> >> up the
> >> shared roster. Ticket #5310 is about working out how to avoid this.
> >>
> >> Why couldn't you connect? If it was a login problem, register
> >> whichever
> >> account is listed as the admin in the config file, using "ejabberdctl
> >> register admin server password" (to register [EMAIL PROTECTED] with
> >> password
> >> "password").
> >
> > I installed your latest RPM (thanks), and followed the directions
> > in the wiki
> > (Ejabberd_Configuration) about editing the config.  I then
> > restarted ejabberd,
> > but could not get an admin user registered with the server properly.
> >
> > When I try: sudo ejabberdctl register admin schoolserver admin
> > the response is:
> > RPC failed on the node [EMAIL PROTECTED]: nodedown
> >
> > When I try: sudo ejabberdctl ejabberd register admin schoolserver
> > admin
> > the response is:
> > Can't register user "[EMAIL PROTECTED]" at node
> > [EMAIL PROTECTED]: not_allowed
> >
> > When I try: sudo ejabberdctl ejabberd register admin localhost admin
> > it works, after placing that in the config file (and restarting)
> > I'm unable to
> > connect via http://schoolserver:5280/admin/
> > (Giannis, the only person on the admin ACL for jabber.laptop.org is
> >  jtest.   Were you using that name to try to login ?)
> >
> > The problem might be the weird DNS situation of a school server ?
> > If it DHCPs, it accepts a domain name (search ...) and DNS servers
> > from
> > it's ISP.  But it also maintains a local DNS space
> > (.xs.laptop.org)
> > which will (eventually) be supported from the outside through
> > dynamic or static DNS.
> > For example, I have a server which thinks its FQDN is
> > schoolserver.pinewood.net.
> > It also resolves as schoolserver.pinewood.xs.laptop when querying
> > the local
> > named.   What should the ejabberd server name be (or does it
> > matter ?  Are server
> > names in the ejabberd configuration virtual ?)
> >
> > Can you install your latest ejabberd server on
> > schoolserver.laptop.org ?   The
> > server name from a laptop's point of view will be
> > schoolserver.cambridge.xs.laptop.org.
> >
> > Regarding server release timing (your IRC question ?):   They
> > happen whenever there
> > something new to release.   Anybody running FC7 can build a release
> > and test it...
> > We are a little short on QA people right now, but I moved the 1CC
> > schoolserver
> > over to b

Re: connection to jabber.laptop.org

2007-12-06 Thread Giannis Galanis
Oh, I thought we were using 1.1.3.

I was using the guidelines from
http://wiki.laptop.org/go/Ejabberd_Configuration

Ricardo applied the patches provided in the page and compiled them.

Is this necessary with 1.1.4?

Now, 1.1.3 runs, but XOs cannot connect. I set the jabber server with
sugar-control-panel but the XO connects to salut instead.
It can telnet to the server though.

Also, I couldn't connect to http://server:5280/admin/ as indicated in the
wiki. Is this really necessary?

thanx

yani

On Dec 6, 2007 1:25 PM, Robert McQueen < [EMAIL PROTECTED]>
wrote:

> Yani,
>
> It's no problem, just that Danny said you were logged in and I was
> worried you were changing the configuration or something!
>
> Which version are you putting on the school server? The version on
> jabber.laptop.org is in an RPM at
> ~robot101/ejabberd/F-7/i386/ejabberd-1.1.4-1.1.20071205svn1027.fc7.olpc.i386.rpm
> on that machine.
>
> Regards,
> Rob
>
> Giannis Galanis wrote:
> > Robert,
> >
> > It was me that logged in to the jabber server. Sorry i did not let you
> > know, i didn't think it was important.
> >
> > I am trying to set up a jabber server at another school server, but i
> > was having continuous errors with the config file.
> > I logged on to copy the config file used by jabber.laptop.org.
> >
> > It works now.
> >
> > thanx
> >
> > yani
>
>


Re: when an xo loses connection, how long does it take to disappear from other's neighbor view?

2007-11-17 Thread Giannis Galanis
Yes, I have seen this ticket before. Detecting whether an XO is actually
there or not is a simple task, and I am currently working on a simple script
that will list the properly connected XOs along with the temporarily
disconnected ones.

It is a very useful idea to display this information in the neighbor view,
as a dotted line or a grey color perhaps.  The problem is that bugs are
dealt with according to priority, and enhancements, although very practical,
can cause other bugs or take several builds until they work properly.

Since we are in "code freeze", a quick solution must be implemented for the
current situation, i.e. that it takes up to an hour for a disconnected XO
to disappear (just reported as #4735).

yani


On Nov 7, 2007 5:49 PM, Eben Eliason <[EMAIL PROTECTED]> wrote:

> > > 1. We need to fix the timeout for icons to disappear. Can we try
> Guillaume's
> > > patch?
> >
> > I hope so. I have a tarball with the patch, but I'm still waiting for
> > Update.1 approval (it's unclear whether I can build RPMs for Joyride
> > before I get Update.1 approval or not). If you're at 1CC, could you
> please
> > annoy the ApprovalForUpdate people in person until they either look at
> their
> > bugs, or confirm whether I'm still allowed to build RPMs in Koji?
>
> Just a mention, since this thread is getting a lot of attention. There
> is an added visual element which should be in play here, according to
> the design.  There should be an intermediate state before XOs
> disappear from the view, as outlined in:
>
> http://dev.laptop.org/ticket/3657
>
>
> > > 2. We need to be able to restart PS. As you say this is not possible,
> but if
> > > we restart sugar will PS restart as well?
> >
> > Yes, that's right (the D-Bus session bus will exit, which causes
> > D-Bus services like PS to exit too unless they've specifically asked not
> to).
> >
> > I see you assigned the bug about "need to be able to cope with PS
> restarts" to
> > yourself. Unless you're planning to implement the necessary Python code
> > in sugar.presence yourself, please don't.
> >
> > I don't think it's feasible to implement correct handling of PS restarts
> in
> > sugar.presence for Update.1, so unless the release engineering team
> > specifically tell me to, I won't be addressing that bug until a later
> > release.
> >
> > > 3. We need to force gabble to run. We have several instances of 4193
> (almost
> > > all XOs connected to schoolserver,AP are running salut). Or at least
> to
> > > force trying to connect to jabber server.
> >
> > Please see my comments on #4193 regarding steps to take to debug (I
> think
> > it's #4193 I commented on - I can't remember bug numbers, and Trac is
> > down at the moment).
> >
> > In summary:
> >
> > * try resolving the server with "getent hosts jabber.laptop.org"
> > * try pinging it with "ping jabber.laptop.org"
> > * try connecting via TCP with "telnet jabber.laptop.org 5222"
> >   (type "hello" and press Enter, if all goes well you should get
> disconnected
> >   with an error message that mentions "XML not well formed")
> >
> > If any of these steps fail, Gabble won't be able to connect either, and
> > there's nothing Gabble can do about it - talk to the Network Manager
> > maintainer instead, since that's the component responsible for getting
> > network connectivity and DNS on the XO.
> >
> > If you check the Gabble log you'll probably find that Gabble is trying
> > to connect, but failing because either it can't resolve
> > jabber.laptop.org in DNS, or it can't get a TCP connection there. That
> was my
> > diagnosis of two of the cases you mentioned in your bug with 3 sets of
> logs
> > (which may have been #4193?). In the third case it looked as though you
> hadn't
> > waited long enough for the log to indicate success or failure.
> >
> > > 4. The process of trying to connect to the jabber server, is done by
> > > telepathy-gabble, or by the presence
> >
> > Depends what you mean. The Presence Service is responsible for choosing
> when
> > to try to connect (at which time it calls the Connect() D-Bus method
> > on Gabble), but it's Gabble that actually opens a TCP socket to the
> Jabber
> > server and tries to talk to it. You can see this in the PS log, for
> > instance:
> >
> > 1194431620.966651 DEBUG s-p-s.telepathy_plugin:  > 0x85f1e14 (telepathy_plugin+TelepathyPlugin at 0x82c8fb0)>:
> connecting...
> > 1194431620.967008 DEBUG s-p-s.telepathy_plugin:  > 0x85f1e14 (telepathy_plugin+TelepathyPlugin at 0x82c8fb0)>: Connect()
> > succeeded
> >
> > (note that "Connect() succeeded" is a bit misleading - it just means
> > that the connection manager has said "OK, I'll try", rather than that it
> > has actually been able to connect.)
> >
> > In the telepathy-gabble log you'll then see something like this:
> >
> > ** (telepathy-gabble:25330): DEBUG: do_connect: calling
> lm_connection_open
> > Going to connect to olpc.collabora.co.uk
> > Trying 195.10

Re: when an xo loses connection, how long does it take to disappear from other's neighbor view?

2007-11-17 Thread Giannis Galanis
Simon,

I think the email I sent you was incomplete; my connection was poor and Gmail
must have saved the wrong draft. But 1-2-3 is what I intended to send you.

I also meant to ask: how many times do you try _init_connection before you
assume the connection is down?


> I hope so. I have a tarball with the patch, but I'm still waiting for
> Update.1 approval (it's unclear whether I can build RPMs for Joyride
> before I get Update.1 approval or not). If you're at 1CC, could you please
> annoy the ApprovalForUpdate people in person until they either look at
> their
> bugs, or confirm whether I'm still allowed to build RPMs in Koji?


I can definitely try to arrange this. But can you please send me the
tarball so I can test it in the meantime?

> 2. We need to be able to restart PS. As you say this is not possible, but
> if
> > we restart sugar will PS restart as well?
>
> Yes, that's right (the D-Bus session bus will exit, which causes
> D-Bus services like PS to exit too unless they've specifically asked not
> to).
>
> I see you assigned the bug about "need to be able to cope with PS
> restarts" to
> yourself. Unless you're planning to implement the necessary Python code
> in sugar.presence yourself, please don't.

> I don't think it's feasible to implement correct handling of PS restarts in
> sugar.presence for Update.1, so unless the release engineering team
> specifically tell me to, I won't be addressing that bug until a later
> release.


OK, I will reassign the bug to presenceservice. As long as restarting Sugar
works, we can stick to that for now.


> 3. We need to force gabble to run. We have several instances of 4193
> (almost
> > all XOs connected to schoolserver,AP are running salut). Or at least to
> > force trying to connect to jabber server.
>
> Please see my comments on #4193 regarding steps to take to debug (I think
> it's #4193 I commented on - I can't remember bug numbers, and Trac is
> down at the moment).
>
> In summary:
>
> * try resolving the server with "getent hosts jabber.laptop.org"
> * try pinging it with "ping jabber.laptop.org"
> * try connecting via TCP with "telnet jabber.laptop.org 5222"
>   (type "hello" and press Enter, if all goes well you should get
> disconnected
>   with an error message that mentions "XML not well formed")


The bug is indeed #4193.  I have replied to your post, but as Trac is
down you probably haven't seen it.
I ran all three tests:

$ getent hosts jabber.laptop.org
 2001:4830:2446:ff00:201:6cff:fe07:68ec jabber.laptop.org   <-- frequent reply
 18.85.46.41 jabber.laptop.org  <-- rare reply

$ ping jabber.laptop.org
 PING jabber.laptop.org (18.85.46.41) 56(84) bytes of data.
 64 bytes from jabber.laptop.org (18.85.46.41): icmp_seq=1 ttl=63 time=67.4 ms
 ...

$ telnet jabber.laptop.org 5222
 blabla... connected
hello
 replied with an xml packet with "xml-not-well-formed" included

So it seems that it is a PS issue. Perhaps it is not waiting long enough, or
doesn't make enough attempts when trying to connect. I have reassigned the bug
to presenceservice.
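
The first and third checks above (name resolution and a TCP connection to port 5222) can also be scripted, which helps when repeating them on many XOs. This is only a rough sketch and not the procedure used for the results above: the function names are hypothetical, it uses nothing but the Python standard library, and it says nothing about what Gabble or the PS do internally.

```python
import socket


def resolve(host, port=5222):
    """Return the sorted list of addresses (IPv4 and IPv6) the resolver
    gives for host, or an empty list if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def can_connect(host, port=5222, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `resolve("jabber.laptop.org")` lists whatever addresses the XO's resolver returns, which makes the IPv6-versus-IPv4 replies seen with getent visible in one call, and `can_connect("jabber.laptop.org")` covers the telnet step without the manual "hello".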


> If any of these steps fail, Gabble won't be able to connect either, and
> there's nothing Gabble can do about it - talk to the Network Manager
> maintainer instead, since that's the component responsible for getting
> network connectivity and DNS on the XO.
>
> If you check the Gabble log you'll probably find that Gabble is trying
> to connect, but failing because either it can't resolve
> jabber.laptop.org in DNS, or it can't get a TCP connection there. That was
> my
> diagnosis of two of the cases you mentioned in your bug with 3 sets of
> logs
> (which may have been #4193?). In the third case it looked as though you
> hadn't
> waited long enough for the log to indicate success or failure.
>
> > 4. The process of trying to connect to the jabber server, is done by
> > telepathy-gabble, or by the presence



What I meant here is: does the PS check whether the jabber server is
accessible and then run telepathy-gabble, or is that check one of the tasks of
telepathy-gabble? I see you have already answered this:

> Depends what you mean. The Presence Service is responsible for choosing when
> to try to connect (at which time it calls the Connect() D-Bus method
> on Gabble), but it's Gabble that actually opens a TCP socket to the Jabber
> server and tries to talk to it. You can see this in the PS log, for
> instance:
>
> 1194431620.966651 DEBUG s-p-s.telepathy_plugin:  0x85f1e14 (telepathy_plugin+TelepathyPlugin at 0x82c8fb0)>: connecting...
> 1194431620.967008 DEBUG s-p-s.telepathy_plugin:  0x85f1e14 (telepathy_plugin+TelepathyPlugin at 0x82c8fb0)>: Connect()
> succeeded
>
> (note that "Connect() succeeded" is a bit misleading - it just means
> that the connection manager has said "OK, I'll try", rather than that it
> has actually been able to connect.)
>
> In the telepathy-gabble log you'll then see something like this:
>
> ** (telepathy-gabble:25330): DEBUG: do_connect: calling lm_connectio

Re: log-collect / log-send

2007-11-17 Thread Giannis Galanis
Pascal,

I have been working on something similar. It is a console script that gathers
network-related logs, and it will be available in the next joyride.

At the moment it includes:
/var/log/messages
/var/log/Xorg.0.log
/home/olpc.sugar.logs/presenceservice
/home/olpc.sugar.logs/gabble
/home/olpc.sugar.logs/salut
and the following info:
build
firmware
model
time
mac
ips of all interfaces
network topology
jabber server
salut or gabble

The gzipped tar is ~20 kB, which is pretty small.
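
To make the idea concrete, collecting a list of log files into one gzipped tar can be sketched as below. This is not the script mentioned above, only an illustration under assumptions: the function name `collect_logs` is hypothetical, the caller supplies the paths, and files that are missing or unreadable are skipped rather than treated as errors.

```python
import os
import tarfile


def collect_logs(paths, archive_path):
    """Write every readable regular file in `paths` into a gzipped tar at
    `archive_path` and return the list of paths that were added."""
    added = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            # Skip logs that do not exist on this build or cannot be read.
            if os.path.isfile(path) and os.access(path, os.R_OK):
                # Store the path relative to / so the archive unpacks
                # into a subdirectory instead of over the live system.
                tar.add(path, arcname=path.lstrip("/"))
                added.append(path)
    return added
```

Returning the list of files actually added makes it easy to report which of the expected logs were absent on a given XO.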

However, other tests (for specific activities, for example) will require other
logs.

I believe that a complete log activity should offer a list of options like:
network logs
kernel logs
activities logs
all logs
...so the user can choose according to the problem.

Also, the activity should be able to enable all logs through the .xinitrc and
.sugar.debug files, or perhaps the full kernel logs.

I was planning to add the above features to my script, but a Sugar activity
is better than a console script.
Since we are working on the same thing, we can help each other and
create a single application.

yani




On 10/29/07, Pascal Scheffers <[EMAIL PROTECTED]> wrote:
>
>
> I've created a rough-cut log-collector, it's in
> d.l.o/git/project/log-activity/log-collect.py
>
> For now, it just outputs some system info, tell me what's missing or
> what would be interesting to include?
>
> I don't know yet how to list installed activities... would that be
> just `ls /usr/share/activities/`? Or is there a package list?
>
> And then the main purpose: sending logs to OLPC, either using http-post
> or email or usb-stick or... but what logs should I collect? Just
> all of them? ~/.sugar/default/logs/* and /var/log/* ? Or should it be
> more selective?
>
> And some information from the journal, perhaps?
>
> What about privacy/sensitive information? Will there be any in the
> logs or system info?
>
> - Pascal
>
>
> Current log-activity.py output:
>
> bios-version: Q2C18
> uptime: 434169.21 430235.72
> wireless_mac: 00-17-C4-05-2A-58
> uuid: 8A401F4E-E312-47F9-96C8-A488C99BDA2F
> localization: ??
> kernel_version: Linux version 2.6.22-20071018.1.olpc.d4414541d2be66a
> ([EMAIL PROTECTED]) (gcc version 4.1.1 20070105 (Red Hat
> 4.1.1-51)) #1 PREEMPT Thu Oct 18 11:44:14 EDT 2007
> diskfree: 716 MB
> laptop-info-version: 0.1
> memfree: 63496 kB
> serial-number: SHF7250025C
> disksize: 1024 MB
> keyboard: ??-??-??
> olpc_build: OLPC build joyride 58 (stream joyride; variant devel_jffs2)
> country: USA
> board-revision: B4
> motherboard-number: QTFLCA72400063
> POWER_SUPPLY_NAME=olpc-battery
> POWER_SUPPLY_TYPE=Battery
> POWER_SUPPLY_STATUS=Full
> POWER_SUPPLY_PRESENT=1
> POWER_SUPPLY_HEALTH=Good
> POWER_SUPPLY_TECHNOLOGY=LiFe
> POWER_SUPPLY_VOLTAGE_AVG=6792960
> POWER_SUPPLY_CURRENT_AVG=0
> POWER_SUPPLY_CAPACITY=97
> POWER_SUPPLY_CAPACITY_LEVEL=Full
> POWER_SUPPLY_TEMP=2508
> POWER_SUPPLY_TEMP_AMBIENT=4300
> POWER_SUPPLY_ACCUM_CURRENT=8390
> POWER_SUPPLY_MANUFACTURER=BYD
> POWER_SUPPLY_SERIAL_NUMBER=5d0d0100daff
>


Re: ip4-address buddy property - still needed?

2007-11-17 Thread Giannis Galanis
Although the feature is not usable by the activities, it has other benefits.

By observing the buddy list, you get instant information about the network
connection of the users:
when connected to channel 1, for example:
169.254.x.x addresses are link-local
172.18.x.x are connected to the schoolserver

when connected to a jabber server:
169.254.x.x are connected through an MPP
18.x.x.x are at the Media Lab
172.18.x.x are connected to the schoolserver at OLPC
etc.
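
The mapping above can be written down as a small lookup, which is roughly how a testing script could label buddies from their ip4-address property. This is a sketch under assumptions: the function name and the returned labels are hypothetical, and the /16 and /8 prefix lengths are my reading of the "x.x" patterns above, not something taken from Salut or the Presence Service.

```python
import ipaddress

# Prefix lengths are assumptions read off the address patterns in the list.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")
SCHOOLSERVER = ipaddress.ip_network("172.18.0.0/16")
MEDIA_LAB = ipaddress.ip_network("18.0.0.0/8")


def describe_buddy(ip4_address, via_jabber):
    """Guess how a buddy is connected from its IPv4 address.

    via_jabber is True when the buddy is seen through a jabber server and
    False when it is seen on the mesh channel (link-local collaboration).
    """
    addr = ipaddress.ip_address(ip4_address)
    if addr in LINK_LOCAL:
        # Seen via jabber, a link-local address can only get out via an MPP.
        return "behind an MPP" if via_jabber else "link-local mesh"
    if addr in SCHOOLSERVER:
        return "schoolserver"
    if via_jabber and addr in MEDIA_LAB:
        return "media lab"
    return "unknown"
```

The same address means different things depending on how the buddy is seen, which is why the sketch takes `via_jabber` as a separate argument instead of deciding from the address alone.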

This information is used continuously in network testing, and it is also useful
from the user's perspective:
1. when connecting to multiple jabber servers, the user should be
able to tell which XOs in the neighbor view belong to the same school
2. the user can get the geographical location of another user

In future versions of the neighbor view, or through other activities, the
user should be able to filter for specific XOs according to location or
school (in case he is connected to many servers). Two children in the same
school should be able to recognize each other even if they are connected
through a jabber server other than the one in the school.

It can also be useful for locating an XO in case of theft.

I have also added a ticket (#4405) for adding the public IP to the buddy list
properties.

It is a small piece of data (both IPs, private and public) that can be
harmlessly incorporated into the telepathy services.

Please let me know if you agree,

yani



On 10/25/07, Jim Gettys <[EMAIL PROTECTED]> wrote:
>
> It seems from your discussion that, unless someone grumbles today, this
> should be removed immediately.  And it should be removed within a week even if
> someone grumbles...
>  - Jim
>
>
> On Thu, 2007-10-25 at 10:15 +0100, Simon McVittie wrote:
> > We still have one set of OLPC-specific patches to Salut (the link-local
> > collaboration backend) that has been rejected upstream, which is the one
> > that adds support for the deprecated ip4-address buddy property. This
> was
> > used during a transitional period to enable simple TCP-based
> collaboration
> > for activities that didn't use Tubes; Sjoerd is reluctant to keep this
> > patch set, because it's meant to have gone away by now!
> >
> > Is anyone still using this property? If not, can we kill it? It was
> > added in Trial-2, and it was meant to be gone by Trial-3 but was left in
> > just in case, so it really ought to disappear. When it does, we can
> > delete some code from Salut and Presence Service.
> >
> > Places it's exposed in the APIs, which I propose to get rid of:
> >
> > PS D-Bus API: Buddy.GetProperties() returns a dict that contains
> > "ip4-address": "10.0.0.1" (or whatever), and Buddy.PropertyChanged
> > signal includes a dict that can contain the same
> >
> > sugar.presence: Buddy has a GLib property "ip4-address" (aka
> > buddy.props.ip4_address) and can emit it in its property-changed
> signal
> >
> > The Read activity appears to be the only thing in my jhbuild that uses
> > ip4-address (#4297). It should be ported to either stream tubes (when
> they're
> > ready in Salut, which should be this or next week) or D-Bus tubes (now).
> >
> > Gabble already supports stream tubes, so stream-tube support can be
> > implemented on a branch and tested against Gabble. Porting from plain
> TCP
> > to stream tubes should be very straightforward; I hope to produce a
> > proof-of-concept patch for Read later today.
> >
> > Simon
> --
> Jim Gettys
> One Laptop Per Child
>
>


Re: T-shirt ideas / feedback

2007-11-17 Thread Giannis Galanis
Seth,

Some pretty nice ideas on that page.

I put on the wiki an alternative design for the 10million shirt inspired by
yours:
10 million laptops per 10 million children


what do you think?


On 10/24/07, Seth Woodworth <[EMAIL PROTECTED]> wrote:
>
> Hello everyone, isforinsects here.
>
> There have been a couple suggestions for t-shirts floating around on the
> wiki and elsewhere.
> http://wiki.laptop.org/go/T-shirts
>
> I think that it's a great way to build community, and increase awareness
> so I mocked up a few ideas in InDesign.
> http://wiki.laptop.org/go/Image:Shirt_10_million.png
> (more to come)
>
> There are several great ideas on the wiki page, and all of them could
> become shirts via cafepress if anyone so cared.  It would also become a
> slight revenue stream for OLPC community building if sold via cafepress or
> similar web-printing outfits.
>
> Does anyone have feedback on design and/or any ideas for implementation?
> I'm not going to go start a store somewhere unless the community is into the
> idea.
>
> Seth
>


Re: Salut (link-local) protocol changing - don't expect interop between versions

2007-11-17 Thread Giannis Galanis
Since you are updating the presence service, it is a good opportunity to
fix the switch from salut to gabble.

When internet connectivity is detected, salut should stop, and gabble should
start right after.  However, this doesn't work properly even on the latest
builds, especially when the XO connects through a schoolserver.
It has even been documented (bug #4193) that an XO connected to the medialab
AP was still running Salut. Its neighbor view included several XOs and it
could share properly.

It is pretty high priority to make this work properly.

Also, when connected to a school server, it is faster to communicate with
others in the mesh through salut than through jabber. So it can be useful
for the user to be able to force salut even when jabber is available.

yani



On 10/18/07, Simon McVittie <[EMAIL PROTECTED]> wrote:
>
> Just a heads-up for anyone who isn't already aware:
>
> We're replacing the Salut (link-local collaborative backend) "rMulticast"
> protocol with a better version, over the next week or so (bug #4044). This
> is
> an incompatible change; there may in fact be more than one incompatible
> change
> involved, if we have to change the protocol further when it's had
> larger-scale testing.
>
> As a result, until further notice, Salut is not expected to be compatible
> between different versions. Please do not report bugs in link-local
> (serverless) collaboration unless all participants in the activity are
> running exactly the same snapshot of Salut (e.g. the same XO image).
>
> We'll freeze the network protocol again between now and the 1.0 freeze.
> The
> improved rMulticast protocol either fixes, or will enable us to fix,
> #3294,
> #3969, #3465, #3338 and possibly #4127; we might also take the
> opportunity to improve the mDNS part of the protocol.
>
> Checking the version on an OLPC: rpm -q telepathy-salut
>
> Checking the version in jhbuild: ls -d source/telepathy-salut-*, see which
> one has the latest date in the directory name
>
> Regards,
> Simon
> on behalf of the OLPC Telepathy team


Re: when an xo loses connection, how long does it take to disappear from other's neighbor view?

2007-11-07 Thread Giannis Galanis
Simon,

I think the email I sent you was incomplete; my connection was poor and
Gmail must have saved the wrong draft. But 1-2-3 is what I intended to
send you.

I also meant to ask: how many times do you try _init_connection before you
assume the connection is down?


> I hope so. I have a tarball with the patch, but I'm still waiting for
> Update.1 approval (it's unclear whether I can build RPMs for Joyride
> before I get Update.1 approval or not). If you're at 1CC, could you please
> annoy the ApprovalForUpdate people in person until they either look at
> their
> bugs, or confirm whether I'm still allowed to build RPMs in Koji?


I can definitely try to arrange this. But can you please send me the
tarball so I can test it in the meantime?

> 2. We need to be able to restart PS. As you say this is not possible, but
> if
> > we restart sugar will PS restart as well?
>
> Yes, that's right (the D-Bus session bus will exit, which causes
> D-Bus services like PS to exit too unless they've specifically asked not
> to).
>
> I see you assigned the bug about "need to be able to cope with PS
> restarts" to
> yourself. Unless you're planning to implement the necessary Python code
> in sugar.presence yourself, please don't.

> I don't think it's feasible to implement correct handling of PS restarts in
> sugar.presence for Update.1, so unless the release engineering team
> specifically tell me to, I won't be addressing that bug until a later
> release.


OK, I will reassign the bug to presenceservice. As long as restarting Sugar
works, we can stick to that for now.


> 3. We need to force gabble to run. We have several instances of 4193
> (almost
> > all XOs connected to schoolserver,AP are running salut). Or at least to
> > force trying to connect to jabber server.
>
> Please see my comments on #4193 regarding steps to take to debug (I think
> it's #4193 I commented on - I can't remember bug numbers, and Trac is
> down at the moment).
>
> In summary:
>
> * try resolving the server with "getent hosts jabber.laptop.org "
> * try pinging it with "ping jabber.laptop.org"
> * try connecting via TCP with "telnet jabber.laptop.org 5222"
>   (type "hello" and press Enter, if all goes well you should get
> disconnected
>   with an error message that mentions "XML not well formed")


The bug is indeed #4193.  I have replied to your post, but as Trac is
down you probably haven't seen it.
I ran all three tests:

$ getent hosts jabber.laptop.org
 2001:4830:2446:ff00:201:6cff:fe07:68ec jabber.laptop.org   <-- frequent reply
 18.85.46.41 jabber.laptop.org  <-- rare reply

$ ping jabber.laptop.org
 PING jabber.laptop.org (18.85.46.41) 56(84) bytes of data.
 64 bytes from jabber.laptop.org (18.85.46.41): icmp_seq=1 ttl=63 time=67.4 ms
 ...

$ telnet jabber.laptop.org 5222
 blabla... connected
hello
 replied with an xml packet with "xml-not-well-formed" included

So it seems that it is a PS issue. Perhaps it is not waiting long enough, or
doesn't make enough attempts when trying to connect. I have reassigned the bug
to presenceservice.


> If any of these steps fail, Gabble won't be able to connect either, and
> there's nothing Gabble can do about it - talk to the Network Manager
> maintainer instead, since that's the component responsible for getting
> network connectivity and DNS on the XO.
>
> If you check the Gabble log you'll probably find that Gabble is trying
> to connect, but failing because either it can't resolve
> jabber.laptop.org in DNS, or it can't get a TCP connection there. That was
> my
> diagnosis of two of the cases you mentioned in your bug with 3 sets of
> logs
> (which may have been #4193?). In the third case it looked as though you
> hadn't
> waited long enough for the log to indicate success or failure.
>
> > 4. The process of trying to connect to the jabber server, is done by
> > telepathy-gabble, or by the presence



What I meant here is: does the PS check whether the jabber server is
accessible and then run telepathy-gabble, or is that check one of the tasks of
telepathy-gabble? I see you have already answered this:

> Depends what you mean. The Presence Service is responsible for choosing when
> to try to connect (at which time it calls the Connect() D-Bus method
> on Gabble), but it's Gabble that actually opens a TCP socket to the Jabber
> server and tries to talk to it. You can see this in the PS log, for
> instance:
>
> 1194431620.966651 DEBUG s-p-s.telepathy_plugin:  0x85f1e14 (telepathy_plugin+TelepathyPlugin at 0x82c8fb0)>: connecting...
> 1194431620.967008 DEBUG s-p-s.telepathy_plugin:  0x85f1e14 (telepathy_plugin+TelepathyPlugin at 0x82c8fb0)>: Connect()
> succeeded
>
> (note that "Connect() succeeded" is a bit misleading - it just means
> that the connection manager has said "OK, I'll try", rather than that it
> has actually been able to connect.)
>
> In the telepathy-gabble log you'll then see something like this:
>
> ** (telepathy-gabble:25330): DEBUG: do_connect: calling lm_connect

Re: when an xo loses connection, how long does it take to disappear from other's neighbor view?

2007-11-06 Thread Giannis Galanis
Thank you all for your replies. They clear up the picture a lot.

To summarize:

1. We need to fix the timeout for icons to disappear. Can we try Guillaume's
patch? We also need to be able to tell which icons are currently not
available (but still appearing). I believe that the failed entries in
_presence._tcp form a complete list. Is this correct?

2. We need to be able to restart PS. As you say, this is not possible, but if
we restart Sugar, will PS restart as well?

3. We need to force gabble to run. We have several instances of #4193 (almost
all XOs connected to schoolserver/AP are running salut). Or at least we need
to force an attempt to connect to the jabber server.

4. Is the process of trying to connect to the jabber server done by
telepathy-gabble, or by the presence service?

On 11/6/07, Simon McVittie < [EMAIL PROTECTED]> wrote:
>
> In reply to your previous mail, "iff" means "if and only if". It's often
> used by mathematicians.
>
> On Tue, 06 Nov 2007 at 03:23:39 -0500, Giannis Galanis wrote:
> > What does proper notification mean? Which are the cases that it happens?
>
>
> If Salut is explicitly asked to disconnect, it will tell Avahi to "delete"
> all its mDNS records (this actually consists of re-sending all the
> records it was advertising, with the Time To Live set to 0 seconds).
> This is sometimes referred to as a "goodbye" packet. See
> http://files.multicastdns.org/draft-cheshire-dnsext-multicastdns.txt
> section 11.2 "Goodbye Packets".
>
> The only time we'll currently do this is when switching off Salut because
> Gabble has connected successfully.
>
> > Probably this is not if an XO moves slowly to a place with poor
> > connectivity.
>
> This is never done in response to network conditions - we can't know that
> we've lost network connectivity until it's too late.
>
> If the Time To Live on our mDNS records expires, that should have the same
> effect; however, as Sjoerd explained, we currently ignore that, because
> the 1CC mesh network is apparently unstable enough that the TTL
> sometimes expires even for laptops that are actually present.
>
> > In the case of a temporary (short) disruption of connectivity, how much
> > time does it generally take for it to return? You mentioned that in the
> > past XOs were appearing and disappearing constantly. This implies that the
> > common drop of connectivity is on the scale of a few seconds.
>
> You tell me! :-) I don't have enough XOs to replicate the conditions of
> a large mesh network like 1CC, so I can't comment on packet loss rates.
> Perhaps Dan Williams (who used to maintain Presence Service) could help
> you.
>
> > If it is lost
> > for more than a few minutes, then it is not bad for the XO to leave and
> > return.  So I believe that 1h or even 10min timeouts are too long.
>
> I believe we're currently using Avahi's default timeouts, which are
> those recommended in the mDNS draft (linked above). If I'm right about
> that, then we're using 120 second TTLs for the SRV and A records.
>
> Assuming Salut and Avahi follow the draft's recommendations, this means
> that for the records representing activities, buddies and laptops, if we
> haven't seen an announcement of a particular record, we will:
>
> - re-query after 96 - 98.4 seconds;
> - if no reply, re-query after 102 - 104.4 seconds;
> - if no reply, re-query after 114 - 116.4 seconds;
> - if no reply, assume the record has vanished after 120 seconds.
>
> (In each of the ranges given for the re-queries, the exact time is
> chosen at random, to avoid simultaneous queries from everyone in the
> network.)
>
> The timeout is reset as soon as we see any announcement of a record.
>
> The only ones whose disappearance matters are the SRV and A records - if
> a TXT record fails to disappear when it should, we don't really care.
> TXT records have a substantially longer timeout (the draft recommends 75
> minutes).
>
> > There are a couple more things I would like to address:
> >
> > 1. Is there a way to restart the presence service? In that way we can
> > resolve a weird state. Will killing and restarting the process work?
>
> Only if client code that accesses the PS is amended to cope with this
> (I just filed #4681 to represent this). Until #4681 is closed, if the PS
> was restarted, nothing would work - use Ctrl+Alt+Backspace to restart all
> of Sugar. Please see the bug for more details or to reply.
>
> > 2. At what point in the source code, the presence service
> > i.will try to connect to the jabber server?
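
The re-query schedule quoted above can be sketched in a few lines of Python. This is only an illustration, not Avahi's or Salut's code: the function name `requery_schedule` is made up, and the fractions (80%, 85%, 95% of the TTL, each with up to 2% of random delay) are inferred from the ranges quoted for a 120 second TTL.

```python
import random


def requery_schedule(ttl_seconds, rng=random):
    """Return (requery_times, expiry_time), in seconds after the last announcement.

    Assumption: re-queries happen at 80%, 85% and 95% of the TTL, each delayed
    by a random extra 0-2% of the TTL, as the ranges in the mail suggest.
    """
    requery_times = []
    for fraction in (0.80, 0.85, 0.95):
        jitter = rng.uniform(0.0, 0.02)  # random delay so peers do not query at once
        requery_times.append(ttl_seconds * (fraction + jitter))
    # If none of the re-queries is answered, the record is dropped at the full TTL.
    return requery_times, float(ttl_seconds)


if __name__ == "__main__":
    times, expiry = requery_schedule(120)
    for t in times:
        print("re-query after %.1f s" % t)
    print("assume the record has vanished after %.1f s" % expiry)
```

Seeing any announcement of the record would restart the whole schedule from zero, which is why a laptop on a lossy mesh can hover just short of expiry for a long time.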

Re: when an xo loses connection, how long does it take to disappear from other's neighbor view?

2007-11-06 Thread Giannis Galanis
Sjoerd, Guillaume, Simon,

What does proper notification mean? Which are the cases that it happens?

Probably not when an XO moves slowly to a place with poor
connectivity.

In the case of a temporary (short) disruption of connectivity, how much
time does it generally take for it to return? You mentioned that in the past
XOs were appearing and disappearing constantly. This implies that the
common drop of connectivity is on the scale of a few seconds. If it is lost
for more than a few minutes, then it is not bad for the XO to leave and
return.  So I believe that 1h or even 10min are too long timeouts.

There are a couple more things I would like to address:

1. Is there a way to restart the presence service? In that way we can
resolve a weird state. Will killing and restarting the process work?

2. At what point in the source code does the presence service
i. try to connect to the jabber server?
ii. run gabble?

3. I noticed the dbus diagram is updated. Indeed we have a better picture of
what's happening. But we still need some more information, like:
i. a state diagram of the presence service
ii. what type of communication is taking place between NM and PS
iii. when the connection is switched from linklocal to schoolserver (for
example), what steps are taking place in the presence service
iv. whether internet connectivity is detected by NM and sent to PS, or
detected by PS

yani




On 10/30/07, Sjoerd Simons <[EMAIL PROTECTED] > wrote:
>
> On Fri, Oct 26, 2007 at 02:48:55PM -0400, Giannis Galanis wrote:
> >  Sjoerd,
> >
> > I would like to ask you,
> >
> > you replied at one of the bugs:
>
> Moving from a bugreport to a private mail might not be a great idea..
> Could you in the future just put your questions in the bugreport so we can
> have the discussion in a more public fashion :)
>
> > >Salut used to drop the presence of people for which it couldn't resolve
> > >the extra information, but this seemed to give a lot of problems in the
> > >mesh (people appearing and disappearing all the time). So as a workaround
> > >we switched to only dropping presence iff all info about a node has gone.
> > >Which has the downside that nodes that are really gone can still appear
> > >on the mesh view for some time (specifically when they didn't send a
> > >proper mdns bye packet or when that was dropped).
> >
> > >iff all info about a node has gone
> > what does this mean?
>
> It means that it is hard to decide when a node has really gone or if the
> network link to a certain node is just (temporarily) bad.
>
> In the OLPC office, the second case apparently happens a lot.
>
> > how often do you refresh?
>
> The refresh is done by avahi. Avahi tries every few minutes. Guillaume
> worked on a patch to make the effect of being unsure about a user less bad
> (as in, assume that if you're unsure about them for a certain period of
> time they're actually really gone). It still needs to be finished though.
>
> Which means, from an end user's point of view, that if a user went away
> without doing proper notification, then they will only stay on the meshview
> for a limited amount of time (say a maximum of 10 minutes instead of the
> current situation of more than an hour)



  Sjoerd
> --
> Kindness is the beginning of cruelty.
> -- Muad'dib [Frank Herbert, "Dune"]
>
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: log-collect / log-send

2007-10-30 Thread Giannis Galanis
Eduardo,

There is a wiki page with some similar info:
http://wiki.laptop.org/go/Developer_Environment

I just realized that this page was created and edited by you.

So you have written scripts for this purpose as well?

I have attached my two scripts. They are written in bash, but they are not
commented.

"netstatus" gathers network info like:
mac
ip eth0,msh0,eth1,etc
dns
jabber server
MPP,AP,schoolserver,linklocal
gabble/salut

"netlog" gathers the following:
output from "netstatus"
info file with build,firmware,model
messages
Xorg.0.log (thanx Jim for the comment in the trac)
presenceservice.log
gabble.log
salut.log

yani



On 10/30/07, Eduardo Silva <[EMAIL PROTECTED]> wrote:
>
> Hi Guys,
>
> > I have been working on something similar. It is a console script that
> > gathers network-related logs, and will be available in the next joyride.
>
> It would be better to focus on developing just one main class to collect
> this information, with different front-ends such as a console script and
> the UI under the log activity. That way we can avoid duplicating code.
>
> Giannis, where is your source code? It would be cool if you and Pascal
> could merge your work into a final python class.
>
> > I was planning to add the above features in my script, but a sugar
> > activity is better than a console script.
> > Since we are working on the same thing we can use each other's help, and
> > create a single application.
>
> both can be useful, but using just ONE collector ;)
>
> cheers.
>
> Eduardo.
>


netlog
Description: Binary data


netstatus
Description: Binary data


Re: log-collect / log-send

2007-10-30 Thread Giannis Galanis
Pascal,

I have been working on something similar. It is a console script that gathers
network-related logs, and will be available in the next joyride.

At the moment it includes:
/var/log/messages
/var/log/Xorg.0.log
/home/olpc.sugar.logs/presenceservice
/home/olpc.sugar.logs/gabble
/home/olpc.sugar.logs/salut
and the following info:
build
firmware
model
time
mac
ips of all interfaces
network topology
jabber server
salut or gabble

The gzipped tar is ~20kb which is pretty low.
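
As a rough illustration of the collection step (the attached scripts are bash and are not reproduced here), packing a list of log files into a gzipped tar could look like the Python sketch below. The function name `collect_logs` and the example paths are assumptions for illustration; files that do not exist on a given build are simply skipped.

```python
import os
import tarfile


def collect_logs(paths, archive_path):
    """Pack every existing file from `paths` into a gzipped tar.

    Returns the list of paths that were actually added, so the caller can
    report which logs were missing on this machine.
    """
    added = []
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in paths:
            if os.path.isfile(path):
                # Store only the file name, so the archive unpacks flat.
                archive.add(path, arcname=os.path.basename(path))
                added.append(path)
    return added


if __name__ == "__main__":
    # Hypothetical selection; adjust to the logs of interest.
    wanted = ["/var/log/messages", "/var/log/Xorg.0.log"]
    print(collect_logs(wanted, "netlog.tar.gz"))
```

Keeping the collector as a plain function like this would let both a console script and a log activity call the same code.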

However, other tests (for specific activities, for example) will require
other logs.

I believe that a complete log activity should have a list of options like:
network logs
kernel logs
activities logs
all logs
...so the user can choose according to the problem

also, the activity should be able to enable All Logs, from the .xinitrc,
.sugar.debug files,
or perhaps the full kernel logs.

I was planning to add the above features in my script, but a sugar activity
is better than a console script.
Since we are working on the same thing we can use each other's help, and
create a single application.

yani


On 10/29/07, Pascal Scheffers <[EMAIL PROTECTED]> wrote:
>
>
> I've created a rough-cut log-collector, it's in
> d.l.o/git/project/log-activity/log-collect.py
>
> For now, it just outputs some system info, tell me what's missing or
> what would be interesting to include?
>
> I don't know yet how to list installed activities... would that be
> just `ls /usr/share/activities/`? Or is there a package list?
>
> And then the main purpose: sending logs to OLPC, either using http-post
> or email or usb-stick or... but what logs should I collect? Just
> all of them? ~/.sugar/default/logs/* and /var/log/* ? Or should it be
> more selective?
>
> And some information from the journal, perhaps?
>
> What about privacy/sensitive information? Will there be any in the
> logs or system info?
>
> - Pascal
>
>
> Current log-activity.py output:
>
> bios-version: Q2C18
> uptime: 434169.21 430235.72
> wireless_mac: 00-17-C4-05-2A-58
> uuid: 8A401F4E-E312-47F9-96C8-A488C99BDA2F
> localization: ??
> kernel_version: Linux version 2.6.22-20071018.1.olpc.d4414541d2be66a
> ([EMAIL PROTECTED]) (gcc version 4.1.1 20070105 (Red Hat
> 4.1.1-51)) #1 PREEMPT Thu Oct 18 11:44:14 EDT 2007
> diskfree: 716 MB
> laptop-info-version: 0.1
> memfree: 63496 kB
> serial-number: SHF7250025C
> disksize: 1024 MB
> keyboard: ??-??-??
> olpc_build: OLPC build joyride 58 (stream joyride; variant devel_jffs2)
> country: USA
> board-revision: B4
> motherboard-number: QTFLCA72400063
> POWER_SUPPLY_NAME=olpc-battery
> POWER_SUPPLY_TYPE=Battery
> POWER_SUPPLY_STATUS=Full
> POWER_SUPPLY_PRESENT=1
> POWER_SUPPLY_HEALTH=Good
> POWER_SUPPLY_TECHNOLOGY=LiFe
> POWER_SUPPLY_VOLTAGE_AVG=6792960
> POWER_SUPPLY_CURRENT_AVG=0
> POWER_SUPPLY_CAPACITY=97
> POWER_SUPPLY_CAPACITY_LEVEL=Full
> POWER_SUPPLY_TEMP=2508
> POWER_SUPPLY_TEMP_AMBIENT=4300
> POWER_SUPPLY_ACCUM_CURRENT=8390
> POWER_SUPPLY_MANUFACTURER=BYD
> POWER_SUPPLY_SERIAL_NUMBER=5d0d0100daff
>


Re: ip4-address buddy property - still needed?

2007-10-26 Thread Giannis Galanis
Several parts of your replies refer to issues we have discussed before.
The tickets 4463,4405,4404,4403 include the new requirements and
enhancements for the presence service, for the benefit of user+developer

To summarize

1. User+devel should be able to switch between gabble and salut manually
using the options: auto,salut,gabble (4403)

2. User+devel should be able to connect/disconnect to one (or many, in the
future) jabber server (4463)

3. User+devel should have access to public+private IP (4405)

There are several reasons for each one of these. Now, if you observe them
combined, the importance of IP information is:

user's perspective: assume for example 60 XOs are connected to a public
jabber server (e.g. jabber.laptop.org). Five of these belong to the same
school. They should be able to filter themselves.

devel's perspective:
1. From 1 XO we can test instantly which XOs are connected in a specific
configuration
2. IPs give irreplaceable information regarding whether the XO is connected
to MPP, AP, schoolserver, NAT etc
3. (very important) when an XO is connected to an MPP, we need
to know the name of it. The buddy list links IP with name
and many more

Regarding the privacy issue of giving away the IP: given that all p2p and
IM applications offer this capability, it shouldn't be an issue.

yani

On 10/26/07, Sjoerd Simons <[EMAIL PROTECTED]> wrote:
>
> On Fri, Oct 26, 2007 at 12:20:01AM -0400, Giannis Galanis wrote:
> > The feature, although not usable by the activities, has other
> > benefits.
> >
> > By observing the buddy list, you acquire instant information about the
> > network connection of the users:
> > when connected to channel 1 for example:
> > 169.254.x.x addresses are in link-local
> > 172.18.x.x are connected to schoolserver
>
> > when connected to a jabber server:
> > 169.254.x.x are connected through an MPP
> > 18.x.x.x are media lab
> > 172.18.x.x are connected to schoolserver in olpc
> > etc
>
> > It is information continuously used in network testing,
> For the link-local case you can just ask avahi for this information
> directly.
>
> For the jabber/server case, I'm unsure why you're interested in how other
> nodes are connected to the jabber server in the first place.
>
> > also useful from the user's perspective:
>
> > 1. in the case of connecting to multiple jabber servers, the user should
> > be able to tell which XO in the neighbor view belongs to the same school
>
> Maybe this has changed. But afaik there will be one jabber server per
> school (on the school's server) and you can thus look at the user's jid.
>
> > 2. get the geographical location of another user
> A much better way of doing this would be to integrate some geoclue[0]
> information into telepathy, instead of having each XO try to work out
> where others are from the small amount of information an IP reveals.
>
> > In future versions of the neighbor view, or through other activities,
> > the user should be able to filter for specific XOs according to location,
> > or school (in the case he's connected to many servers). Two children in
> > the same school should be able to recognize each other even if they are
> > connected through a jabber server other than the one in the school.
>
> An xo should always connect to the same jabber server afaik..
>
> > It can also be useful for locating an XO in case of theft.
>
> In the case of theft the jabber server the XO is connecting to always has
> the information of where a connection came from (or at least of the last
> nat hop and you can work from there). I don't see the point of pushing
> that info to all xo's.
>
> > I have also added a ticket(4405) for adding the public IP in the buddy
> > list properties.
> >
> > It is a small part of data(both IPs, private and public), which can be
> > harmlessly incorporated in the telepathy services.
>
> I definitely agree that having some information of where in the world your
> buddies are is something very nice. I disagree that exposing ip addresses
> is the way to do it though.
>
>   Sjoerd
> 0: http://www.freedesktop.org/wiki/Software/GeoClue
> --
> Mediocrity finds safety in standardization.
> -- Frederick Crane
>


Re: ip4-address buddy property - still needed?

2007-10-25 Thread Giannis Galanis
The feature, although not usable by the activities, has other benefits.

By observing the buddy list, you acquire instant information about the
network connection of the users:
when connected to channel 1 for example:
169.254.x.x addresses are in link-local
172.18.x.x are connected to schoolserver

when connected to a jabber server:
169.254.x.x are connected through an MPP
18.x.x.x are media lab
172.18.x.x are connected to schoolserver in olpc
etc
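
The reading of addresses described above could be automated roughly as follows. This is a sketch only: `classify_buddy_address` and the labels are hypothetical, the prefix lengths (/16 and /8) are my assumption from the x.x patterns, and as the lists show, a 169.254.x.x address means link-local or MPP depending on whether salut or gabble is in use.

```python
import ipaddress

# Prefixes taken from the examples in this mail; the prefix lengths and the
# labels are assumptions for illustration, not an official OLPC address plan.
KNOWN_NETWORKS = [
    (ipaddress.ip_network("169.254.0.0/16"), "link-local or behind an MPP"),
    (ipaddress.ip_network("172.18.0.0/16"), "schoolserver"),
    (ipaddress.ip_network("18.0.0.0/8"), "media lab"),
]


def classify_buddy_address(address):
    """Return a rough label for how a buddy with this IPv4 address is connected."""
    ip = ipaddress.ip_address(address)
    for network, label in KNOWN_NETWORKS:
        if ip in network:
            return label
    return "unknown"


if __name__ == "__main__":
    for addr in ("169.254.3.7", "172.18.0.5", "18.85.1.2", "10.0.0.1"):
        print(addr, "->", classify_buddy_address(addr))
```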

It is information continuously used in network testing, and also useful from
the user's perspective:
1. in the case of connecting to multiple jabber servers, the user should be
able to tell which XO in the neighbor view belongs to the same school
2. get the geographical location of another user

In future versions of the neighbor view, or through other activities, the
user should be able to filter for specific XOs according to location, or
school (in the case he's connected to many servers). Two children in the same
school should be able to recognize each other even if they are connected
through a jabber server other than the one in the school.

It can also be useful for locating an XO in case of theft.

I have also added a ticket(4405) for adding the public IP in the buddy list
properties.

It is a small part of data(both IPs, private and public), which can be
harmlessly incorporated in the telepathy services.

Please let me know if you agree,

yani

On 10/25/07, Jim Gettys <[EMAIL PROTECTED]> wrote:
>
> It seems, from your discussion, like unless someone grumbles today, this
> should be removed immediately.  And it should be removed within a week,
> even if someone grumbles...
>  - Jim
>
>
> On Thu, 2007-10-25 at 10:15 +0100, Simon McVittie wrote:
> > We still have one set of OLPC-specific patches to Salut (the link-local
> > collaboration backend) that has been rejected upstream, which is the one
> > that adds support for the deprecated ip4-address buddy property. This
> > was used during a transitional period to enable simple TCP-based
> > collaboration for activities that didn't use Tubes; Sjoerd is reluctant
> > to keep this patch set, because it's meant to have gone away by now!
> >
> > Is anyone still using this property? If not, can we kill it? It was
> > added in Trial-2, and it was meant to be gone by Trial-3 but was left in
> > just in case, so it really ought to disappear. When it does, we can
> > delete some code from Salut and Presence Service.
> >
> > Places it's exposed in the APIs, which I propose to get rid of:
> >
> > PS D-Bus API: Buddy.GetProperties() returns a dict that contains
> > "ip4-address": "10.0.0.1" (or whatever), and Buddy.PropertyChanged
> > signal includes a dict that can contain the same
> >
> > sugar.presence: Buddy has a GLib property "ip4-address" (aka
> > buddy.props.ip4_address) and can emit it in its property-changed
> > signal
> >
> > The Read activity appears to be the only thing in my jhbuild that uses
> > ip4-address (#4297). It should be ported to either stream tubes (when
> > they're ready in Salut, which should be this or next week) or D-Bus
> > tubes (now).
> >
> > Gabble already supports stream tubes, so stream-tube support can be
> > implemented on a branch and tested against Gabble. Porting from plain
> > TCP to stream tubes should be very straightforward; I hope to produce a
> > proof-of-concept patch for Read later today.
> >
> > Simon
> --
> Jim Gettys
> One Laptop Per Child
>
>


Presence service bugs/enhancements

2007-10-23 Thread Giannis Galanis
Simon,

The following are the current bugs/enhancements regarding the presence
service. They are listed from high to low priority with their corresponding
trac number.

1. The presence service should detect more efficiently the internet
connectivity and switch to gabble when appropriate(4193)

2. In link-local, XOs are seen in the neighbor view but cannot be shared
with. Sometimes they are not connected to the mesh anymore, but still present.
In some such cases avahi-browse cannot resolve the services of the
corresponding XO. This is high priority, but I don't have a log file for a
blocking case, although I have experienced it in build 617 (4402)

3. Ability to switch from gabble to salut manually using the options:
auto,salut,gabble(4403)

4. Ability to keep an activity alive when passing from salut to gabble and
vice versa. This can occur automatically when internet connectivity is
dynamically lost or recovered(4404)

5. In gabble, the public IP must be available in the buddy list, or at least
be accessible through the jabber server upon request(4405)

6. The jabber servers should be switchable (to change from one to the other)
in a neater way than editing the config file and rebooting. This can
probably be invoked by sending something like
..xmlns:stream="http://etherx.jabber.org/streams" to="jabber.laptop.org"
as I noticed in the log files.  If it is simple to apply, can you describe
how it can be done properly? (not on trac)

Thanx

yani


Network connectivity test update

2007-10-18 Thread Giannis Galanis
Kim, Ricardo,

I have updated the Network connectivity test page(
http://wiki.laptop.org/go/Test_Network_Configuration).

I have added some additional information concerning the IP addresses and the
resolv.conf file, in order to make things more clear.

I updated the connectivity_status script. It can now detect if the XO acts
as an MPP, connects through an Ethernet adapter etc.  It can be useful for
machines running old builds.

I will be waiting for possible suggestions.

yani


Re: Bugs and more

2007-10-12 Thread Giannis Galanis
ice concerns me, specifically the part about it
> not being accurate. It would be good to spend some more time on this to
> document a broken case and write up the bug; add appropriate logs, etc. It
> is not going to be good if there is a local jabber server (on the school
> server, for instance) and the laptops can't decide if they should be talking
> local-link or through the jabber server. I'm sure that will mess up tubes
> sharing.
>

I agree, I just had another issue in Alex's XOs where two XOs were
connected to MediaLab802.11 but were still running salut. It was displaying
18.85 and 169.254 XOs, which all seemed to be from this room. This bug
is very interesting! I had seen it in the past, but I couldn't describe it.
Even in AP mode, avahi, which runs in the background, still creates
the presence list from the mesh that it is connected to (ch 6 in this case).
It can be accessed with "avahi-browse -t _presence._tcp". But now it included
18.85 and 169 XOs (bug 4193).
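
To compare such a listing across machines, the output can be reduced to name/address pairs. The Python sketch below is an illustration only: `parse_resolved` is a hypothetical helper, it does not run avahi-browse itself, and it assumes the semicolon-separated layout that avahi-browse prints when the -p (parsable) and -r (resolve) options are added to the command above, where resolved entries start with "=" and carry the service name in the fourth field and the address in the eighth.

```python
def parse_resolved(output):
    """Extract (service_name, address) pairs from parsable avahi-browse output.

    Assumed line layout for resolved entries:
    =;interface;protocol;name;type;domain;hostname;address;port;txt
    Lines that do not start with "=" (new or removed services) are ignored.
    """
    buddies = []
    for line in output.splitlines():
        fields = line.split(";")
        if len(fields) >= 9 and fields[0] == "=":
            buddies.append((fields[3], fields[7]))
    return buddies


if __name__ == "__main__":
    import sys

    # Example: avahi-browse -t -p -r _presence._tcp | python this_script.py
    for name, address in parse_resolved(sys.stdin.read()):
        print(address, name)
```

Mixed 18.85 and 169.254 addresses in one listing would then be visible at a glance.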


=> Once you've explored a few of these things from this email, I would like
> you to send it out to the devel mailing list for review. You might get a few
> good suggestions on other things for the status program and then figure out
> how to check it in so we can all use it for debug.
>

yani




On 10/12/07, Kim Quirk <[EMAIL PROTECTED]> wrote:
>
> Yani,
> Lots of good work on this document! I've added my comments inline below.
>
> Copying Alex for comments as well.
>
> Kim
>
>
> On 10/12/07, Giannis Galanis <[EMAIL PROTECTED]> wrote:
> >
> > Kim,
> >
> > A couple of things in case I forget later today,
> >
> > First, concerning the storage of the WEP keys, deleting the
> > nm/networks.cfg does not work. It is recreated after some time. You must
> > delete and reboot, before trying to reconnect to the AP. I think it is an
> > important bug that the APs don't refresh in the neighbor screen when they
> > are configured. I had to reboot more than 10 times to finish my tests for
> > different types of WEP keys.(bug 4190)
>
>
> => Could you tell if you needed to reboot the AP and wait for it to settle
> and reboot the XO -- each just once, but you would have to do the order
> properly. It is my expectation that after setting up the AP, it would need
> to be rebooted. And it makes sense to me that the XO would need its file
> removed and then to be rebooted in order to 'see' it as a new AP. This
> should be in the release note. Here is what I *think* the release note
> should say (please make appropriate changes, etc and add it to the
> Kqrelease):
>
> If there is a change to the configuration of your infrastructure AP, then
> after making changes and rebooting the AP; please delete the network config
> file and reboot the XO. (you might want to put this in steps and say how to
> find the config file, etc)
>
>
> > Concerning the authentication via password and not WEP key, it is as I
> > described to you in my last email. In fact each manufacturer has its own
> > hashing algorithm, so it is virtually impossible to try all combinations. We
> > must come up with a convention( e.g. only the airport algorithm or smth)
>
>
> => We are in the business of the laptop and not the AP. So we are not
> going to be able to test with all the Access Points out there and all their
> configurations. What we can do is to document the ones we have tested (and
> the work arounds, as you found with the Airport Extreme); and make sure we
> invite others to add their support notes about any problems or advice for
> working with other APs.
>
> The 3 items(2 circles+battery) in the donut appeared again(4191), and I
> > reported it for the second time, this time as "Wireless" not as "Network
> > Manager" in case it goes through more efficiently. No one replied to the
> > first bug, although it must be very important.
>
>
> => I believe Dan's comments on this is that it is probably just a UI bug
> and not affecting the functioning. If you agree with that, then you should
> put a note in the bug about your thoughts and re-assign it to 'Sugar' so the
> right person will look at it. I would let it remain as 'high' priority until
> we know more. Also, can you put a note in the release notes on this one.
>
> Finally, have a look at this page:
> > http://wiki.laptop.org/go/Test_Network_Configuration
> >
> > It includes a detailed guide of how to examine your network, including
> > MPP, 169.x addresses, Gabble/Salut etc. I  have also included a script
> > which  collects all the useful information(resolv.conf, ip, jabber
> > server.. etc)  and disp