Re: [libreoffice-projects] German community meeting in Essen

2015-03-09 Thread Italo Vignoli
On 09/03/15 12:34, Florian Effenberger wrote:

 Note that most of the slots will be in German language, but of course
 English speakers are welcome as well - exchanging thoughts and ideas
 across countries is usually quite constructive!

Hi Florian, would a talk about the story of the Italian community over
the last four years be of any interest? Of course, it would focus more
on the growth factor than on other topics such as the fight with AOO
(which is still a concern).

-- 
Italo Vignoli - The Document Foundation
mob IT +39.348.5653829 - mob EU +39.392.7481795
email it...@libreoffice.org - skype italovignoli
email / hangout italo.vign...@gmail.com
GPG Key ID - 0xAAB8D5C0
DB75 1534 3FD0 EA5F 56B5 FDA6 DE82 934C AAB8 D5C0

-- 
To unsubscribe e-mail to: projects+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/projects/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-projects] German community meeting in Essen

2015-03-09 Thread Italo Vignoli
On 09/03/15 13:35, Florian Effenberger wrote:

 Hearing that could of course be interesting! :-) I would just not travel
 from Italy to Essen because of a one hour talk or so. Maybe something
 that can be done via Hangouts?

Of course.

 If you planned to join anyway, sure. :-) We just won't plan in more than
 one hour or so, I guess, as I expect the agenda will be quite filled.

My wife is away during that weekend, so joining would not be a problem,
apart from travel expenses.




[libreoffice-projects] German community meeting in Essen

2015-03-09 Thread Florian Effenberger

Hello,

I'd like to inform you that the German community has planned their 
annual meeting at Linuxhotel in Essen, from


Friday, June 19 to Sunday, June 21

As in previous years, we will meet with the community, discuss projects 
carried out in the past, and plan for the future. An agenda is yet to 
be written.


Everyone is invited to join, of course, to get in touch with community 
members, and contribute to the success of LibreOffice. ;-)


There's also a limited number of rooms available, but we have to book 
right now. So if you already know that you want to join, please let us 
know soon. Otherwise, there are a couple of other hotels available in 
Essen, and you can still join during the day.


Note that most of the slots will be in German language, but of course 
English speakers are welcome as well - exchanging thoughts and ideas 
across countries is usually quite constructive!


Looking forward to meeting you!
Florian



[libreoffice-projects] infra analysis and status quo

2015-03-09 Thread Florian Effenberger

Hello,

the infra team has met to review the past issues, and would like to make 
its analysis public to the community:



I. introduction

We have three big servers running, called dauntless, excelsior and 
falco. The first two were put online in October, the last one in 
December. The planned platform was oVirt with Gluster, with CentOS as 
the base system. All servers are comparable in their hardware setup, 
with 256 GB of RAM, one internal and one external Gbit network card, a 
server-grade mainboard with IPMI, several HDDs, and a 64-core CPU. All 
three hosts were connected internally via a Gbit link, with oVirt 
managing all of them, and Gluster being the underlying network file system.



II. before going into production

Extensive tests were carried out before going live with the platform. 
The exact CentOS, oVirt and Gluster versions used for production were 
tested twice, once on separate hardware and once on the actual hardware, 
with the IPMI and BIOS versions used later. That included two disaster 
simulations, in which one host was disconnected unannounced and oVirt 
behaved exactly as expected - running the VMs on the other host without 
interruption.


When the platform was ready, only the non-critical VMs were migrated at 
first, so as not to endanger anything, i.e. mainly testing VMs for which 
downtime is not critical, but that still produce quite some load. The 
platform ran exclusively with these for several weeks, with no problems 
detected.


After that, several other VMs were migrated, including Gerrit, and the 
system worked fine for weeks with no I/O issues and no downtime.



III. downtime before FOSDEM

The first issues happened from Wednesday, January 28, around 14:40 UTC, 
until Thursday, January 29, early morning. Falco was disconnected for up 
to a couple of minutes. The reason for this is still unclear.


- The infra team is looking for an oVirt expert who can help us 
parse the logs to better understand what has happened.


At the same time, and for an unrelated reason, file system corruption 
was discovered on excelsior, a failure that had already existed since 
January 5.


- The monitoring scripts, based on snmpd, claimed everything was 
ok. Scripts have already been enhanced to properly report the status.


Each of the errors on its own, if detected in time, would have caused 
no downtime at all.
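
To illustrate why a single failure would have been harmless while two at once were not, here is a minimal sketch. It assumes a simple majority rule over our three bricks; the function name and the rule itself are illustrative assumptions, not Gluster's actual implementation.

```python
def has_quorum(bricks_up: int, bricks_total: int = 3) -> bool:
    """Return True if a strict majority of bricks is reachable.

    Assumption: a plain majority rule. Real Gluster quorum handling
    is configurable and more involved than this.
    """
    if not 0 <= bricks_up <= bricks_total:
        raise ValueError("bricks_up must be between 0 and bricks_total")
    return 2 * bricks_up > bricks_total


if __name__ == "__main__":
    # Walk through all possible states of a three-brick setup.
    for up in range(3, -1, -1):
        state = "keeps running" if has_quorum(up) else "stops"
        print(f"{up} of 3 bricks up: platform {state}")
```

Under this assumption, losing one brick leaves a majority, whereas losing two does not.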


With that, 2 out of 3 Gluster bricks were down, and the platform was 
stopped. (Gluster is comparable to RAID 5 here.) Gluster detected a 
possible split-brain situation, so a manual recovery was required. 
Starting the fix was not complicated compared to other network file 
systems, and could easily be handled, but the recovery took a long time 
due to the amount of data already on the platform and the internal Gbit 
connectivity. Depending on the VM, the downtime was between 3 and 18 
hours. oVirt's database also had issues, which could however be fixed. 
In other words, most of the downtime was spent not on finding a fix, 
but on waiting for it to complete.
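
As a rough back-of-the-envelope sketch of why link speed dominates the recovery time: the data volume (2000 GB) and the 80% link efficiency below are hypothetical values chosen for illustration, not measurements from our platform, and the function name is made up for this example.

```python
def transfer_hours(data_gb: float, link_gbit: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to copy data_gb gigabytes over a link.

    link_gbit is the nominal link speed in Gbit/s; efficiency is the
    assumed usable fraction of it (protocol overhead, disk limits, ...).
    """
    if data_gb < 0 or link_gbit <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid data size, link speed or efficiency")
    gigabits = data_gb * 8  # 1 byte = 8 bits
    seconds = gigabits / (link_gbit * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    data_gb = 2000  # hypothetical amount of data to resynchronise
    # A bonded link is modelled optimistically as the sum of its members;
    # a single stream often cannot use the full aggregate.
    for label, speed in [("1 Gbit", 1), ("4 x 1 Gbit bonded", 4), ("10 Gbit", 10)]:
        print(f"{label}: about {transfer_hours(data_gb, speed):.1f} h")
```

Whatever the real figures are, the estimate scales inversely with the link speed, so the waiting time shrinks accordingly on a faster internal link.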


- Situations like these can be less time-consuming with an 
(expensive) internal 10 Gbit connection, or with a (slower, but more 
redundant and cheaper) internal trunked/bonded x * 1 Gbit connection, 
which we will be looking into.


- An SMS monitoring system is currently work in progress, and we are 
looking for volunteers to be included in the notifications. SMS 
notifications are to be sent out in case of severe issues and can be 
combined with Android tools like Alarmbox.


- In the meantime, we have also fine-tuned the alerts and 
thresholds to distinguish messages.


All infra members, including several volunteers, were working jointly on 
getting the services back together. However, we experienced some issues 
with individual VMs, where it was unclear who was responsible for them, 
and where documentation was partially missing or outdated. It worked out 
in the end, however.


- Infra will enforce a policy for new and existing services. At 
least 2 responsible maintainers are required per service, including 
proper documentation. That will be announced with a fair deadline. 
Services not fulfilling those requirements will be moved from production 
to test. A concrete policy is still to be drafted with the public.
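
Since the concrete policy is still to be drafted, the following is only a sketch of how such a rule could be checked automatically. The Service class, its fields and the function name are hypothetical and not part of any existing infra tooling.

```python
from dataclasses import dataclass, field

MIN_MAINTAINERS = 2  # "at least 2 responsible maintainers" from the proposal


@dataclass
class Service:
    """Hypothetical record describing one hosted service."""
    name: str
    maintainers: list[str] = field(default_factory=list)
    has_documentation: bool = False


def target_environment(service: Service) -> str:
    """Return "production" if the service meets the proposed policy, else "test"."""
    # Count distinct maintainers so that a duplicated name does not count twice.
    enough_maintainers = len(set(service.maintainers)) >= MIN_MAINTAINERS
    if enough_maintainers and service.has_documentation:
        return "production"
    return "test"


if __name__ == "__main__":
    # Example entries are invented for illustration only.
    examples = [
        Service("example-a", ["alice", "bob"], has_documentation=True),
        Service("example-b", ["alice"], has_documentation=True),
        Service("example-c", ["alice", "bob"], has_documentation=False),
    ]
    for svc in examples:
        print(f"{svc.name}: {target_environment(svc)}")
```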


On a side note, although the setup had worked fine for months and 
survived two disaster simulations, we discovered that oVirt does not 
support running Gluster on the internal interface while the hosted 
engine and management run on the external interface. This is not 
mentioned in the oVirt documentation and was discovered by Alex during 
FOSDEM, when he attended an oVirt talk where it was mentioned as a side 
note.


- An option is to look into SAN solutions, which are not only 
faster, but also probably more reliable. We might have an offer of 
support here that needs looking into.


During FOSDEM, Alex also got in touch with one oVirt and one Gluster 
developer. We also talked to a