# What we did in April 2015

Much of the work done in April stemmed from a problem we ran into. The machine 
previously used for the OONI data pipeline had run out of disk storage (1TB), 
and the daily batch processing task was taking more than 6 hours to run. During 
this time the database and other important services on the data pipeline were 
not responsive, so we concluded that we had to start looking into ways of 
scaling our infrastructure “horizontally”.

Although the amount of data that we are currently ingesting (on average 1-2 GB 
per day) does not necessarily require a big-data-style solution, we expect this 
value to increase by at least 1 or 2 orders of magnitude (20-200 GB per day). 
Since we had only recently had to move to a more powerful machine, we decided 
to tackle this problem with future growth in mind.

To this end we:

* Experimented with various big data solutions and implemented some patches for 
the existing tools:
https://github.com/mumrah/kafka-python/pull/376
https://github.com/spotify/luigi/pull/910
https://github.com/Parsely/streamparse/pull/142

* We got in contact with various vendors of big data cloud and bare metal 
solutions in order to evaluate their offerings and see whether it would be 
possible to receive sponsorship from them.

* We started working on a Hadoop-based pipeline implementation:
https://github.com/hellais/ooni-pipeline-ng
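
To put the storage figures above in perspective, here is a rough 
back-of-the-envelope sketch. It assumes an empty 1 TB disk (taken as 1000 GB) 
that holds only the raw ingested data, with no database, indexes or derived 
output, so the real numbers would be lower. The function name is purely 
illustrative and not part of the pipeline.

```python
# Back-of-the-envelope estimate: days until a disk fills at a given ingest rate.
# Assumption: 1 TB is taken as 1000 GB and only raw ingested data is counted.
DISK_CAPACITY_GB = 1000


def days_until_full(ingest_gb_per_day, capacity_gb=DISK_CAPACITY_GB):
    """Return how many days of ingestion fit on an empty disk."""
    return capacity_gb / ingest_gb_per_day


# Current (1-2 GB/day) and expected (20-200 GB/day) ingest rates from the report.
rates = [1, 2, 20, 200]
estimates = {rate: days_until_full(rate) for rate in rates}

for rate, days in estimates.items():
    print(f"{rate:>3} GB/day -> {days:.0f} days")
```

Under these assumptions a 1 TB disk lasts 500 to 1000 days at the current 
rate, but only 5 to 50 days at the expected one.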

Moreover:

* We worked on organising the OONI hackathon and concluded that it would be 
ideal to postpone it to the end of summer (probably around September).

* We finished implementing an alpha prototype of libight for iOS that allows 
the user to run 3 basic OONI tests:
https://github.com/TheTorProject/libight-ios

* We updated the bouncer to point to another mlab-ns server.

* We released ooniprobe 1.3.1 and had it included in Debian stretch.

* We published the new OONI website: https://ooni.torproject.org/

* We implemented OONI tests for some censorship circumvention tools and 
analysed how they work:

meek: https://github.com/TheTorProject/ooni-probe/pull/387
https://github.com/TheTorProject/ooni-spec/pull/38

lantern: https://github.com/TheTorProject/ooni-probe/pull/388
https://github.com/TheTorProject/ooni-spec/pull/40

~ Arturo

_______________________________________________
tor-reports mailing list
[email protected]
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-reports