Things have been fairly stable for the last 48 hours, so here's a report on the 
current status:

Work done:

* Created a new API to write incoming test reports to a MySQL database
    * This removes Amazon SimpleDB and saves us $250/mo
    * In the future, we will be able to reduce our total disk usage by 
removing duplicate data
* Translated the Metabase reports to the new test report format
    * The new test report format has more fields for future expansion, 
including places for testers to report on all the dependencies of the 
distribution they tested
    * The API does this translation transparently for incoming reports
    * Another script is migrating all the existing data to the new format
* Started processing incoming test reports in parallel using the Minion job 
runner
    * Incoming reports generate a job on the queue
    * Worker processes pick up these jobs and handle individual reports
    * This work can be spread to multiple machines if we can get access to 
hardware
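
As a rough illustration of the job-per-report pattern described above, here is a minimal sketch in Python. It is not Minion and not the CPAN Testers code: the standard-library `queue` and `threading` modules stand in for the job runner, and `process_report`, `run_workers`, and the report fields are hypothetical names made up for this example.

```python
import queue
import threading


def process_report(report):
    # Hypothetical stand-in for the real per-report processing step.
    return {"id": report["id"], "grade": report["grade"].upper()}


def run_workers(reports, worker_count=4):
    """Queue one job per incoming report and let workers drain the queue."""
    jobs = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            report = jobs.get()
            if report is None:  # sentinel: no more work for this worker
                jobs.task_done()
                return
            processed = process_report(report)
            with lock:
                results.append(processed)
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for report in reports:  # each incoming report becomes one job
        jobs.put(report)
    for _ in threads:  # one sentinel per worker
        jobs.put(None)
    for t in threads:
        t.join()
    return sorted(results, key=lambda r: r["id"])
```

With a shared database-backed queue instead of an in-process one, the same shape lets workers run on more than one machine.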

Outstanding issues around this migration:

* The test_report table is latin1, and some reports are submitted with UTF-8 
characters.
    * Mitigation: `ascii => 1` in the serializer_options for the JSON column
    * Future Change: Make this table UTF-8 safe
* The Amazon Metabase instances are still up
    * After the next week or two of stable operation I will be shutting these 
down
* The original CPAN Testers generate process is still running
    * Once the Metabase instances are shut down, these processes will be removed 
from cron
* The Minion task runner must be moved to MySQL
    * Presently it is using SQLite and lock timeouts are an occasional annoyance
    * Moving to MySQL will let us have multiple machines running Minion workers.
    * MySQL allows for greater concurrency accessing the database (to insert 
new jobs and update job status)
    * The queue can grow to >10,000 unprocessed reports during the day, and I'd 
like to keep that from happening.
* The Minion task runner needs monitoring
    * Queue size will be a good indicator of how the system is doing
    * This will be a lot easier when Minion is using MySQL
* The test report and processed test report tables need monitoring
    * Right now the monitoring is manual: Andreas e-mails me once every 
couple of weeks to tell me that report processing has stopped
    * Counting the number of each and comparing the two should be a good 
indicator
    * InnoDB does not keep a cached row count, though, so `SELECT COUNT(*) FROM 
<table>` has to scan an index and is slow on large tables
        * These tables could be altered to add auto increment ID fields which 
could be a fast indicator of table size
* Some existing processes are still using the MySQL Metabase cache:
    * These processes are:
        * The original generate process
        * The view-report.cgi which views the full text of the report
    * Moving these processes to using the new test report format will improve 
performance
    * Then we can delete the Metabase cache to free up disk space
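
To show why the `ascii => 1` mitigation above keeps UTF-8 reports safe in a latin1 table, here is a small sketch using Python's `json` module as a stand-in for the Perl serializer. The function name `to_latin1_safe_json` and the sample data are made up for illustration; the point is only that escaping every non-ASCII character as `\uXXXX` produces text that any latin1 column can hold without loss.

```python
import json


def to_latin1_safe_json(data):
    # ensure_ascii=True escapes every non-ASCII character as \uXXXX,
    # so the serialized text contains only ASCII bytes.
    return json.dumps(data, ensure_ascii=True)


def round_trip_through_latin1(data):
    # Simulate storing the JSON text in a latin1 column and reading it back.
    stored = to_latin1_safe_json(data).encode("latin-1")
    return json.loads(stored.decode("latin-1"))
```

The escaped form is larger than raw UTF-8, which is one reason making the table itself UTF-8 safe is still the better long-term change.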

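The auto-increment idea from the monitoring items above could look roughly like this. This is only a sketch under assumptions: it uses an in-memory SQLite database in place of MySQL, the `processed_report` table name and all column names are hypothetical, and it assumes rows are never deleted, so the highest ID approximates the row count.

```python
import sqlite3


def backlog(conn):
    """Estimate unprocessed reports as the gap between the highest IDs."""
    cur = conn.cursor()
    # MAX(id) on a primary key is an index lookup, not a full count.
    cur.execute("SELECT COALESCE(MAX(id), 0) FROM test_report")
    submitted = cur.fetchone()[0]
    cur.execute("SELECT COALESCE(MAX(id), 0) FROM processed_report")
    processed = cur.fetchone()[0]
    return submitted - processed


def demo_connection():
    # Hypothetical schema for the demonstration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_report "
        "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
    )
    conn.execute(
        "CREATE TABLE processed_report "
        "(id INTEGER PRIMARY KEY AUTOINCREMENT, report_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO test_report (body) VALUES (?)", [("report",)] * 5
    )
    conn.executemany(
        "INSERT INTO processed_report (report_id) VALUES (?)",
        [(1,), (2,), (3,)],
    )
    conn.commit()
    return conn
```

A backlog that keeps growing between checks would be the signal that report processing has stopped.
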
Thanks to:

* Joel Berger for writing the CPAN::Testers::Backend::ProcessReports module at 
the 2017 Perl QA Summit
* Andreas König and Slaven Rezić for their help troubleshooting backwards 
compatibility issues and report processing issues
* Barbie for finding some missing parts from the new report processor and 
quickly writing new scripts to fix them
* Everyone who helped test the new API code before this migration (Chris 
Williams, Ioan Rogers)

Next Steps:

* The machine is occasionally overloaded by view-report.cgi, which causes 
report submissions to time out for 5-10 minutes at a stretch, so changing it 
to be a daemon that uses the new report format is a high priority.

Doug Bell
[email protected]



> On Aug 12, 2017, at 2:40 PM, Doug Bell <[email protected]> wrote:
> 
> The Metabase API has been changed over. Some bugs were fixed and everything 
> appears stable. If there are any problems, I will revert to the old Metabase 
> API. If I am unresponsive, testers can point their machines to 
> `metabase-old.cpantesters.org` to 
> reach the old Metabase API. All the old infrastructure is still ticking over 
> and will continue to do so until the new things have been stable for at least 
> a week.
> 
> Moving away from EC2 is a fairly major cost savings: what used to cost us 
> $250/mo now costs $0/mo. That said, the Metabase section of the 
> site has been its own server for a long time, and now we're adding its work 
> to our one existing server that already does everything else except the 
> database.
> 
> To help spread the work out, if anyone still has one or two older, outdated 
> servers sitting in a rack somewhere doing nothing and could make them 
> available to the CPAN Testers project, let me know. I know people have 
> volunteered hardware before, but I wasn't in a place where I could easily 
> make use of it. Now that the new Metabase API exists, and now that I am very 
> close to a Minion job-queue-based backend processing system, I can more 
> easily spread work across multiple machines.
> 
> For donating, you'll get a place on the sponsors list 
> (http://iheart.cpantesters.org), and you'll 
> help continue making Perl and CPAN a community of people collaborating on 
> stable, useful software projects.
> 
> If there are any other questions, problems, or concerns, please let me know.
> 
> Thanks,
> 
> 
> 
> Doug Bell
> [email protected]
> 
> 
> 
>> On Aug 9, 2017, at 3:03 PM, Doug Bell <[email protected]> wrote:
>> 
>> [Cross-posting from the CPAN Testers blog: 
>> http://blog.cpantesters.org/diary/209]
>> 
>> Summary: I will be doing work on the Metabase API on 2017-08-12. Writing 
>> test reports may be unresponsive for a few minutes, and there may be bugs. 
>> Please let me know if there are any problems submitting test reports.
>> 
>> I have completed the processing script for the new test report format. This 
>> was the last step in moving the Metabase API away from Amazon and on to our 
>> MySQL cluster for cost and stability reasons: Amazon SimpleDB is too 
>> expensive, and its limitations for our purposes outweigh its benefits. We have 
>> always maintained a copy of the Metabase data in our MySQL database, and 
>> there's no real need to continue having two live copies of the same data 
>> (especially when one of the copies costs money every time you ask for a 
>> piece of data).
>> 
>> This Saturday, 2017-08-12, around 1:00 PM US/Central (18:00 UTC), I will be 
>> switching DNS over to the new, backwards-compatible Metabase API which 
>> writes to our MySQL database. A few months ago, I asked for testers to try 
>> this new API out, and everything went well (thanks to everyone who helped 
>> with that). The new API works the same as the old API: No changes are needed 
>> for your testers or anyone consuming the minimal data feeds out of the 
>> Metabase API (the log.txt view).
>> 
>> Since this is only a DNS change, the downtime for the change should be zero 
>> as DNS propagates and your testers are pointed at the new IP address. Since 
>> it's possible for me to mess up this change, there may be some downtime. 
>> Since all software has bugs, there may be some downtime if any bugs are 
>> revealed by all the testers being migrated to the new API.
>> 
>> This change (and all the work around this change) sets up the project for 
>> new changes down the road:
>> 
>> * Speeding up report processing by triggering individual report processing 
>> jobs as reports are submitted
>> * Distributing those processing jobs over multiple machines to improve 
>> performance
>> * Making the test report text available immediately after submission instead 
>> of having to wait for backend processing jobs
>> 
>> If you have any questions, feel free to reply to this thread or to me 
>> directly.
>> 
>> Doug Bell
>> [email protected]
>> 
>> 
>> 
> 
