Author: sebb
Date: Wed Oct 14 23:44:40 2015
New Revision: 1708717
URL: http://svn.apache.org/viewvc?rev=1708717&view=rev
Log:
Run parseprojects.py as a cron job
Added:
comdev/projects.apache.org/scripts/cronjobs/parseprojects.py
- copied unchanged from r1704813, comdev/projects.apache.org/scripts/import/parseprojects.py
Removed:
comdev/projects.apache.org/scripts/import/parseprojects.py
Modified:
comdev/projects.apache.org/STRUCTURE.txt
comdev/projects.apache.org/scripts/README.txt
Modified: comdev/projects.apache.org/STRUCTURE.txt
URL: http://svn.apache.org/viewvc/comdev/projects.apache.org/STRUCTURE.txt?rev=1708717&r1=1708716&r2=1708717&view=diff
==============================================================================
--- comdev/projects.apache.org/STRUCTURE.txt (original)
+++ comdev/projects.apache.org/STRUCTURE.txt Wed Oct 14 23:44:40 2015
@@ -59,6 +59,7 @@ crontab -l -u www-data:
00 00 * * * cd /var/www/projects.apache.org/scripts/cronjobs && ./python3logger.sh countaccounts.py
00 00 * * * cd /var/www/projects.apache.org/scripts/cronjobs && ./python3logger.sh parsereleases.py
00 01 * * * cd /var/www/projects.apache.org/scripts/cronjobs && ./python3logger.sh parsecommitteeinfo.py
+00 01 * * * cd /var/www/projects.apache.org/scripts/cronjobs && ./python3logger.sh parseprojects.py
10 4 * * * cd /var/www/projects.apache.org/site/json && ( svn status | awk '/^\? / {print $2}' | xargs -r svn add )
Modified: comdev/projects.apache.org/scripts/README.txt
URL: http://svn.apache.org/viewvc/comdev/projects.apache.org/scripts/README.txt?rev=1708717&r1=1708716&r2=1708717&view=diff
==============================================================================
--- comdev/projects.apache.org/scripts/README.txt (original)
+++ comdev/projects.apache.org/scripts/README.txt Wed Oct 14 23:44:40 2015
@@ -43,22 +43,8 @@ various sources:
out: json/foundation/releases.json
+ json/foundation/releases-files.json
-
-2. importing data (import)
-
-- parsecommittees.py: Parses committee-info.txt to detect new and retired committees and imports PMC data (RDF) from
- PMC data files
- No longer needed, use parsecommitteeinfo instead
-
- parseprojects.py: Parses existing projects RDF(DOAP) files and turns them into JSON objects.
in: data/projects.xml + projects' DOAP files
out: site/json/projects/*.json - JSON versions of DOAP files
+ site/json/foundation/projects.json - combined listing of all projects
+ site/doap/{committeeId}/{project}.rdf - these are exact copies of the DOAPs listed in data/projects.xml
-
-NOTICE: what prevents import scripts to be added to cron?
-1. parse committees.py requires committee-info.txt, which is not available on project-vm (require authentication)
- Whimsy now supplies a JSON version of CI
-2. both scripts not only update files but sometimes need to add new files (new committees or new projects) or move
- (projects going to Attic or retired committees)
- TODO: any reason why scripts should not do this automatically?