Hello

I am still running into many issues with SMW_refreshData.php when
refreshing pages with Virtuoso as the SPARQL store.

Some issues are caused by my data, and others need more troubleshooting
(a plain 'request denied' from the PHP script, even though the SPARQL query
itself works fine directly in Virtuoso).

I am looking for suggestions, from anyone familiar with the refresh process,
on a good place to start patching in two new options:

1- I would like the script to skip pages that fail to refresh for whatever
reason and log them somewhere so that the refresh can resume with the next
available page.

2- I would like to improve the way the script runs in batch mode by letting
it save the last updated page, so that I can do things like refresh 1000
pages a day from a script.

Right now, the script just stops whenever it runs into a page with issues.
Since I have more than 300,000 page IDs to process, running through the
whole thing with a delay of a few seconds between pages would take days.
It is not possible to sit there and babysit the script, restarting it every
time it fails.
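
To make the two options concrete, here is a rough sketch of the behaviour I
have in mind, written as an external Python wrapper rather than as a patch.
Everything in it is an assumption on my side: the names run_batch and
refresh_one, the state and log files, and in particular the idea that the
script can be called for a single ID with -s/-e and returns a non-zero exit
code on failure. That needs checking against the actual script.

```python
import json
import subprocess
import time
from pathlib import Path


def refresh_one(page_id, script="SMW_refreshData.php"):
    """Refresh one page ID and return True on success.

    Assumption: the script accepts -s/-e as start and end IDs and exits
    with a non-zero code when the refresh fails. Verify before relying on it.
    """
    cmd = ["php", script, "-s", str(page_id), "-e", str(page_id)]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def run_batch(last_id, batch_size, state_file, failed_log,
              runner=refresh_one, delay=0.0):
    """Process up to batch_size IDs, resuming from the saved position.

    Failed IDs are appended to failed_log and skipped, so one bad page
    never stops the run. Returns the list of IDs that failed in this batch.
    """
    state_path = Path(state_file)
    next_id = 1
    if state_path.exists():
        next_id = json.loads(state_path.read_text())["next_id"]

    failed = []
    stop = min(next_id + batch_size, last_id + 1)
    for page_id in range(next_id, stop):
        try:
            ok = runner(page_id)
        except Exception:
            ok = False  # treat any unexpected error as a failed page
        if not ok:
            failed.append(page_id)
            with open(failed_log, "a") as log:
                log.write(f"{page_id}\n")
        # Save progress after every page so a crash loses at most one ID.
        state_path.write_text(json.dumps({"next_id": page_id + 1}))
        if delay:
            time.sleep(delay)
    return failed
```

Calling run_batch(300000, 1000, "refresh_state.json", "failed_ids.log",
delay=2) once a day from cron would then walk through all the IDs in daily
chunks. Starting one PHP process per page is slow, which is why I would
rather have the same skip, log and resume logic inside the script itself.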

I don't mind looking into the code and proposing fixes on GitHub. I just
need a good place to start among the many layers of the script.

-- 
- Laurent Alquier
http://www.linfa.net
------------------------------------------------------------------------------
_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel