On 30 June 2015 at 03:00, jessamyn c. west jessa...@gmail.com wrote:
Apologies for the late reply. It seems like you've done some great work.
We can work to update the documentation.
Thanks, I'm happy to be able to help out!
What should people like me who want to contribute to the codebase be
working on, and who should we liaise with at OL/IA to make sure we are
making useful contributions?
I'm probably the right liaising person. Right now there is no active OL
development: the main developer left in December, and the backup
developer is on his way out if he hasn't left already (not really
project related afaict). It's unclear whether even maintenance
requests/tickets are being worked on, so I've been doing triage on
bugs in addition to mostly just answering the support email that
continues to come in.
An IA dev, Giovanni, merged my test fixes and an old pull request of
Ben's, which fixed some reported RDF issues. That made me realise that
getting code merged is only the first step. The RDF changes have not been
deployed as far as I can tell, so the next piece is knowing what is
required for a production release. Without that, no new fixes or features
will get out there and become usable.
The biggest challenge I have facing me now is the spam issue. We have
a bunch of people adding spammy entries in Korean and no good way to
either keep them out or even go back and bulk delete pattern-matched
spam. I'd love to have a tool that does this. I'm not sure if a person
working on the code would also need admin privs in terms of their
account so please let me know if that's a stumbling block.
I noticed the spamming, which seems to be ongoing, and spent a bit of time
looking into how that was possible. It seems that some level of human
intervention is required to beat the reCAPTCHAs, but that is something
spammers frequently do. The reCAPTCHAs on Open Library may not be using the
latest tricks available from Google, so there might be a way to improve the
spam protection to some extent, but from what I've read so far nothing
seems bulletproof, unfortunately.
For getting a list of the spam entries, I found an undocumented(?) API call
that filters recent changes by author and shows the adds and edits. Some
recent examples from spam accounts:
https://openlibrary.org/recentchanges.json?author=/people/aesdfaff
https://openlibrary.org/recentchanges.json?author=/people/dxcvxvcv
https://openlibrary.org/recentchanges.json?author=/people/cutecutie498
Starting with a list of spam users and then collecting all their added
works shouldn't be too difficult. What is the current method of removing
unwanted entries? I could have a go at something that produces lists of OL
ids of spam based on user ids and send them through if you don't already
have a way to filter them.
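A rough sketch of how such a list could be produced from the
recentchanges call above. This is an illustration under assumptions, not a
tested tool: the `limit` and `offset` paging parameters, the response shape
(a JSON list of changesets, each with a `changes` list of `key`/`revision`
pairs), and the idea that `revision == 1` marks a newly created record are
all assumptions, and the function names are made up for the example.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://openlibrary.org/recentchanges.json"


def fetch_changes(user_key, limit=100, offset=0):
    """Fetch one page of recent changes for a user key like '/people/aesdfaff'."""
    # 'limit' and 'offset' are assumed paging parameters, not confirmed here.
    query = urllib.parse.urlencode(
        {"author": user_key, "limit": limit, "offset": offset}
    )
    with urllib.request.urlopen(f"{BASE_URL}?{query}") as response:
        return json.load(response)


def touched_keys(changesets, created_only=False):
    """Collect the keys of the records touched by the given changesets."""
    keys = set()
    for changeset in changesets:
        # Assumed shape: {"changes": [{"key": "/works/OL1W", "revision": 1}]}
        for change in changeset.get("changes", []):
            key = change.get("key")
            if not key:
                continue
            # Assumption: revision 1 means the record was created by this change.
            if created_only and change.get("revision") != 1:
                continue
            keys.add(key)
    return keys


def spam_candidates(user_keys, fetch=fetch_changes, page_size=100,
                    max_pages=50, created_only=False):
    """Map each suspected spam account to the sorted keys it added or edited."""
    result = {}
    for user_key in user_keys:
        keys = set()
        for page in range(max_pages):
            changesets = fetch(user_key, limit=page_size,
                               offset=page * page_size)
            keys |= touched_keys(changesets, created_only=created_only)
            # A short page is taken to mean there is nothing more to fetch.
            if len(changesets) < page_size:
                break
        result[user_key] = sorted(keys)
    return result
```

Calling `spam_candidates(["/people/aesdfaff"], created_only=True)` would
then give the keys that account appears to have created. Without
`created_only`, the result also includes existing records the account only
edited, so the list would need a manual check before anything is deleted.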
Just getting a group of developers looking at the code would be
incredibly helpful on my end, to have someone to bounce minor issues
off of. Let me know how I can help.
You mentioned an issue with the waiting lists as a priority, but I couldn't
find a GitHub issue with the details. I had to fix up the waiting list
tests, so I could keep digging around there to see what I can improve.
I'm worried now that getting updates released might be a much harder goal
than getting code merged if there aren't any IA devs to oversee the release
process, or to provide support if something should go wrong. Merging code
on GitHub is one thing; getting it released sounds like it could be close
to impossible if there isn't a currently functioning pipeline. There's a
`production` branch on GitHub that is very far behind the current master
(last update 2011!). I'm not sure exactly what code is in production as of
now, but I thought it had been updated since 2011?
Charles.
___
Ol-discuss mailing list - Ol-discuss@archive.org
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
Archives: http://www.mail-archive.com/ol-discuss@archive.org/
To unsubscribe from this mailing list, send email to
ol-discuss-unsubscr...@archive.org