On 30 June 2015 at 03:00, jessamyn c. west <jessa...@gmail.com> wrote:
> Apologies for a late reply, it seems like you've done some great work.
> We can work to update the documentation.

Thanks, I'm happy to be able to help out!

> > What should people like me who want to contribute to the codebase be
> > working on, and who should we liaise with at OL/IA to make sure we
> > are making useful contributions?
>
> I'm probably the right liaising person. Right now there is no active OL
> development, the main developer left in December and the backup
> developer is on his way out if he hasn't left already (not really
> project related afaict) and it's unclear whether even maintenance
> requests/tickets are being worked on, so I've been doing triage on
> bugs in addition to mostly just answering the support email that
> continues to come.

An IA dev, Giovanni, merged my test fixes and an old pull request of
Ben's, which fixed some reported RDF issues. That made me realise that
getting code merged is only the first step. The RDF changes have not
been deployed as far as I can tell, so knowing what is required for a
production release is the next piece needed to get any new fixes or
features out there and usable.

> The biggest challenge I have facing me now is the spam issue. We have
> a bunch of people adding spammy entries in Korean and no good way to
> either keep them out or even go back and bulk delete pattern-matched
> spam. I'd love to have a tool that does this. I'm not sure if a person
> working on the code would also need admin privs in terms of their
> account so please let me know if that's a stumbling block.

I noticed the spamming, which seems to be ongoing, and spent a bit of
time looking into how that was possible. It seems like there is some
level of human intervention required to beat the re-captchas, but it's
something that spammers frequently do.
The re-captchas on Openlibrary may not be using the latest tricks
available from Google, so there might be a way to improve the spam
protection to some extent, but from what I've read so far nothing seems
bulletproof, unfortunately.

For getting a list of the spam entries I found an undocumented(?) API
call to filter recent changes by author, which shows the adds and
edits. Some recent examples from spam accounts:

https://openlibrary.org/recentchanges.json?author=/people/aesdfaff
https://openlibrary.org/recentchanges.json?author=/people/dxcvxvcv
https://openlibrary.org/recentchanges.json?author=/people/cutecutie498

Starting with a list of spam users and then collecting all their added
works shouldn't be too difficult. What is the current method of removing
unwanted entries? I could have a go at something that produces lists of
OL ids of spam based on user ids and send them through if you don't
already have a way to filter them.

> Just getting a group of developers looking at the code would be
> incredibly helpful on my end, to have someone to bounce minor issues
> off of. Let me know how I can help.

You mentioned an issue with the waiting lists as a priority, but I
couldn't find a GitHub issue with the details. I had to fix up the
waiting list tests, so I could keep digging around there to see what I
can improve.

I'm worried now that getting updates released might be a much harder
goal than getting code merged if there aren't any IA devs to oversee the
release process, or to support it if something should go wrong. Merging
code on GitHub is one thing; getting it released sounds like it could be
close to impossible if there isn't a currently functioning pipeline.
There's a `production` branch on GitHub that is very far behind the
current master (last update 2011!). I'm not sure exactly what code is in
production as of now, but I thought it had been updated since 2011?

Charles.
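
The idea above of starting from a list of spam users and collecting
everything they touched could be sketched roughly as below. This is only
a sketch under assumptions: the `author` query parameter is taken from
the URLs above, but the response shape (a JSON list of changesets, each
with a `changes` list of objects carrying a `key`) is assumed rather
than confirmed, the function names are made up for illustration, and
only the first page of results is read.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://openlibrary.org/recentchanges.json"


def changes_url(user_key):
    # Build the recent-changes URL for one account, e.g. "/people/aesdfaff".
    # Slashes are left unescaped to match the URLs quoted in the thread.
    return f"{BASE}?{urlencode({'author': user_key}, safe='/')}"


def touched_keys(changesets):
    # ASSUMED response shape: a list of changesets, each with a "changes"
    # list whose items have a "key" such as "/works/..." or "/books/...".
    # Returns the distinct keys in first-seen order.
    keys, seen = [], set()
    for changeset in changesets:
        for change in changeset.get("changes", []):
            key = change.get("key")
            if key and key not in seen:
                seen.add(key)
                keys.append(key)
    return keys


def fetch_changesets(user_key):
    # Reads only the first page of results; any paging is not handled here.
    with urlopen(changes_url(user_key), timeout=30) as response:
        return json.load(response)


def keys_by_user(user_keys, fetch=fetch_changesets):
    # Map each suspected spam account to the document keys it touched.
    # `fetch` is injectable so the logic can be tried without network access.
    return {user: touched_keys(fetch(user)) for user in user_keys}
```

Calling `keys_by_user(["/people/aesdfaff", "/people/dxcvxvcv"])` would
query the live site, so the resulting lists would still need a human
check before anything is deleted.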
_______________________________________________
Ol-discuss mailing list - Ol-discuss@archive.org
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
Archives: http://www.mail-archive.com/ol-discuss@archive.org/
To unsubscribe from this mailing list, send email to ol-discuss-unsubscr...@archive.org