Re: [ol-discuss] Ol-discuss Digest, Vol 88, Issue 1

Charles Horn Thu, 02 Jul 2015 04:36:07 -0700

On 30 June 2015 at 03:00, jessamyn c. west <jessa...@gmail.com> wrote:


> Apologies for a late reply, it seems like you've done some great work.
> We can work to update the documentation.
>
>
Thanks, I'm happy to be able to help out!



> > What should people like me who want to contribute to the codebase be
> > working on, and who should we liaise with at OL/IA to make sure we are
> > making useful contributions?
>
> I'm probably the right liaising person. Right now there is no active OL
> development, the main developer left in December and the backup
> developer is on his way out if he hasn't left already (not really
> project related afaict) and it's unclear whether even maintenance
> requests/tickets are being worked on, so I've been doing triage on
> bugs in addition to mostly just answering the support email that
> continues to come.
>
>
An IA dev, Giovanni, merged my test fixes and an old pull request of
Ben's that fixed some reported RDF issues. That made me realise that
getting code merged is only the first step. The RDF changes have not been
deployed as far as I can tell, so finding out what a production release
requires is the next step towards getting any new fixes or features out
there and usable.



> The biggest challenge I have facing me now is the spam issue. We have
> a bunch of people adding spammy entries in Korean and no good way to
> either keep them out or even go back and bulk delete pattern-matched
> spam. I'd love to have a tool that does this. I'm not sure if a person
> working on the code would also need admin privs in terms of their
> account so please let me know if that's a stumbling block.
>
>
I noticed the spamming, which seems to be ongoing, and spent a bit of time
looking into how it was possible. It seems some level of human intervention
is required to beat the reCAPTCHAs, but that is something spammers
frequently do. The reCAPTCHAs on Open Library may not be using the latest
techniques available from Google, so there might be a way to improve the
spam protection to some extent, but from what I've read so far nothing
seems bulletproof, unfortunately.

For getting a list of the spam entries, I found an undocumented(?) API call
that filters recent changes by author and shows that account's adds and
edits. Some recent examples from spam accounts:

https://openlibrary.org/recentchanges.json?author=/people/aesdfaff
https://openlibrary.org/recentchanges.json?author=/people/dxcvxvcv
https://openlibrary.org/recentchanges.json?author=/people/cutecutie498

Starting with a list of spam users and then collecting all their added
works shouldn't be too difficult. What is the current method of removing
unwanted entries? I could have a go at something that produces lists of OL
ids of spam based on user ids and send them through if you don't already
have a way to filter them.
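
As a rough sketch of that idea, the Python below fetches the recent
changes for each suspected spam account and collects the keys of the
records those accounts touched. It is only a sketch under assumptions: the
response is assumed to be a JSON list of changesets, each with a "changes"
list of objects carrying a "key" field. The function names are made up for
illustration, and paging through long histories is not handled.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the example URLs above.
BASE_URL = "https://openlibrary.org/recentchanges.json"


def fetch_changes(user_key):
    """Fetch recent changes for one account, e.g. '/people/aesdfaff'."""
    query = urllib.parse.urlencode({"author": user_key}, safe="/")
    with urllib.request.urlopen(BASE_URL + "?" + query, timeout=30) as resp:
        return json.load(resp)


def collect_keys(changesets):
    """Return the sorted, de-duplicated record keys in a list of changesets.

    Assumption: each changeset looks like
    {"changes": [{"key": "/works/OL1W", ...}, ...], ...}.
    Changesets without a "changes" list are skipped.
    """
    keys = set()
    for changeset in changesets:
        for change in changeset.get("changes", []):
            key = change.get("key")
            if key:
                keys.add(key)
    return sorted(keys)


def spam_keys_for_users(user_keys, fetch=fetch_changes):
    """Map each user key to the record keys that account added or edited.

    `fetch` can be swapped out, so the logic can be tried offline.
    """
    return {user: collect_keys(fetch(user)) for user in user_keys}
```

Calling spam_keys_for_users(["/people/aesdfaff", "/people/dxcvxvcv"]) would
give one list of keys per account, which could be reviewed by hand before
anything is deleted.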


> Just getting a group of developers looking at the code would be
> incredibly helpful on my end, to have someone to bounce minor issues
> off of. Let me know how I can help.
>

You mentioned an issue with the waiting lists as a priority, but I couldn't
find a GitHub issue with the details. I had to fix up the waiting list
tests, so I could keep digging around there to see what I can improve.

I'm worried now that getting updates released might be a much harder goal
than getting code merged if there aren't any IA devs to oversee the release
process, or to provide support if something goes wrong. Merging code on
GitHub is one thing; getting it released sounds like it could be close to
impossible if there isn't a currently functioning pipeline. There's a
`production` branch on GitHub that is very far behind the current master
(last update 2011!). I'm not sure exactly what code is in production as of
now, but I thought it had been updated since 2011?

Charles.

_______________________________________________
Ol-discuss mailing list - Ol-discuss@archive.org
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
Archives: http://www.mail-archive.com/ol-discuss@archive.org/
To unsubscribe from this mailing list, send email to 
ol-discuss-unsubscr...@archive.org
