Re: [ol-discuss] Ol-discuss Digest, Vol 88, Issue 1

2015-07-02 Thread Charles Horn
On 30 June 2015 at 03:00, jessamyn c. west jessa...@gmail.com wrote:

 Apologies for a late reply, it seems like you've done some great work.
 We can work to update the documentation.


Thanks, I'm happy to be able to help out!



  What should people like me who want to contribute to the codebase be
  working on, and who should we liaise with at OL/IA to make sure we are
  making useful contributions?

 I'm probably the right liaising person. Right now there is no active OL
 development: the main developer left in December and the backup
 developer is on his way out, if he hasn't left already (not really
 project related afaict). It's unclear whether even maintenance
 requests/tickets are being worked on, so I've been doing triage on
 bugs in addition to mostly just answering the support email that
 continues to come in.


An IA dev, Giovanni, merged my test fixes and an old pull request of
Ben's that fixed some reported RDF issues. That made me realise that
getting code merged is only the first step. The RDF changes have not been
deployed as far as I can tell, so the next piece is finding out what a
production release requires, so that new fixes or features actually get
out there and become usable.



 The biggest challenge I have facing me now is the spam issue. We have
 a bunch of people adding spammy entries in Korean and no good way to
 either keep them out or even go back and bulk delete pattern-matched
 spam. I'd love to have a tool that does this. I'm not sure if a person
 working on the code would also need admin privs in terms of their
 account so please let me know if that's a stumbling block.


I noticed the spamming, which seems to be ongoing, and spent a bit of time
looking into how it was possible. It seems some level of human
intervention is required to beat the reCAPTCHAs, but that is something
spammers frequently do. The reCAPTCHAs on Open Library may not be
using the latest tricks available from Google, so there might be a way to
improve the spam protection to some extent, but from what I've read so far
nothing seems bulletproof, unfortunately.

For getting a list of the spam entries I found an undocumented(?) API call
that filters recent changes by author and shows their adds and edits. Some
recent examples from spam accounts:

https://openlibrary.org/recentchanges.json?author=/people/aesdfaff
https://openlibrary.org/recentchanges.json?author=/people/dxcvxvcv
https://openlibrary.org/recentchanges.json?author=/people/cutecutie498

Starting with a list of spam users and then collecting all their added
works shouldn't be too difficult. What is the current method of removing
unwanted entries? I could have a go at something that produces lists of OL
ids of spam based on user ids and send them through if you don't already
have a way to filter them.
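
A minimal sketch of that idea in Python, not a tested tool. It assumes the
recentchanges response is a JSON list of change records, each with a
"changes" list whose items carry a "key" such as /works/OL...W; that shape
is my assumption and should be checked against a real response. The
`limit` parameter and the function names are also hypothetical; only the
`author` parameter comes from the URLs above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://openlibrary.org/recentchanges.json"


def recent_changes_url(user_key, limit=100):
    """Build the recentchanges URL for one account, e.g. /people/aesdfaff.

    `limit` is an assumed query parameter; `author` is the one seen in
    the example URLs.
    """
    query = urlencode({"author": user_key, "limit": limit}, safe="/")
    return f"{BASE}?{query}"


def touched_keys(changes):
    """Return the unique document keys found in a parsed response.

    Assumes each change record looks like
    {"changes": [{"key": "/works/OL1W", "revision": 1}, ...], ...}.
    Records without a "changes" list are skipped. Order of first
    appearance is preserved.
    """
    seen = set()
    keys = []
    for change in changes:
        for doc in change.get("changes", []):
            key = doc.get("key")
            if key and key not in seen:
                seen.add(key)
                keys.append(key)
    return keys


def fetch_changes(user_key, limit=100):
    """Download and parse the recent changes for one account (network call)."""
    with urlopen(recent_changes_url(user_key, limit)) as resp:
        return json.load(resp)


def spam_keys_for_users(user_keys, limit=100):
    """Map each suspected spam account to the keys it added or edited."""
    return {user: touched_keys(fetch_changes(user, limit)) for user in user_keys}
```

Calling `spam_keys_for_users(["/people/aesdfaff", "/people/dxcvxvcv"])`
would give one list of OL keys per account, which could then be reviewed
by hand before anything is deleted.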


 Just getting a group of developers looking at the code would be
 incredibly helpful on my end, to have someone to bounce minor issues
 off of. Let me know how I can help.


You mentioned an issue with the waiting lists as a priority, but I couldn't
find a GitHub issue with the details. I had to fix up the waiting
list tests, so I could keep digging around there to see what I can improve.

I'm worried now that getting updates released might be a much harder goal
than getting code merged if there aren't any IA devs to oversee the release
process, or to provide support if something goes wrong. Merging code on
GitHub is one thing; getting it released sounds like it could be close to
impossible if there isn't a currently functioning pipeline. There's a
`production` branch on GitHub that is very far behind the current master
(last update 2011!). I'm not sure exactly what code is in production as of
now, but I thought it had been updated since 2011?

Charles.
___
Ol-discuss mailing list - Ol-discuss@archive.org
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
Archives: http://www.mail-archive.com/ol-discuss@archive.org/
To unsubscribe from this mailing list, send email to 
ol-discuss-unsubscr...@archive.org

Re: [ol-discuss] Ol-discuss Digest, Vol 88, Issue 1

2015-07-02 Thread Charles Horn
On 30 June 2015 at 04:29, Samuel Klein meta...@gmail.com wrote:


 On Jun 29, 2015 2:15 PM, Tom Morris tfmor...@gmail.com wrote:
 
 
  There's a bunch of stuff that could be done to streamline the spam
 flagging and processing, do more automated spam detection, etc.  The
 current web form reporting and lone pioneer cleaning up doesn't scale...
 
  It's sad to see because it wouldn't really take that much effort to make
 it a thriving and vibrant site, but IA just doesn't care.

 I was just in SF running a CODEX hackathon, and met some IA folk who
 talked about how to make OL awesome again + better integrated and supported.

 So at least some people are discussing caring. And I believe they're
 looking to hire ~5 developers across the Archive.

 A list of priorities — from the community of OL users and would-be
 contributors — as a subset of the many many open bugs and requests — could
 be helpful to any new push.

This sounds like great news. Is there a way to connect with these IA folk,
or just let them know we are interested in supporting a revival of the
project?

Charles.

Re: [ol-discuss] Ol-discuss Digest, Vol 88, Issue 1

2015-06-11 Thread Charles Horn
 recently picked up Python. I have a passion for books, and large open
data resources. I've been lurking around Open Library for years, but
haven't used it as much as I would like for reasons that seem quite common,
i.e. it's often difficult/impossible to fix something you find that is
broken. My initial interest in OL was in its listing of antiquarian
classical Greek books; I wanted to use the lists feature as a basic
collection manager (I use, contribute to, and like discogs.com for managing
my record collection).  I quickly discovered all the issues in how the
combination of antiquarian publications and non-Latin characters makes
cataloging difficult. There's still a lot I have to learn about the
librarianship of that, but I'm probably better placed to help code-wise!

Regards,
Charles Horn.

https://openlibrary.org/people/hornc

Re: [ol-discuss] OL cover puzzle

2013-12-22 Thread Charles Horn
This looks to be the original source of the Penguin cartoon:
http://www.jgoode.com/big-mouth-strikes-again-this-time-its-penguins/

I don't know what this designer has to do with Giotto though.

Charles.


On 23 December 2013 12:02, Karen Coyle kco...@kcoyle.net wrote:

 This book in OL:

 https://openlibrary.org/books/OL22985410M/Giotto.

 Is this book in Amazon:


 http://www.amazon.com/s/ref=nb_sb_noss/186-9380498-5837622?url=search-alias%3Daps&field-keywords=0789448513

 Can anyone figure out why the cover art is a penguin cartoon?

 (If we had a little bit of prize money, we could make a real game out of
 these kinds of anomalies)

 kc
 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet
