Re: [OPEN-ILS-GENERAL] SPAM Re: Awesome Box Integration

McCanna, Terran Fri, 26 Sep 2014 08:43:29 -0700

We anonymize our data as much as we can without causing problems in the system. 
We archive old circulations and do not even allow patrons to opt in to store 
their reading histories. Yes, there are still remnants of the data there, and 
we obviously can't clear out data related to current transactions or to fines, 
but it is our intention to maintain as much patron privacy as possible. In 
fact, we take patron privacy far more seriously than most of our patrons do. 
System administrators that have direct access to the data can get all sorts of 
information if they take the time to find it, but the patron's complete circ 
history isn't available in their record or through a report. This limits the 
amount of information that any staff person can find out about anyone else, and 
it ensures that any Patriot Act or other information requests for circulation 
history have to come to the managing office where we can ensure that the 
request meets the letter of the law and that frontline circ staff at some 
remote branch aren't being pressured into giving out all of that information to 
members of government or law enforcement that should not have access to it 
without following proper procedures. I would hope that all library systems are 
taking equal care.


As long as the Awesome Box functionality is designed to hide its ties to 
individual patrons, then I think it's great, but I think it would do a 
disservice to patrons to simply dismiss privacy issues out of hand.


Terran McCanna 
PINES Program Manager 
Georgia Public Library Service 
1800 Century Place, Suite 150 
Atlanta, GA 30345 
404-235-7138 
[email protected] 

----- Original Message -----
From: "Rogan Hamby" <[email protected]>
To: "Evergreen Discussion Group" <[email protected]>
Sent: Friday, September 26, 2014 10:55:27 AM
Subject: Re: [OPEN-ILS-GENERAL] ***SPAM*** Re: Awesome Box Integration


FWIW, there isn't any reason for patron data to be exposed, or for there to be 
a privacy issue, at the display level here. The privacy discussion is really a 
distraction from the Awesome Box discussion, in my opinion. Some libraries may 
anonymize (or wipe) older data while others don't, but that data existing and 
being used under the hood is a totally different thing from exposing it to 
users (my point). Now, if you do wipe it, you obviously don't want to suddenly 
have features that depend on it, which is an important point for those who do 
wipe it (and I wonder if their libraries are expressly exempt from record 
retention laws), but that was Kathy's point about configurability. And even if 
you did integrate historical circulations into Awesome Box, that doesn't mean 
the data has to be used the same way for all types of users with different 
levels of anonymization. Of course, I suspect that some who think their data is 
wiped don't understand that it probably is not. Evergreen does not natively 
erase or anonymize old information; it's just inaccessible to casual users, 
which is not the same as not existing. That's a fairly common mistake for users 
not familiar with the database layer. 


Clear as mud? So, as I said, I suspect that if we don't want to completely 
derail this with tangents, it's probably best to put the privacy issue aside and 
look at Awesome Box features not tied to patron-specific data. 






On Fri, Sep 26, 2014 at 10:44 AM, Ruth Frasur < [email protected] > wrote: 



I don't have anything of value to add to this other than while, of course, I 
love the idea of reader recommendations and Awesome Box integration in any 
form, I also think there would HAVE to be some type of anonymizing (sp?) of 
patron data. I don't think this is impossible BUT, as Rogan has said, there is 
a definite danger of project creep. My suggestion, fwiw, is to find some 
first/second step for Awesome Box integration and focus more on building a 
foundation (that may or may not have truly visible/useful features for end 
users) on which others (or other projects) could expand. 




On Thu, Sep 25, 2014 at 11:56 PM, Rogan Hamby < [email protected] > 
wrote: 


I'm concerned with project creep as well, as I noted in one of my early 
missives. If this is stored independent of patron data (which I actually think 
it should be), then I think we should also track circs since the feature was 
turned on so it could say "3 out of 4" people found it awesome. 
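
As a rough illustration of the "3 out of 4" idea above, here is a minimal 
Python sketch, assuming a per-title tally that holds no patron identifier at 
all. The names ItemTally, record_checkin, and summary are hypothetical and are 
not part of Evergreen or Awesome Box. 

```python
from dataclasses import dataclass


@dataclass
class ItemTally:
    """Anonymous per-title counters; nothing here points back to a patron."""
    circs_since_enabled: int = 0
    awesome_count: int = 0


# Hypothetical in-memory store keyed by bibliographic record id.
tallies = {}


def record_checkin(record_id, tagged_awesome):
    """Count one check-in and, if applicable, one awesome tag."""
    tally = tallies.setdefault(record_id, ItemTally())
    tally.circs_since_enabled += 1
    if tagged_awesome:
        tally.awesome_count += 1


def summary(record_id):
    """Build the display string for a title."""
    tally = tallies.get(record_id)
    if tally is None or tally.circs_since_enabled == 0:
        return "No ratings yet"
    return (f"{tally.awesome_count} out of {tally.circs_since_enabled} "
            "people found it awesome")
```

Note that this sketch counts check-ins rather than distinct people, so a 
patron who borrows the same title twice is counted twice. That is the price of 
keeping no patron link. 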


Stepping back a bit to recommendations and anonymizing records, we don't 
anonymize historical circs. We don't expose that data and take staff level 
access to it pretty seriously. Due to varying state and county regulations 
dictating minimum record retentions we're still at least 2 years out from being 
able to safely wipe our oldest records. Maybe more. 


And anonymizing it closes certain opportunities. Some are mundane like 
addressing old conflicts and billing questions but those can be big in their 
own right. As the circ manager who talks to the upset patron I may have a 
different point of view on that. :) 


Analyzing circulation patterns is far more interesting though and I am long 
term interested in recommendations. In the age of Amazon, Netflix and everyone 
else this is not just valuable but expected. It's perhaps the patron request I 
hear most. 



Coupled with some holds features it would be a great boon for homebound 
services, which I feel are a critical function of libraries, at least in my 
state where it's a strong traditional service. I assume elsewhere as well, 
though I know mileage varies. 


And it was the building block of several functions that GA PINES identified as 
critical for TBS support during the Loblolly conference. We may never fully 
support TBS programs in Evergreen but I thought GA PINES collected a lot of 
great ideas and input there and would hate to discard that. 



On Thursday, September 25, 2014, Kathy Lussier < [email protected] > wrote: 




Hi all, 

Great discussion so far! 

We had a bit of a discussion about privacy concerns in IRC after Terran sent 
her original message. One approach we were discussing was storing the awesome 
tags in an anonymous fashion, except in cases where patrons have opted into 
saving their circ history. In those cases, the user has already consented to 
having this information saved and could have a more enhanced experience with 
the recommendation engine. Others who were part of the discussion could 
elaborate or correct me if I'm not articulating the ideas correctly. The 
discussion can be found at 
http://irc.evergreen-ils.org/evergreen/2014-09-25#i_126632 . 
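
One way to picture the opt-in approach described above is the following Python 
sketch. It is only an illustration under stated assumptions: the store is an 
in-memory list, and store_awesome_tag, tags_for_patron, and the 
keeps_circ_history flag are hypothetical names, not Evergreen settings or APIs. 

```python
def store_awesome_tag(store, record_id, patron_id, keeps_circ_history):
    """Append one awesome tag, keeping the patron link only with consent."""
    row = {
        "record_id": record_id,
        # Hypothetical rule: drop the patron link unless the patron opted in
        # to saving their circ history.
        "patron_id": patron_id if keeps_circ_history else None,
    }
    store.append(row)
    return row


def tags_for_patron(store, patron_id):
    """Return the titles a given opted-in patron tagged as awesome."""
    if patron_id is None:
        return []  # anonymous rows can never be looked up by patron
    return [row["record_id"] for row in store if row["patron_id"] == patron_id]
```

Anonymous rows still count toward a title's total, but only opted-in patrons 
leave anything a recommendation engine could personalize on. 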

In relation to genres, Vanya said: 



Maybe, as a solution to that, we can have a hierarchical algorithm for 
categorizing. In other words, we can allow the administrator to decide whether 
the categorization comes all the way down to genres, or just takes into account 
the overall weight of the user's awesome tag. 

I like the idea of making this configurable, because there may be systems where 
data identifying genre is a little more clear cut. Better yet, how about if we 
allow an Evergreen site to define the categories that are used? Some sites may 
use the MARC fixed fields for fiction/non-fiction. Other sites may decide that 
values stored in the 655 MARC field work for them. 
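
To make the site-defined categories idea concrete, here is a small Python 
sketch. It assumes a record is already available as a plain dict with an "008" 
string and a list of 655 $a values, which is a simplification for 
illustration. The function names and the site_setting values are hypothetical, 
not existing Evergreen configuration. 

```python
def category_from_fixed_field(record):
    """Fiction vs. non-fiction from 008 position 33 (literary form, books)."""
    field_008 = record.get("008", "")
    code = field_008[33] if len(field_008) > 33 else ""
    if code == "0":
        return ["non-fiction"]
    if code == "1":
        return ["fiction"]
    # Other literary form codes exist; this sketch lumps them together.
    return ["unknown"]


def category_from_655(record):
    """Genre/form terms from 655 $a, lightly normalized."""
    terms = [t.strip().rstrip(".").lower() for t in record.get("655", [])]
    return terms or ["unknown"]


# Hypothetical per-site choice of where categories come from.
CATEGORY_SOURCES = {
    "fixed_field": category_from_fixed_field,
    "genre_655": category_from_655,
}


def categorize(record, site_setting):
    """Return the categories for a record using the site's chosen source."""
    return CATEGORY_SOURCES[site_setting](record)
```

A site with inconsistent 655 data could stay on "fixed_field", while a site 
with controlled genre headings could switch to "genre_655" without any change 
to the code that consumes the categories. 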

Is there something that already exists in Evergreen that we could leverage for 
this purpose? My first thought was MVF. 

I do have one general recommendation speaking with my OPW admin hat on. It 
really is a general recommendation for any of the OPW candidates who might be 
following along. I mentioned in IRC today that I'm not a developer, but I've 
managed a lot of development projects, and one thing I try to watch out for is 
project creep. As we continue to talk about the project and think of new 
configuration options to make it a more flexible project, it can also become a 
very large project that isn't as easy to manage. 

Therefore, as you think through how you plan to implement the project, I 
recommend breaking it up into distinct milestones. You might want to start with 
smaller tasks as you ease into the project (e.g. collecting the awesome tags 
and sending them along to the Awesome Box site), and then move on to the larger 
components once you become more familiar with the system. 

Kathy 


Kathy Lussier
Project Coordinator
Massachusetts Library Network Cooperative 
(508) 343-0128 
[email protected] 
Twitter: http://www.twitter.com/kmlussier 
#evergreen IRC: kmlussier 

On 9/25/2014 6:40 PM, Tim Spindler wrote: 



Overall, I really like the ideas talked about but I agree with Terran that 
something would have to be done with circ data related to patrons. We use the 
purge function to anonymize our patron data but I could see other ways of 
dealing with this. We also have retention policies related to retaining patron 
circulation data. 



On Thu, Sep 25, 2014 at 4:54 PM, Rogan Hamby < [email protected] > 
wrote: 



I suppose I don't understand the concern on your part as at that level if 
someone could access the raw db they could just query someone's circulation 
history, fine payments, etc... since those are recorded as transactions unless 
you're doing something to anonymize or wipe those as soon as they're done. Even 
then someone could see all current transactions at that level. 








On Thu, Sep 25, 2014 at 4:33 PM, McCanna, Terran < 
[email protected] > wrote: 


This relies on the circulation and rating data still being tied to the patron 
in the system, though - yes, it'd be on the database side and not on public 
view, but it's still creating a picture of a patron's reading history that has 
privacy implications. Of course, this feature should be set for systems to 
enable or disable, so that systems that are concerned about privacy simply 
won't turn it on. (PINES, for example, limits the retention of circulation 
history in the system as much as we can because of our privacy policies, so any 
feature that is linked to a patron's history would be unusable for us.) 

If ranking data were stored completely independently of the patron, then 
library systems would be able to use it without privacy concerns, and patrons 
wouldn't even need to be logged in to use it - but then it wouldn't be able to 
give completely customized recommendations to a specific patron, either. It's a 
definite tradeoff. 


Terran McCanna 
PINES Program Manager 
Georgia Public Library Service 
1800 Century Place, Suite 150 
Atlanta, GA 30345 
404-235-7138 
[email protected] 

----- Original Message ----- 
From: "Vanya Jauhal" < [email protected] > 
To: "Evergreen Discussion Group" < [email protected] > 


Sent: Thursday, September 25, 2014 3:41:02 PM 
Subject: Re: [OPEN-ILS-GENERAL] Awesome Box Integration 



Hello Rogan 

This is exactly what I had in mind. All the recommendation processing will take 
place in the background, and all the user will see is a recommendation and not 
the information of any other patron. This way the user's experience with 
Awesome Box will be enhanced. 


And yes, we can maybe, start off with some broad level genres, like, as you 
mentioned, fiction, non-fiction, documentaries, etc. Then, depending upon the 
infrastructure of the system and the response of that categorization, we can 
build upon the algorithm accordingly. 


You are right: it would be a big task in itself, but since the parameters 
involved are few and explicit, it is simplified to an extent. 






On Fri, Sep 26, 2014 at 12:50 AM, Rogan Hamby < [email protected] > 
wrote: 



I don't see an issue with doing analysis of circulation patterns on the backend 
so long as nothing identifying is exposed. 


For example, if all I saw as a patron was a tab in my opac that said "you 
thought The Yiddish Policeman's Union was Awesome! Some others who did also 
thought this was Awesome .... " I don't see that as different from doing the 
same thing with circulations. It's not telling patrons even what the points of 
comparison were unless they only had a single item in their circulation history 
and even then it doesn't tell them how many other patrons, how much, etc.... 


I'm dubious about subject headings also but wouldn't want to dismiss it out of 
hand. It might work. Without doing some experimenting I could see it going 
either way. Some fixed fields I could see working, like fiction and 
non-fiction. Age groups? Well, at least I can tell you I can't rely on those in 
my catalog. :) 


However, I also worry that reading recommendations based on circulation history 
could easily grow into a much more complicated task, especially depending on 
how we deliver those recommendations. Looking at a single boolean value tied to 
the user and item (circ table?) could still be quite a project by itself 
especially once all the useful bits and pieces are built in. 









On Thu, Sep 25, 2014 at 2:37 PM, McCanna, Terran < 
[email protected] > wrote: 


Agreed - it's a great idea in theory, but I'm not sure how well it would work 
in actual practice. Even in a single library, genre subject headings are 
usually pretty inconsistent in the MARC records because of copy cataloging, and 
that usually gets even more inconsistent in a consortium of libraries. Perhaps 
it could be partially weighted on genre subject headings, but not overly 
reliant on them? It might be worth considering the fixed field values for 
fiction vs. non-fiction and for age groups, too. 

I love the idea of providing recommendations based on other people that have 
similar taste ("other people that liked this book also liked these books...") 
but if the data is tied to actual patrons (and I'm not sure how it couldn't be) 
then quite a few library systems would face legal privacy issues and wouldn't 
be able to use it. We're currently using a commercial service to pull in 
reading recommendations because the recommendations can't be tied back to any 
of our patrons. 


Terran McCanna 
PINES Program Manager 
Georgia Public Library Service 
1800 Century Place, Suite 150 
Atlanta, GA 30345 
404-235-7138 
[email protected] 



----- Original Message ----- 
From: "Rogan Hamby" < [email protected] > 
To: "Evergreen Discussion Group" < [email protected] > 
Sent: Thursday, September 25, 2014 2:02:58 PM 
Subject: Re: [OPEN-ILS-GENERAL] Awesome Box Integration 


I can see some challenges to tracking genre and I'd be hesitant to put too much 
value on it. There are ways to catalog it, but in my experience relying on it 
actually being in records (much less being consistent) is risky in 
organizations that do a lot of copy cataloging or don't have centralized and 
controlled cataloging, and there are quite a few in that boat. 


That concern aside, I've always thought this would be a fun and potentially 
valuable thing to add. 


On Thu, Sep 25, 2014 at 1:44 PM, Vanya Jauhal < [email protected] > wrote: 











Hello everyone 

I'm Vanya, from India. I'm a candidate for the OPW Round 9 internship with 
Evergreen. 

While discussing the idea of Awesome Box integration with Evergreen, Kathy and 
I discussed the possibility of making the Evergreen support for Awesome Box 
more interpretive using Artificial Intelligence. 

What if we could train the system to give weightage to people's "awesome" tags 
on items, depending upon how much their previous tags are appreciated by other 
people? 

For example: Let's say you tag a book to be awesome. Now, if 100 other people 
check that book in, and (let's say) 80 of them also tag it to be awesome, it 
will mean that your opinion matches that of the majority. On the other hand, if 
100 other people check that book in and (say) only 5 of them tag it as awesome, 
this would mean that your awesome tag does not agree with the majority. 
So, in the former case, your awesome tag can be given more weightage as 
compared to the latter. 

Also, the weightage may vary according to genres. So- you may have a good taste 
in mystery books but your taste in classical literature might not be the same 
as the majority crowd. So- the weightage of your awesome tag in mystery would 
be higher than classical literature. 
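
The weighting described above can be sketched in a few lines of Python. This 
is a minimal illustration, not a proposed implementation: it assumes the 
weight is simply the average share of other borrowers who agreed with each of 
your tags, computed per genre, and the function name and input shapes are 
hypothetical. The patron values are placeholders for whatever opaque 
identifier a site chooses to keep. 

```python
from collections import defaultdict


def tag_weights(tags, checkins, genres):
    """Weight each patron's awesome tag per genre by agreement with others.

    tags:     set of (patron, item) pairs that were tagged awesome
    checkins: dict mapping item -> set of patrons who checked it in
    genres:   dict mapping item -> genre label
    Returns a dict mapping (patron, genre) -> weight between 0 and 1.
    """
    agreement = defaultdict(list)
    for patron, item in tags:
        others = checkins[item] - {patron}
        if not others:
            continue  # nobody to compare against yet
        agreeing = sum(1 for other in others if (other, item) in tags)
        agreement[(patron, genres[item])].append(agreeing / len(others))
    # Average the agreement ratios over all of the patron's tags in a genre.
    return {key: sum(vals) / len(vals) for key, vals in agreement.items()}
```

With the numbers from the example, a tag that 80 of 100 other borrowers share 
yields a weight of 0.8 in that genre, and a tag that only 5 of 100 share 
yields 0.05. 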

We can even extend it to provide recommendations to users depending on their 
coherence with other users with similar taste. 

I am looking forward to your suggestions and feedback on this. 

Thank you for your time 

Vanya 




-- 



Rogan Hamby, MLS, CCNP, MIA 
Managers Headquarters Library and Reference Services, 
York County Library System 


“You can never get a cup of tea large enough or a book long enough to suit me.” 
― C.S. Lewis 




-- 
Tim Spindler 
[email protected] 




Go Green - Save a tree! Please don't print this e-mail unless it's really 
necessary. 




-- 
Ruth Frasur 
Director of the Historic(ally Awesome) Hagerstown - Jefferson Township Library 
10 W. College Street in Hagerstown, Indiana (47346) 
p (765) 489-5632 ; f (765) 489-5808 





