[OPEN-ILS-GENERAL] ***SPAM*** Re: Awesome Box Integration
Hello everyone,

As Kathy mentioned, making the recommendation system configurable would make it more flexible, and administrators could choose settings to suit their own needs. As far as the privacy issues are concerned, we can initially build the model on people who volunteer. And yes, we will have to implement it in blocks. As you mentioned, collecting the awesome tags and sending them along to the Awesome Box site can be the starting point, and the ML part can be built simultaneously. I am currently working on Detecting and Categorizing Circles in Social Networks, which works on a similar principle, so I expect that training on the data as it is collected will not be a large overhead.

Also, I think Rogan has a very good point. We can store the data independently of the patron data: every user can be given a user ID that is used for all of their further transaction records. This would effectively anonymise the user information, and access to the table linking each user to the corresponding user ID can be restricted according to the needs of the organization. The suggestion system can then be built on this without loss of any information, and it will let us give users proper recommendations that otherwise would not have been possible. What do you all think?

On Fri, Sep 26, 2014 at 9:26 AM, Rogan Hamby rogan.ha...@yclibrary.net wrote:

I'm concerned with project creep as well, as I noted in one of my early missives. If this is stored independent of patron data (which actually I think it should be), then I think we should also track circs since the feature was turned on, so it could say 3 out of 4 people found it awesome.

Stepping back a bit to recommendations and anonymizing records, we don't anonymize historical circs. We don't expose that data and take staff level access to it pretty seriously.
Due to varying state and county regulations dictating minimum record retentions, we're still at least 2 years out from being able to safely wipe our oldest records. Maybe more. And anonymizing it closes certain opportunities. Some are mundane, like addressing old conflicts and billing questions, but those can be big in their own right. As the circ manager who talks to the upset patron I may have a different point of view on that. :)

Analyzing circulation patterns is far more interesting though, and I am long term interested in recommendations. In the age of Amazon, Netflix and everyone else this is not just valuable but expected. It's perhaps the patron request I hear most. Coupled with some holds features it would be a great, great boon for home bound services, which I feel are a critical function of libraries, at least in my state where it's a strong traditional service. I assume elsewhere as well, though I know mileage varies. And it was the building block of several functions that GA PINES identified as critical for TBS support during the Loblolly conference. We may never fully support TBS programs in Evergreen but I thought GA PINES collected a lot of great ideas and input there and would hate to discard that.

On Thursday, September 25, 2014, Kathy Lussier kluss...@masslnc.org wrote:

Hi all,

Great discussion so far! We had a bit of a discussion about privacy concerns in IRC after Terran sent her original message. One approach we were discussing was storing the awesome tags in an anonymous fashion, except in cases where patrons have opted into saving their circ history. In those cases, the user has already consented to having this information saved and could have a more enhanced experience with the recommendation engine. Others who were part of the discussion could elaborate or correct me if I'm not articulating the ideas correctly. The discussion can be found at http://irc.evergreen-ils.org/evergreen/2014-09-25#i_126632.
In relation to genres, Vanya said:

Maybe, as a solution to that, we can have a hierarchical algorithm for categorizing. In other words, we can allow the administrator to decide whether the categorization comes all the way down to genres, or just takes into account the overall weight of the user's awesome tag.

I like the idea of making this configurable, because there may be systems where data identifying genre is a little more clear cut. Better yet, how about if we allow an Evergreen site to define the categories that are used? Some sites may use the MARC fixed fields for fiction/non-fiction. Other sites may decide that values stored in the 655 MARC field work for them. Is there something that already exists in Evergreen that we could leverage for this purpose? My first thought was MVF.

I do have one general recommendation speaking with my OPW admin hat on. It really is a general recommendation for any of the OPW candidates who might be following along. I mentioned in IRC today that I'm not a developer, but I've managed a lot of development projects, and one thing I try to watch out
[OPEN-ILS-GENERAL] ***SPAM*** Re: Awesome Box Integration
I don't have anything of value to add to this other than that while, of course, I love the idea of reader recommendations and Awesome Box integration in any form, I also think there would HAVE to be some type of anonymizing (sp?) of patron data. I don't think this is impossible BUT, as Rogan has said, there is a definite danger of project creep. My suggestion, fwiw, is to find some first/second step for Awesome Box integration and focus more on building a foundation (that may or may not have truly visible/useful features for end users) on which others (or other projects) could expand.

On Thu, Sep 25, 2014 at 11:56 PM, Rogan Hamby rogan.ha...@yclibrary.net wrote:

I'm concerned with project creep as well, as I noted in one of my early missives. If this is stored independent of patron data (which actually I think it should be), then I think we should also track circs since the feature was turned on, so it could say 3 out of 4 people found it awesome.

Stepping back a bit to recommendations and anonymizing records, we don't anonymize historical circs. We don't expose that data and take staff level access to it pretty seriously. Due to varying state and county regulations dictating minimum record retentions, we're still at least 2 years out from being able to safely wipe our oldest records. Maybe more. And anonymizing it closes certain opportunities. Some are mundane, like addressing old conflicts and billing questions, but those can be big in their own right. As the circ manager who talks to the upset patron I may have a different point of view on that. :)

Analyzing circulation patterns is far more interesting though, and I am long term interested in recommendations. In the age of Amazon, Netflix and everyone else this is not just valuable but expected. It's perhaps the patron request I hear most.
Coupled with some holds features it would be a great, great boon for home bound services, which I feel are a critical function of libraries, at least in my state where it's a strong traditional service. I assume elsewhere as well, though I know mileage varies. And it was the building block of several functions that GA PINES identified as critical for TBS support during the Loblolly conference. We may never fully support TBS programs in Evergreen but I thought GA PINES collected a lot of great ideas and input there and would hate to discard that.

On Thursday, September 25, 2014, Kathy Lussier kluss...@masslnc.org wrote:

Hi all,

Great discussion so far! We had a bit of a discussion about privacy concerns in IRC after Terran sent her original message. One approach we were discussing was storing the awesome tags in an anonymous fashion, except in cases where patrons have opted into saving their circ history. In those cases, the user has already consented to having this information saved and could have a more enhanced experience with the recommendation engine. Others who were part of the discussion could elaborate or correct me if I'm not articulating the ideas correctly. The discussion can be found at http://irc.evergreen-ils.org/evergreen/2014-09-25#i_126632.

In relation to genres, Vanya said:

Maybe, as a solution to that, we can have a hierarchical algorithm for categorizing. In other words, we can allow the administrator to decide whether the categorization comes all the way down to genres, or just takes into account the overall weight of the user's awesome tag.

I like the idea of making this configurable, because there may be systems where data identifying genre is a little more clear cut. Better yet, how about if we allow an Evergreen site to define the categories that are used? Some sites may use the MARC fixed fields for fiction/non-fiction. Other sites may decide that values stored in the 655 MARC field work for them.
Is there something that already exists in Evergreen that we could leverage for this purpose? My first thought was MVF.

I do have one general recommendation speaking with my OPW admin hat on. It really is a general recommendation for any of the OPW candidates who might be following along. I mentioned in IRC today that I'm not a developer, but I've managed a lot of development projects, and one thing I try to watch out for is project creep. As we continue to talk about the project and think of new configuration options to make it more flexible, it can also become a very large project that isn't as easy to manage. Therefore, as you think through how you plan to implement the project, I recommend breaking it up into distinct milestones. You might want to start with smaller tasks as you ease into the project (e.g. collecting the awesome tags and sending them along to the Awesome Box site), and then move on to the larger components once you become more familiar with the system.

Kathy

Kathy Lussier
Project Coordinator
Massachusetts Library Network Cooperative
(508) 343-0128
kluss...@masslnc.org
Twitter:
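
One way to picture the storage approach raised in this thread (awesome tags tallied per title apart from patron data, circulations counted from the point the feature is enabled, and patron linkage kept only for patrons who have opted in to saving their circ history) is the rough Python sketch below. Everything in it is an assumption made for illustration: `AwesomeStore`, `RecordTally`, and the method names are hypothetical and are not Evergreen or Awesome Box APIs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RecordTally:
    """Per-title counters with no patron identifiers (hypothetical structure)."""
    circs_since_enabled: int = 0
    awesome_count: int = 0


@dataclass
class AwesomeStore:
    """Hypothetical in-memory stand-in for the tables discussed in the thread."""
    # record_id -> RecordTally; this part never references a patron
    tallies: dict = field(default_factory=dict)
    # (patron_id, record_id) pairs, kept only for patrons who opted in
    opted_in_tags: set = field(default_factory=set)

    def _tally(self, record_id: int) -> RecordTally:
        return self.tallies.setdefault(record_id, RecordTally())

    def record_circ(self, record_id: int) -> None:
        # Count every circulation that happens after the feature is enabled.
        self._tally(record_id).circs_since_enabled += 1

    def record_awesome(self, record_id: int,
                       patron_id: Optional[int] = None,
                       keeps_history: bool = False) -> None:
        # The anonymous tally is always updated.
        self._tally(record_id).awesome_count += 1
        # The patron link is stored only when the patron has opted in.
        if keeps_history and patron_id is not None:
            self.opted_in_tags.add((patron_id, record_id))

    def summary(self, record_id: int) -> str:
        t = self.tallies.get(record_id, RecordTally())
        return (f"{t.awesome_count} out of {t.circs_since_enabled} "
                f"people found it awesome")


store = AwesomeStore()
for _ in range(4):
    store.record_circ(101)
store.record_awesome(101)                                   # anonymous
store.record_awesome(101, patron_id=7)                      # not opted in
store.record_awesome(101, patron_id=9, keeps_history=True)  # opted in
print(store.summary(101))
```

In this sketch the display string depends only on the anonymous tallies, so a site that stores no patron links at all would still get the "out of" figure; the opted-in pairs would matter only to a later recommendation step.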
Re: [OPEN-ILS-GENERAL] ***SPAM*** Re: Awesome Box Integration
FWIW, there isn't any reason for patron data to be exposed, or for privacy to be an issue, at the display level here. The privacy discussion is really a distraction from the Awesome Box discussion, in my opinion. Some libraries may anonymize (or wipe) older data while others don't, but that data existing and being used under the hood is a totally different thing from exposing it to users (my point). Now, if you do wipe it you obviously don't want to suddenly have features that depend on it, an important point for those who do wipe it (and I wonder if their libraries are expressly exempt from record retention laws), but that was Kathy's point about configurability. And even if you did use historical circulations integrated with Awesome Box, that doesn't mean they have to be used the same way for all types of users with different anonymization of data.

Of course, I doubt that some who think their data is wiped understand that it probably is not. Evergreen does not natively erase or anonymize old information; it's just inaccessible to casual users, which is not the same as not existing. That's a fairly common mistake for users not familiar with the database layer. Clear as mud?

So, as I said, I suspect that if we don't want to completely derail this with tangents, it's probably best to put the privacy issue aside and look at Awesome Box features not tied to patron specific data.

On Fri, Sep 26, 2014 at 10:44 AM, Ruth Frasur direc...@hagerstownlibrary.org wrote:

I don't have anything of value to add to this other than that while, of course, I love the idea of reader recommendations and Awesome Box integration in any form, I also think there would HAVE to be some type of anonymizing (sp?) of patron data. I don't think this is impossible BUT, as Rogan has said, there is a definite danger of project creep.
My suggestion, fwiw, is to find some first/second step for Awesome Box integration and focus more on building a foundation (that may or may not have truly visible/useful features for end users) on which others (or other projects) could expand.
Re: [OPEN-ILS-GENERAL] ***SPAM*** Re: Awesome Box Integration
We anonymize our data as much as we can without causing problems in the system. We archive old circulations and do not even allow patrons to opt in to store their reading histories. Yes, there are still remnants of the data there, and we obviously can't clear out data related to current transactions or to fines, but it is our intention to maintain as much patron privacy as possible. In fact, we take patron privacy far more seriously than most of our patrons do.

System administrators who have direct access to the data can get all sorts of information if they take the time to find it, but the patron's complete circ history isn't available in their record or through a report. This limits the amount of information that any staff person can find out about anyone else, and it ensures that any Patriot Act or other information requests for circulation history have to come to the managing office, where we can ensure that the request meets the letter of the law and that frontline circ staff at some remote branch aren't being pressured into giving out all of that information to members of government or law enforcement who should not have access to it without following proper procedures. I would hope that all library systems are taking equal care.

As long as the Awesome Box functionality is designed to hide its ties to individual patrons, then I think it's great, but I think it would do a disservice to patrons to simply dismiss privacy issues out of hand.

Terran McCanna
PINES Program Manager
Georgia Public Library Service
1800 Century Place, Suite 150
Atlanta, GA 30345
404-235-7138
tmcca...@georgialibraries.org

- Original Message -
From: Rogan Hamby rogan.ha...@yclibrary.net
To: Evergreen Discussion Group open-ils-general@list.georgialibraries.org
Sent: Friday, September 26, 2014 10:55:27 AM
Subject: Re: [OPEN-ILS-GENERAL] ***SPAM*** Re: Awesome Box Integration

FWIW, there isn't any reason for patron data to be exposed, or for privacy to be an issue, at the display level here.
The privacy discussion is really a distraction from the Awesome Box discussion, in my opinion. Some libraries may anonymize (or wipe) older data while others don't, but that data existing and being used under the hood is a totally different thing from exposing it to users (my point). Now, if you do wipe it you obviously don't want to suddenly have features that depend on it, an important point for those who do wipe it (and I wonder if their libraries are expressly exempt from record retention laws), but that was Kathy's point about configurability. And even if you did use historical circulations integrated with Awesome Box, that doesn't mean they have to be used the same way for all types of users with different anonymization of data.

Of course, I doubt that some who think their data is wiped understand that it probably is not. Evergreen does not natively erase or anonymize old information; it's just inaccessible to casual users, which is not the same as not existing. That's a fairly common mistake for users not familiar with the database layer. Clear as mud?

So, as I said, I suspect that if we don't want to completely derail this with tangents, it's probably best to put the privacy issue aside and look at Awesome Box features not tied to patron specific data.

On Fri, Sep 26, 2014 at 10:44 AM, Ruth Frasur direc...@hagerstownlibrary.org wrote:

I don't have anything of value to add to this other than that while, of course, I love the idea of reader recommendations and Awesome Box integration in any form, I also think there would HAVE to be some type of anonymizing (sp?) of patron data. I don't think this is impossible BUT, as Rogan has said, there is a definite danger of project creep. My suggestion, fwiw, is to find some first/second step for Awesome Box integration and focus more on building a foundation (that may or may not have truly visible/useful features for end users) on which others (or other projects) could expand.