Apologies for coming into this thread so late. I am one of the developers who 
did an initial technical review for our institution, and I thought it might be 
useful to go into more detail about our findings.

>I think we need to clear (and careful) in this discussion about what user
>data we are discussing. With authentication being done by the library /
>university, Lean Library doesn’t actually have personally identifiable
>information (PII).  While IP addresses can be traced, is that any more a
>concern than an user’s ISP tracking all of users traffic already, since
>Lean Library is only effective from off campus IP addresses?

While it's true that authentication occurs on library servers, my concern about 
PII stems from the fact that the plugin can send detailed patron browsing 
activity to Lean Library servers. This behavior appears to be enabled for 
about 100 of the roughly 170 institutions that subscribe to Lean Library. More 
troubling, the plugin appears to send browsing activity even when it reports 
itself as "inactive" because the patron is on campus.

I've copied a portion of my report below. It would be great if anyone who has 
done a similar review could confirm (or, even better, refute) some of the 
issues we came across. (I originally wrote this in markdown, so I apologize for 
the odd formatting.)


## Methodology
I used the Firefox Add-on debugger to observe network traffic and plugin 
activity.  More info on using the add-on debugger can be found here: 
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Debugging

### Desktop setup 
- Browser: Firefox 61.0.2
- Operating System: macOS v10.13.6
- Lean Library Plugin: 2.8.1

## Lean Library API Endpoints
- The plugin communicates with Lean Library through a handful of "endpoints"
- Base URL for requests is `https://app.leanlibrary.com/?r=api`
- All endpoints are served over HTTPS; however, they do not appear to be 
restricted or authenticated with a token or API key, so these calls can come 
from any source. For example, I used the "api/institutes" and 
"api/resourceDomains" endpoints to determine the IP ranges and database 
listings for a handful of institutions that I do not belong to.
- Note: this is not an exhaustive list, but rather some of the more pertinent 
endpoints that we came across.
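
To illustrate what "unauthenticated" means here, below is a minimal Python 
sketch that only constructs a request object and sends nothing. It assumes the 
endpoint name is appended to the `r` query parameter of the base URL and that a 
plain GET is used; I have not confirmed either, and `build_request` is a 
made-up helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host taken from the base URL noted above.
BASE = "https://app.leanlibrary.com/"


def build_request(route: str) -> Request:
    """Build (but do not send) a request for an endpoint such as 'api/institutes'.

    ASSUMPTION: the route goes in the `r` query parameter and the method is GET.
    """
    query = urlencode({"r": route}, safe="/")
    return Request(f"{BASE}?{query}", method="GET")


req = build_request("api/institutes")
print(req.full_url)        # the URL that would be requested
print(req.header_items())  # no token, API key, or cookie attached
```

The point is simply that nothing in the request ties it to a subscribing 
institution or to the plugin itself.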

### Endpoint Notes: /api/logAction
The client sends the user's current URL (hostname, path, and querystring) to 
the API server. This API call occurs whenever the user types a URL into the 
browser "address bar" or clicks on a link on a web page. The request payload 
also includes UserAgent, Institution ID, and "Client ID" (I believe this 
uniquely identifies this plugin instance).
- NOTE: This behavior does not occur for every institution. In my 
testing, "logAction" calls only occur when the institution utilizes "IP range 
validation", which is about 100 of the 171 institutions that appear in the 
"Select Your Library" config screen (see endpoint notes for "/api/institutes" 
further down).
- This API call occurs any time the user clicks on a link or types an address 
into the browser's address bar.  The request payload includes hostname, path, 
and querystring.
- URLs can often include sensitive, personally identifying information. Such 
information could be used by a bad actor to facilitate phishing attacks, among 
other things.
- This behavior occurs even when the plugin claims to be "inactive". For 
example, if I click on the Library Access button from an on-campus IP, the 
plugin opens a popup message which states "You are logged in on a campus 
network, so our extension is inactive. Keep calm and study on!", yet the 
"logAction" calls continue to be sent.
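
To make the concern about URLs concrete, here is a small Python sketch that 
splits a URL into the three pieces named above (hostname, path, querystring). 
The example URL and its parameters are invented for illustration, and this does 
not reproduce the plugin's actual payload format.

```python
from urllib.parse import parse_qs, urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the three pieces the logAction notes mention."""
    parts = urlsplit(url)
    return {
        "hostname": parts.hostname,
        "path": parts.path,
        "querystring": parts.query,
    }


# Invented example: a search URL that happens to carry an email address.
example = "https://example.org/search?q=symptoms+of+depression&user=jdoe%40example.edu"

parts = url_parts(example)
print(parts)
# Decoding the querystring shows how readable the contents are.
print(parse_qs(parts["querystring"]))
```

Anything a site places in the path or querystring (search terms, account 
names, session tokens) travels along with the URL.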

### Endpoint Notes: /api/institutes
This API call returns a JSON object containing a list of "institutes" (i.e. 
libraries that subscribe to Lean Library) as well as their configuration 
information. Presumably the plugin makes this API call on startup to render the 
"Select Your Library" dropdown on the plugin configuration page.
- One of the fields, "enableIpRangeValidation",  presumably indicates whether 
the plugin should attempt to determine if the user is "on campus". Notably, 
whenever this field is set to "true", the plugin utilizes the "logAction" API 
call to send patron browsing activity to Lean Library.
- The response payload appears to contain the configuration information for 
every library that subscribes to Lean Library, including the address of the 
institution's EZproxy server and a map of "on-campus" IP ranges.
- While this information is not necessarily private and/or sensitive, 
institutions might not be thrilled with the idea of it being publicly available.
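
For anyone who wants to repeat the count of institutions with IP range 
validation, the tally can be sketched in Python roughly as follows. Only the 
`enableIpRangeValidation` field name comes from the actual response. The 
`name` and `ipRanges` keys, the CIDR notation, and the three sample records 
are made up for illustration, and the real payload may encode the flag 
differently (for example, as a string).

```python
import ipaddress

# Made-up sample records; only "enableIpRangeValidation" is a real field name.
sample_institutes = [
    {"name": "Example University", "enableIpRangeValidation": True,
     "ipRanges": ["192.0.2.0/24"]},
    {"name": "Sample College", "enableIpRangeValidation": False,
     "ipRanges": []},
    {"name": "Demo Institute", "enableIpRangeValidation": True,
     "ipRanges": ["198.51.100.0/25", "203.0.113.0/24"]},
]


def with_ip_validation(institutes):
    """Return the institutions that have IP range validation switched on."""
    return [i for i in institutes if i.get("enableIpRangeValidation") is True]


def is_on_campus(ip, institute):
    """Check an address against an institution's (hypothetical) CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in institute["ipRanges"])


flagged = with_ip_validation(sample_institutes)
print(len(flagged), "of", len(sample_institutes), "use IP range validation")
print(is_on_campus("198.51.100.10", sample_institutes[2]))
print(is_on_campus("198.51.100.200", sample_institutes[2]))
```

The second function is only a guess at how an "on campus" check could work 
given a list of ranges; I have not inspected how the plugin itself does it.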

### Endpoint Notes: /api/resourceDomains
Returns a list of databases available to a particular institution. Basically a 
list of "Starting Point URLs" in EZproxy-speak.

## Other Server notes
- OS: CentOS
- HTTP Server: Apache/2.2.15  [released: March 2010]
- App server: PHP/7.1.9  [released: August 2017; the 7.1 branch dates from December 2016]


Jim deVos
Developer, ASU Library
