Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Kyle Banerjee
Personally, I'd be tempted to go the IP lockout route myself since the
patterns should be clear in the logs, but be aware that # megabytes gives a
reasonable level of control because you can set it to log rather than lock
out. I think the risk of locking out legitimate users is low. Although people
can download mixed materials, my guess is that your abusing accounts are
not watching loads of video.

There are things you can do with user names that would make it easy enough
to uncover abuse without unduly compromising privacy. For example, you
could flush your logs frequently while extracting the number of downloads
you're interested in from individual users. Abuse accounts will be immediately
obvious. BTW, you can do some funky things with EZP that include
conditional logic, regexp searches, and rewriting that might be helpful.
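
A minimal sketch of the extract-then-flush idea above, in Python. The log line layout ("<ip> <username> <url>"), the ".pdf" test, and the threshold are all assumptions made for illustration; they are not EZproxy's actual log format and would need adapting to whatever LogFormat you use.

```python
from collections import Counter

def count_pdf_downloads(log_lines):
    """Count PDF requests per username.

    Assumes each line looks like "<ip> <username> <url>"; this layout is
    hypothetical and must be adapted to your real log format.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed layout
        _ip, user, url = parts
        # Ignore any query string when checking the file extension.
        if url.lower().split("?", 1)[0].endswith(".pdf"):
            counts[user] += 1
    return counts

def flag_heavy_users(counts, threshold):
    """Return usernames whose PDF count exceeds the threshold, sorted."""
    return sorted(user for user, n in counts.items() if n > threshold)

sample = [
    "10.0.0.1 alice http://example.org/a.pdf",
    "10.0.0.2 bob http://example.org/b.pdf",
    "10.0.0.2 bob http://example.org/c.pdf?download=1",
    "10.0.0.2 bob http://example.org/d.PDF",
    "10.0.0.1 alice http://example.org/style.css",
]
counts = count_pdf_downloads(sample)
heavy = flag_heavy_users(counts, threshold=2)
```

Only the per-user counts need to survive the run; the raw lines that tie a username to a specific document can be discarded once the counts are extracted.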

Any path you take will protect user privacy far more than just about any
other site they visit. Plus, whoever maintains your network will
occasionally need to monitor specific computers to mitigate a wide variety
of problems. Systems used as a platform for abusive behavior, harassment,
or activity that causes harm to others get locked out and/or blacklisted
which will really hose your users. Getting that kind of thing cleared up
takes time because most places aren't nearly as forgiving as libraries.

kyle


On Wed, Nov 19, 2014 at 8:47 PM, Dan Scott deni...@gmail.com wrote:

 On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee kyle.baner...@gmail.com
 wrote:

  There are a number of technical approaches that could be used to identify
  which accounts have been compromised.
 
  But it's easier to just make the problem go away by setting usage limits
 so
  EZP locks the account out after it downloads too much.
 

 But EZProxy still doesn't let you set limits based on the type of download.
 You therefore have two very blunt sledge hammers with UsageLimit:

 - # of downloads (-transfers)
 - # of megabytes downloaded (-MB)

 # of downloads is effectively useless because many of our electronic
 resource platforms (hi Proquest and EBSCOHost) make between 50 and 150
 requests for JavaScript, CSS, and images per page, so you have to set your
 thresholds incredibly high to avoid locking out users who might be actively
 paging through search results. Any savvy abuser will just script their
 requests to avoid all of the JS/CSS/images to derive a list of PDFs, and
 then download just the PDFs, thereby staying well under the usage limits
 that legit users require... and I've seen exactly that happen through our
 proxy.

 # of megabytes downloaded is a pretty blunt tool as well, given that our
 multimedia-enriched databases now often serve up video and audio as well as
 HTML, images, and PDF files. For the pure audio and video streaming sites
 such as Naxos or Curio, you can set higher limits; but as vendors
 increasingly enrich their databases with audio and video, you're going to
 have to increase your general limits as well... and you can pull down a ton
 of PDFs under that cover.

 So no, I don't think it's easy to make the problem go away through the
 suggested approach, unless you're willing to err on the side of locking out
 legitimate users.



Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Jonathan Rochkind

On 11/20/14 1:06 PM, Kyle Banerjee wrote:

BTW, you can do some funky things with EZP that include
conditional logic


Can you say more about funky things you can do with EZProxy involving 
conditional logic? Cause I've often wanted that but haven't found any! 
Are you talking about a particular part/area of EZProxy? (Logging?).


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Kyle Banerjee
I can't remember the details because I haven't worked with EZP for years
and unfortunately, this stuff isn't documented.

Where I used it was in the user.txt file when authenticating. Things you
can do include setting/modifying session, regular EZP, and arbitrary
variables, as well as doing comparisons and file I/O. You can nest
expressions and perform reasonably sophisticated comparisons and
manipulations.

It is way more powerful than what appears in the documentation, but to get
started with it, you need someone who can provide some syntax and ideas. I
know people who know this stuff monitor c4l, so I'm hoping some of them
will weigh in.

kyle

On Thu, Nov 20, 2014 at 10:17 AM, Jonathan Rochkind rochk...@jhu.edu
wrote:

 On 11/20/14 1:06 PM, Kyle Banerjee wrote:

 BTW, you can do some funky things with EZP that include
 conditional logic


 Can you say more about funky things you can do with EZProxy involving
 conditional logic? Cause I've often wanted that but haven't found any! Are
 you talking about a particular part/area of EZProxy? (Logging?).



Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Michael Berkowski
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Here's some of the relevant documentation of the user.txt expressions Kyle
mentioned.  It is possible to set session variables and get them to be
logged - we're doing this with certain Shibboleth attributes for business
analysis.  I have not had luck getting variables other than session vars
set at the user's moment of login to be logged though.

http://oclc.org/support/services/ezproxy/documentation/expressions.en.html

On Thu, 20 Nov 2014, Kyle Banerjee said:

 I can't remember the details because I haven't worked with EZP for years
 and unfortunately, this stuff isn't documented.
 
 Where I used it was in the user.txt file when authenticating. Things you
 can do include setting/modifying session, regular EZP, and arbitrary
 variables, as well as doing comparisons and file I/O. You can nest
 expressions and perform reasonably sophisticated comparisons and
 manipulations.
 
  On 11/20/14 1:06 PM, Kyle Banerjee wrote:
 
  BTW, you can do some funky things with EZP that include
  conditional logic
 
 
  Can you say more about funky things you can do with EZProxy involving
  conditional logic? Cause I've often wanted that but haven't found any! Are
  you talking about a particular part/area of EZProxy? (Logging?).
 

- -- 

Michael Berkowski
University of Minnesota Libraries
m...@umn.edu
612.626.6137
PGP Public Key: http://z.umn.edu/mjbpubkey


-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlRuOowACgkQ01KJk46VC2YL9QCgpyv7ByxUIgnOFcqUT4iFEPLV
0MgAmQHPrDMyVu0x2dgtqE84e9IS1rdV
=lkc4
-END PGP SIGNATURE-


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Joe Hourcle
On Nov 19, 2014, at 11:47 PM, Dan Scott wrote:

 On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee kyle.baner...@gmail.com
 wrote:
 
 There are a number of technical approaches that could be used to identify
 which accounts have been compromised.
 
 But it's easier to just make the problem go away by setting usage limits so
 EZP locks the account out after it downloads too much.
 
 
 But EZProxy still doesn't let you set limits based on the type of download.
 You therefore have two very blunt sledge hammers with UsageLimit:
 
 - # of downloads (-transfers)
 - # of megabytes downloaded (-MB)


[trimmed]

I'm not familiar with EZProxy, but if it's running on an OS that you have 
control of (and not some vendor locked appliance), you likely have other tools 
that you can use for rate limiting.

For instance, I have a CGI on a webserver that's horribly resource intensive 
and takes quite a while to run.  Most people wonder what's taking so long, and 
reload multiple times, thinking the process is stuck ... or they know what's 
going on, and will open up multiple instances in different tabs to reduce their 
wait.

So I have the following IP tables rule:

-A INPUT -p tcp -m tcp --dport 80 --tcp-flags FIN,SYN,RST,ACK SYN -m connlimit --connlimit-above 5 --connlimit-mask 32 -j REJECT --reject-with tcp-reset

I can't remember if it starts blocking at the 5th connection, or once they're 
above 5, but it keeps us from having one IP address with 20+ copies running at once.
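
As far as the connlimit documentation goes, "--connlimit-above 5" should match only when a source would have more than 5 connections counting the new one, so five are allowed and the sixth concurrent connection is the first rejected. The toy Python model below only illustrates that counting rule; the class and method names are made up for this sketch and it is not how netfilter is implemented.

```python
from collections import defaultdict

class ConnLimiter:
    """Toy per-IP concurrent-connection limiter (hypothetical helper)."""

    def __init__(self, above):
        self.above = above              # reject when the count would exceed this
        self.active = defaultdict(int)  # ip -> currently open connections

    def open(self, ip):
        """Return True if the new connection is accepted, False if rejected."""
        if self.active[ip] + 1 > self.above:
            return False
        self.active[ip] += 1
        return True

    def close(self, ip):
        """Record that one connection from this IP has finished."""
        if self.active[ip] > 0:
            self.active[ip] -= 1

limiter = ConnLimiter(above=5)
results = [limiter.open("192.0.2.10") for _ in range(7)]
```

Other addresses are counted separately, which corresponds to the /32 mask in the rule.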

...

And back from my days of managing directory servers -- brute forcing was a 
horrible problem with single sign-on.  We didn't have a good way to temporarily 
lock accounts for repeatedly failing passwords at the directory server (which 
would also cause a denial of service, as you could lock someone else) ... so it 
had to be up to each application to implement ... which of course, they didn't.

... so you'd have something like a webpage that required authentication that 
someone could brute force ... and then they'd also get access to a shell 
account and whatever else that person had authorization for.

-Joe


(and on that 'wow, I feel old' note ... it's been 10+ years since I've had to 
manage an LDAP server ... it's possible that they've gotten better about that 
issue since then)


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Steven Marsden
Logging user IDs has a benefit if it's used properly (with access tightly
controlled and limited to a select group).

If campus IDs are being used by bots to harvest content, it means that you
have users whose credentials are compromised. Whoever obtained this
information also has access to e-mails, student records, and personal
information. It's a benefit to everyone if this information gets recorded
and reported to campus IT (so the user can have their password reset, etc.).

The worst part is, this happens much more than you would expect. I
developed an application (https://github.com/ryersonlibrary/EZ-Analyzer) to
help analyze logs for suspect behavior. It still requires your judgement,
but it helps identify users with very high usage, or shows if they are
logging in from different parts of the world.
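
To illustrate the "different parts of the world" check in general terms, here is a small Python sketch. It is not taken from EZ-Analyzer; the (username, ip) input shape and the IP-to-country dictionary are stand-ins for real log parsing and a real GeoIP lookup.

```python
from collections import defaultdict

def users_in_multiple_countries(logins, country_of):
    """Map each user seen in more than one country to a sorted list of
    those countries.

    `logins` is an iterable of (username, ip) pairs and `country_of` is a
    dict from IP to country code; both are assumptions for this sketch.
    """
    seen = defaultdict(set)
    for user, ip in logins:
        country = country_of.get(ip)
        if country is not None:  # ignore IPs we cannot place
            seen[user].add(country)
    return {user: sorted(c) for user, c in seen.items() if len(c) > 1}

logins = [
    ("alice", "192.0.2.1"),
    ("alice", "198.51.100.7"),
    ("bob", "192.0.2.2"),
    ("bob", "192.0.2.3"),
]
country_of = {
    "192.0.2.1": "CA",
    "198.51.100.7": "DE",
    "192.0.2.2": "CA",
    "192.0.2.3": "CA",
}
suspects = users_in_multiple_countries(logins, country_of)
```

A hit here is only a prompt for a human to look closer, since travel and VPNs produce the same pattern.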

-Steven

On Thu, Nov 20, 2014 at 3:14 PM, Joe Hourcle onei...@grace.nascom.nasa.gov
wrote:

 On Nov 19, 2014, at 11:47 PM, Dan Scott wrote:

  On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee kyle.baner...@gmail.com
  wrote:
 
  There are a number of technical approaches that could be used to
 identify
  which accounts have been compromised.
 
  But it's easier to just make the problem go away by setting usage
 limits so
  EZP locks the account out after it downloads too much.
 
 
  But EZProxy still doesn't let you set limits based on the type of
 download.
  You therefore have two very blunt sledge hammers with UsageLimit:
 
  - # of downloads (-transfers)
  - # of megabytes downloaded (-MB)


 [trimmed]

 I'm not familiar with EZProxy, but if it's running on an OS that you have
 control of (and not some vendor locked appliance), you likely have other
 tools that you can use for rate limiting.

 For instance, I have a CGI on a webserver that's horribly resource
 intensive and takes quite a while to run.  Most people wonder what's taking
 so long, and reload multiple times, thinking the process is stuck ... or
 they know what's going on, and will open up multiple instances in different
 tabs to reduce their wait.

 So I have the following IP tables rule:

 -A INPUT -p tcp -m tcp --dport 80 --tcp-flags FIN,SYN,RST,ACK SYN -m connlimit --connlimit-above 5 --connlimit-mask 32 -j REJECT --reject-with tcp-reset

 I can't remember if it starts blocking at the 5th connection, or once they're
 above 5, but it keeps us from having one IP address with 20+ copies running
 at once.

 ...

 And back from my days of managing directory servers -- brute forcing was a
 horrible problem with single sign-on.  We didn't have a good way to
 temporarily lock accounts for repeatedly failing passwords at the directory
 server (which would also cause a denial of service, as you could lock
 someone else) ... so it had to be up to each application to implement ...
 which of course, they didn't.

 ... so you'd have something like a webpage that required authentication
 that someone could brute force ... and then they'd also get access to a
 shell account and whatever else that person had authorization for.

 -Joe


 (and on that 'wow, I feel old' note ... it's been 10+ years since I've had
 to manage an LDAP server ... it's possible that they've gotten better about
 that issue since then)




-- 
Steven Marsden - Library Systems Analyst
Tel: 416-979-5000 x 4635
Ryerson University Library
350 Victoria Street.  Toronto, ON.  M5B 2K3


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Joshua Welker
Brute force attacks aren't the problem. There's a simple param in EZproxy
that blocks an IP and/or user account after a certain number of failed
logins. I suspect that the problem is that attackers already have valid
login credentials from one of the thousands of security breaches in the
last few years.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Joe Hourcle
Sent: Thursday, November 20, 2014 2:15 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Balancing security and privacy with EZproxy

On Nov 19, 2014, at 11:47 PM, Dan Scott wrote:

 On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee
 kyle.baner...@gmail.com
 wrote:

 There are a number of technical approaches that could be used to
 identify which accounts have been compromised.

 But it's easier to just make the problem go away by setting usage
 limits so EZP locks the account out after it downloads too much.


 But EZProxy still doesn't let you set limits based on the type of
download.
 You therefore have two very blunt sledge hammers with UsageLimit:

 - # of downloads (-transfers)
 - # of megabytes downloaded (-MB)


[trimmed]

I'm not familiar with EZProxy, but if it's running on an OS that you have
control of (and not some vendor locked appliance), you likely have other
tools that you can use for rate limiting.

For instance, I have a CGI on a webserver that's horribly resource
intensive and takes quite a while to run.  Most people wonder what's
taking so long, and reload multiple times, thinking the process is stuck
... or they know what's going on, and will open up multiple instances in
different tabs to reduce their wait.

So I have the following IP tables rule:

-A INPUT -p tcp -m tcp --dport 80 --tcp-flags FIN,SYN,RST,ACK SYN -m connlimit --connlimit-above 5 --connlimit-mask 32 -j REJECT --reject-with tcp-reset

I can't remember if it starts blocking at the 5th connection, or once they're
above 5, but it keeps us from having one IP address with 20+ copies
running at once.

...

And back from my days of managing directory servers -- brute forcing was a
horrible problem with single sign-on.  We didn't have a good way to
temporarily lock accounts for repeatedly failing passwords at the
directory server (which would also cause a denial of service, as you could
lock someone else) ... so it had to be up to each application to implement
... which of course, they didn't.

... so you'd have something like a webpage that required authentication
that someone could brute force ... and then they'd also get access to a
shell account and whatever else that person had authorization for.

-Joe


(and on that 'wow, I feel old' note ... it's been 10+ years since I've had
to manage an LDAP server ... it's possible that they've gotten better
about that issue since then)


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Joshua Welker
Blocking the IP is the obvious solution but not ideal at all. First off,
it's trivially easy to bypass IP blacklists using proxies. I don't want to
play a game of never-ending IP whack-a-mole. Second, it notifies the
attacker that we are onto them, which makes it less likely for us to catch
them. We want to figure out which accounts are compromised so that we can
fix the problem at the source rather than fixing symptoms. If EZproxy is
being abused, then it's just as likely that other, more valuable systems at
the university are being abused, since logins are shared between many systems.

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kyle
Banerjee
Sent: Thursday, November 20, 2014 12:07 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Balancing security and privacy with EZproxy

Personally, I'd be tempted to go the IP lockout route myself since the
patterns should be clear in the logs, but be aware that # megabytes gives a
reasonable level of control because you can set it to log rather than lock out.
I think the risk of locking out legitimate users is low. Although people can
download mixed materials, my guess is that your abusing accounts are not
watching loads of video.

There are things you can do with user names that would make it easy enough
to uncover abuse without unduly compromising privacy. For example, you could
flush your logs frequently while extracting the number of downloads you're
interested in from individual users. Abuse accounts will be immediately
obvious. BTW, you can do some funky things with EZP that include conditional
logic, regexp searches, and rewriting that might be helpful.

Any path you take will protect user privacy far more than just about any
other site they visit. Plus, whoever maintains your network will
occasionally need to monitor specific computers to mitigate a wide variety
of problems. Systems used as a platform for abusive behavior, harassment, or
activity that causes harm to others get locked out and/or blacklisted which
will really hose your users. Getting that kind of thing cleared up takes
time because most places aren't nearly as forgiving as libraries.

kyle


On Wed, Nov 19, 2014 at 8:47 PM, Dan Scott deni...@gmail.com wrote:

 On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee
 kyle.baner...@gmail.com
 wrote:

  There are a number of technical approaches that could be used to
  identify which accounts have been compromised.
 
  But it's easier to just make the problem go away by setting usage
  limits
 so
  EZP locks the account out after it downloads too much.
 

 But EZProxy still doesn't let you set limits based on the type of
 download.
 You therefore have two very blunt sledge hammers with UsageLimit:

 - # of downloads (-transfers)
 - # of megabytes downloaded (-MB)

 # of downloads is effectively useless because many of our electronic
 resource platforms (hi Proquest and EBSCOHost) make between 50 and 150
 requests for JavaScript, CSS, and images per page, so you have to set
 your thresholds incredibly high to avoid locking out users who might
 be actively paging through search results. Any savvy abuser will just
 script their requests to avoid all of the JS/CSS/images to derive a
 list of PDFs, and then download just the PDFs, thereby staying well
 under the usage limits that legit users require... and I've seen
 exactly that happen through our proxy.

 # of megabytes downloaded is a pretty blunt tool as well, given that
 our multimedia-enriched databases now often serve up video and audio
 as well as HTML, images, and PDF files. For the pure audio and video
 streaming sites such as Naxos or Curio, you can set higher limits; but
 as vendors increasingly enrich their databases with audio and video,
 you're going to have to increase your general limits as well... and
 you can pull down a ton of PDFs under that cover.

 So no, I don't think it's easy to make the problem go away through the
 suggested approach, unless you're willing to err on the side of
 locking out legitimate users.



Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Joshua Welker
That looks promising, but I can't make heads or tails of how to implement
any of those rules. Is there a way I can set up the logger to only record
usernames if the IP address matches a list of malicious IPs?
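
Whether EZproxy can make that decision at log time is not settled in this thread. One fallback is to post-process the log and blank the username on every line whose IP is not on the watchlist. The Python sketch below assumes entries have already been parsed into (ip, username, url) tuples and that the watchlist is a list of CIDR strings; both shapes are hypothetical.

```python
import ipaddress

def redact_usernames(entries, watchlist):
    """Blank out the username on every entry whose IP is not watched.

    `entries` is a list of (ip, username, url) tuples and `watchlist` is a
    list of CIDR strings; both shapes are assumptions for this sketch.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in watchlist]
    kept = []
    for ip, user, url in entries:
        addr = ipaddress.ip_address(ip)
        watched = any(addr in net for net in networks)
        kept.append((ip, user if watched else "-", url))
    return kept

entries = [
    ("203.0.113.5", "alice", "http://example.org/a.pdf"),
    ("192.0.2.8", "bob", "http://example.org/b.pdf"),
]
redacted = redact_usernames(entries, watchlist=["203.0.113.0/24"])
```

Note that with this approach the username is still written to disk briefly before redaction, so it narrows the privacy exposure rather than eliminating it.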

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Michael Berkowski
Sent: Thursday, November 20, 2014 1:02 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Balancing security and privacy with EZproxy

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Here's some of the relevant documentation of the user.txt expressions Kyle
mentioned.  It is possible to set session variables and get them to be
logged - we're doing this with certain Shibboleth attributes for business
analysis.  I have not had luck getting variables other than session vars
set at the user's moment of login to be logged though.

http://oclc.org/support/services/ezproxy/documentation/expressions.en.html

On Thu, 20 Nov 2014, Kyle Banerjee said:

 I can't remember the details because I haven't worked with EZP for
 years and unfortunately, this stuff isn't documented.

 Where I used it was in the user.txt file when authenticating. Things
 you can do include setting/modifying session, regular EZP, and
 arbitrary variables, as well as doing comparisons and file I/O. You
 can nest expressions and perform reasonably sophisticated comparisons
 and manipulations.

  On 11/20/14 1:06 PM, Kyle Banerjee wrote:
 
  BTW, you can do some funky things with EZP that include conditional
  logic
 
 
  Can you say more about funky things you can do with EZProxy
  involving conditional logic? Cause I've often wanted that but
  haven't found any! Are you talking about a particular part/area of
EZProxy? (Logging?).
 

- --

Michael Berkowski
University of Minnesota Libraries
m...@umn.edu
612.626.6137
PGP Public Key: http://z.umn.edu/mjbpubkey


-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlRuOowACgkQ01KJk46VC2YL9QCgpyv7ByxUIgnOFcqUT4iFEPLV
0MgAmQHPrDMyVu0x2dgtqE84e9IS1rdV
=lkc4
-END PGP SIGNATURE-


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Kyle Banerjee
Assuming that the credentials are in fact compromised. They could also be
given away or sold, including by the person they belong to. And while it is
trivially easy to employ proxies, only a handful of people bother.

Finding free EZP credentials is crazy easy on Google. Try it -- you'll have
more options than you know what to do with in less than a minute.

In any case, the simplest way to achieve what you're trying to do without
going the IP route is to log users and retain data only long enough to
allow processing by a minimal detection script.
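
The "retain only long enough" half of that could be as simple as the Python sketch below, run after the detection script has finished. The directory layout and the age cutoff are assumptions; it deletes every regular file in the given directory that is older than the cutoff, so point it only at a directory that holds nothing but these logs.

```python
import os
import time

def purge_old_logs(log_dir, max_age_days, now=None):
    """Delete regular files in log_dir older than max_age_days.

    Returns the sorted names of the files removed. `now` can be passed in
    (seconds since the epoch) to make the cutoff reproducible.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        # Compare the file's last-modified time against the cutoff.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return sorted(removed)
```

For example, purge_old_logs("/path/to/userlogs", max_age_days=4) would keep roughly the last four days, where the path is a placeholder. Backups need the same retention policy or the deleted data simply lives on there.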

kyle

On Thu, Nov 20, 2014 at 2:17 PM, Joshua Welker wel...@ucmo.edu wrote:

 Blocking the IP is the obvious solution but not ideal at all. First off,
 it's trivially easy to bypass IP blacklists using proxies. I don't want to
 play a game of never-ending IP whack-a-mole. Second, it notifies the
 attacker that we are onto them, which makes it less likely for us to catch
 them. We want to figure out which accounts are compromised so that we can
 fix the problem at the source rather than fixing symptoms. If EZproxy is
 being abused, then it's just as likely that other, more valuable systems at
 the university are being abused, since logins are shared between many systems.

 Josh Welker


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Kyle
 Banerjee
 Sent: Thursday, November 20, 2014 12:07 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Balancing security and privacy with EZproxy

 Personally, I'd be tempted to go the IP lockout route myself since the
 patterns should be clear in the logs, but be aware that # megabytes gives a
 reasonable level of control because you can set it to log rather than lock
 out.
 I think the risk of locking out legitimate users is low. Although people can
 download mixed materials, my guess is that your abusing accounts are not
 watching loads of video.

 There are things you can do with user names that would make it easy enough
 to uncover abuse without unduly compromising privacy. For example, you
 could
 flush your logs frequently while extracting the number of downloads you're
 interested in from individual users. Abuse accounts will be immediately
 obvious. BTW, you can do some funky things with EZP that include
 conditional
 logic, regexp searches, and rewriting that might be helpful.

 Any path you take will protect user privacy far more than just about any
 other site they visit. Plus, whoever maintains your network will
 occasionally need to monitor specific computers to mitigate a wide variety
 of problems. Systems used as a platform for abusive behavior, harassment,
 or
 activity that causes harm to others get locked out and/or blacklisted which
 will really hose your users. Getting that kind of thing cleared up takes
 time because most places aren't nearly as forgiving as libraries.

 kyle


 On Wed, Nov 19, 2014 at 8:47 PM, Dan Scott deni...@gmail.com wrote:

  On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee
  kyle.baner...@gmail.com
  wrote:
 
   There are a number of technical approaches that could be used to
   identify which accounts have been compromised.
  
   But it's easier to just make the problem go away by setting usage
   limits
  so
   EZP locks the account out after it downloads too much.
  
 
  But EZProxy still doesn't let you set limits based on the type of
  download.
  You therefore have two very blunt sledge hammers with UsageLimit:
 
  - # of downloads (-transfers)
  - # of megabytes downloaded (-MB)
 
  # of downloads is effectively useless because many of our electronic
  resource platforms (hi Proquest and EBSCOHost) make between 50 and 150
  requests for JavaScript, CSS, and images per page, so you have to set
  your thresholds incredibly high to avoid locking out users who might
  be actively paging through search results. Any savvy abuser will just
  script their requests to avoid all of the JS/CSS/images to derive a
  list of PDFs, and then download just the PDFs, thereby staying well
  under the usage limits that legit users require... and I've seen
  exactly that happen through our proxy.
 
  # of megabytes downloaded is a pretty blunt tool as well, given that
  our multimedia-enriched databases now often serve up video and audio
  as well as HTML, images, and PDF files. For the pure audio and video
  streaming sites such as Naxos or Curio, you can set higher limits; but
  as vendors increasingly enrich their databases with audio and video,
  you're going to have to increase your general limits as well... and
  you can pull down a ton of PDFs under that cover.
 
  So no, I don't think it's easy to make the problem go away through the
  suggested approach, unless you're willing to err on the side of
  locking out legitimate users.
 



[CODE4LIB] Balancing security and privacy with EZproxy

2014-11-19 Thread Joshua Welker
   Balancing security and privacy with EZproxy

In recent months, we have been contacted several times by one of our
vendors about our databases being accessed by rogue Chinese IP addresses.
With the massive proliferation of online security breaches and password
dumps, attackers are gaining access to student accounts and using them to
access subscription resources through EZproxy. The vendor catches this
happening and alerts us sometimes, but probably more often than not we have
no idea. When we do find out, we force the students to change their
passwords.

We currently log IP addresses in EZproxy and can see when one of these
rogue IP addresses is accessing a resource. However, we do not log user IDs
in EZproxy, so we can’t tell which student account was compromised. Logging
the user IDs would be a quick fix, but it has major privacy implications
for our patrons, as we would have a record of every document they access.
Have any other institutions encountered this problem? Are any best
practices established for how to deal with these security breaches?

I apologize for cross-posting.

Josh Welker
Information Technology Librarian
James C. Kirkpatrick Library
University of Central Missouri
Warrensburg, MO 64093
JCKL 2260
660.543.8022


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-19 Thread Michael Berkowski
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Do you make use of the audit logs? They will log a username along with a
session id enabling you to identify evil sessions by user, but
importantly, the audit logs are purged away at a specified interval.  I
think it defaults to 7 days, but you could decide what purge interval
would be sufficient for your forensics needs. When vendors are notifying
you of malicious activity, it is likely to be within a day or two of the
activity so you might consider keeping your audit log for just 3-4 days.

After the audit log has been rotated away, you no longer have a link 
between user ids and the EZproxy session (which I assume you are logging).

Of course, they would still hang around in backups.

http://oclc.org/support/services/ezproxy/documentation/example/securing.en.html
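
A rough Python sketch of the lookup described above: join the access log to the audit log on session id to find which account was behind a reported IP. The tuple shapes are assumptions; the real audit and access log lines would need to be parsed into this form first.

```python
def users_for_ip(access_entries, audit_entries, rogue_ip):
    """Return the usernames whose sessions were used from rogue_ip.

    `access_entries` holds (ip, session_id) pairs from the access log and
    `audit_entries` holds (session_id, username) pairs from the audit log.
    Both shapes are assumptions; real log lines need parsing first.
    """
    session_to_user = dict(audit_entries)
    sessions = {sid for ip, sid in access_entries if ip == rogue_ip}
    # Sessions whose audit record has already been purged are skipped.
    return sorted({session_to_user[s] for s in sessions if s in session_to_user})

access = [
    ("203.0.113.5", "sess-1"),
    ("203.0.113.5", "sess-1"),
    ("192.0.2.8", "sess-2"),
    ("203.0.113.5", "sess-3"),
]
audit = [("sess-1", "alice"), ("sess-2", "bob")]
found = users_for_ip(access, audit, "203.0.113.5")
```

Here "sess-3" has no audit record, which is what a session looks like once the audit log covering it has been purged.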

On Wed, 19 Nov 2014, Joshua Welker said:

Balancing security and privacy with EZproxy
 
 In recent months, we have been contacted several times by one of our
 vendors about our databases being accessed by rogue Chinese IP addresses.
 With the massive proliferation of online security breaches and password
 dumps, attackers are gaining access to student accounts and using them to
 access subscription resources through EZproxy. The vendor catches this
 happening and alerts us sometimes, but probably more often than not we have
 no idea. When we do find out, we force the students to change their
 passwords.
 
 We currently log IP addresses in EZproxy and can see when one of these
 rogue IP addresses is accessing a resource. However, we do not log user IDs
 in EZproxy, so we can’t tell which student account was compromised. Logging
 the user IDs would be a quick fix, but it has major privacy implications
 for our patrons, as we would have a record of every document they access.
 Have any other institutions encountered this problem? Are any best
 practices established for how to deal with these security breaches?
 
 I apologize for cross-posting.
 
 Josh Welker
 Information Technology Librarian
 James C. Kirkpatrick Library
 University of Central Missouri
 Warrensburg, MO 64093
 JCKL 2260
 660.543.8022

- -- 

Michael Berkowski
University of Minnesota Libraries
m...@umn.edu
612.626.6137
PGP Public Key: http://z.umn.edu/mjbpubkey


-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlRtBYYACgkQ01KJk46VC2YDAwCeM1gZH25iP+44RLqn0onooU7A
wsIAnisnbl3hZcgIknMsPyseCnHo71dQ
=gj8M
-END PGP SIGNATURE-


Re: [CODE4LIB] Balancing security and privacy with EZProxy

2014-11-19 Thread Schwartz, Raymond
We do log user IDs with EZproxy. However, we collect logs as monthly files. 
Once we process them for patron statistical categories, we delete the 
original log file, so any log files we have that are older than one month are 
anonymized.

/Ray

Ray Schwartz
Systems Specialist Librarian        schwart...@wpunj.edu
David and Lorraine Cheng Library    Tel: +1 973 720-3192
William Paterson University         Fax: +1 973 720-2585
300 Pompton Road                    Mobile: +1 201 424-4491
Wayne, NJ 07470-2103 USA            http://nova.wpunj.edu/schwartzr2/



-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joshua 
Welker
Sent: Wednesday, November 19, 2014 3:53 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Balancing security and privacy with EZproxy

   Balancing security and privacy with EZproxy

In recent months, we have been contacted several times by one of our vendors 
about our databases being accessed by rogue Chinese IP addresses.
With the massive proliferation of online security breaches and password dumps, 
attackers are gaining access to student accounts and using them to access 
subscription resources through EZproxy. The vendor catches this happening and 
alerts us sometimes, but probably more often than not we have no idea. When we 
do find out, we force the students to change their passwords.

We currently log IP addresses in EZproxy and can see when one of these rogue IP 
addresses is accessing a resource. However, we do not log user IDs in EZproxy, 
so we can’t tell which student account was compromised. Logging the user IDs 
would be a quick fix, but it has major privacy implications for our patrons, as 
we would have a record of every document they access.
Have any other institutions encountered this problem? Are any best practices 
established for how to deal with these security breaches?

I apologize for cross-posting.

Josh Welker
Information Technology Librarian
James C. Kirkpatrick Library
University of Central Missouri
Warrensburg, MO 64093
JCKL 2260
660.543.8022


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-19 Thread Kyle Banerjee
There are a number of technical approaches that could be used to identify
which accounts have been compromised.

But it's easier to just make the problem go away by setting usage limits so
EZP locks the account out after it downloads too much. Alternatively, just
block the Chinese IPs unless you have students/faculty accessing resources
from there.

kyle




Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-19 Thread Kaile Zhu
I thought EZproxy queried a directory service while authenticating the user
but did not store users' information on its own. However, hackers trying to
break into a database is very common. The most common tactic is SQL
injection. The secure practices are well known. I list as many of them as I
can remember below; I hope you are not bored.
1. Set database user privileges to the minimum and, if possible, make them
task-specific.
2. When accepting user input, enforce the data constraints at both the
application and database levels.
3. Use an image CAPTCHA to prevent automated form filling.
4. Configure the web server to deny any IP that has made many failed requests
within a very short period of time.
5. Configure the web server to block cross-site scripting.

You really can do nothing about those rogues, because they are rogues, and the
web is by nature open to everybody. But once you do all the things in the
list above, you should be OK, considering it's just a library's website. The
real hackers would have a much bigger target to attack.
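
Item 2 in the list above could look something like the following sketch, using Python's built-in sqlite3 module. The patrons table, the 14-digit barcode format, and the add_patron helper are all hypothetical. The parameterized query is not one of the listed items; it is added here as a commonly used defence against SQL injection.

```python
import sqlite3


def add_patron(conn, barcode):
    """Insert a patron barcode after validating it in the application."""
    # Application-level constraint: exactly 14 ASCII digits (assumed format).
    if not (barcode.isascii() and barcode.isdigit() and len(barcode) == 14):
        raise ValueError("barcode must be exactly 14 digits")
    # Parameterized query: the value is never spliced into the SQL text.
    conn.execute("INSERT INTO patrons (barcode) VALUES (?)", (barcode,))


conn = sqlite3.connect(":memory:")
# Database-level constraint: the same rule enforced again by the schema.
conn.execute(
    "CREATE TABLE patrons ("
    " barcode TEXT PRIMARY KEY"
    " CHECK (length(barcode) = 14 AND barcode NOT GLOB '*[^0-9]*'))"
)
add_patron(conn, "21234000123456")
```

With both checks in place, a malformed value is rejected by the application, and anything that bypasses the application is still rejected by the database.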

Kelly Zhu




Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-19 Thread Dan Scott
On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee kyle.baner...@gmail.com
wrote:

 There are a number of technical approaches that could be used to identify
 which accounts have been compromised.

 But it's easier to just make the problem go away by setting usage limits so
 EZP locks the account out after it downloads too much.


But EZproxy still doesn't let you set limits based on the type of download.
You therefore have two very blunt sledgehammers with UsageLimit:

- # of downloads (-transfers)
- # of megabytes downloaded (-MB)

# of downloads is effectively useless because many of our electronic
resource platforms (hi, ProQuest and EBSCOhost) make between 50 and 150
requests for JavaScript, CSS, and images per page, so you have to set your
thresholds incredibly high to avoid locking out users who might be actively
paging through search results. Any savvy abuser will just script their
requests to avoid all of the JS/CSS/images to derive a list of PDFs, and
then download just the PDFs, thereby staying well under the usage limits
that legit users require... and I've seen exactly that happen through our
proxy.
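
To illustrate the pattern described above, here is a rough Python sketch that flags sessions requesting many PDFs but almost none of the JS/CSS/image files a browser would normally load. It is not an EZproxy feature. The (session_key, url_path) input format, the suffix list, and both thresholds are assumptions chosen for illustration only.

```python
from collections import defaultdict

# Assumed file suffixes for the page assets a real browser would fetch.
ASSET_SUFFIXES = (".js", ".css", ".png", ".gif", ".jpg")


def flag_pdf_only_sessions(requests, min_pdfs=20, max_asset_ratio=1.0):
    """Return session keys with many PDF requests but few page assets.

    `requests` is an iterable of (session_key, url_path) pairs.
    Both thresholds are illustrative guesses, not recommended values.
    """
    pdfs = defaultdict(int)
    assets = defaultdict(int)
    for session, path in requests:
        path = path.lower().split("?", 1)[0]  # ignore any query string
        if path.endswith(".pdf"):
            pdfs[session] += 1
        elif path.endswith(ASSET_SUFFIXES):
            assets[session] += 1
    return sorted(
        session
        for session, count in pdfs.items()
        if count >= min_pdfs and assets.get(session, 0) <= count * max_asset_ratio
    )


# Made-up traffic: s1 scripts PDF downloads, s2 browses normally.
scripted = [("s1", "/doc%d.pdf" % i) for i in range(25)]
browsing = [("s2", "/article.pdf")] + [
    ("s2", "/static/app%d.js" % i) for i in range(60)
]
flagged = flag_pdf_only_sessions(scripted + browsing)
```

The session key could be whatever identifier the logs already carry, such as a source IP, so a check like this would not by itself require logging user IDs.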

# of megabytes downloaded is a pretty blunt tool as well, given that our
multimedia-enriched databases now often serve up video and audio as well as
HTML, images, and PDF files. For the pure audio and video streaming sites
such as Naxos or Curio, you can set higher limits; but as vendors
increasingly enrich their databases with audio and video, you're going to
have to increase your general limits as well... and you can pull down a ton
of PDFs under that cover.

So no, I don't think it's easy to make the problem go away through the
suggested approach, unless you're willing to err on the side of locking out
legitimate users.