What I'm referring to is an SLA to set expectations for event response from downstream
networks. Bear in mind I'm exploring this from the perspective of a backbone provider
to an educational network. What are other providers doing?
Our current base response metric for customers is 30 minutes for critical events, on a
24x7 basis. We've consistently met that for four years with a very small staff (remind
me to address burnout issues another time).
What I'm after is management of customer expectations of that response, as well as
management of our own expectations of their responses. So, as a hypothetical SLA
clause, if the customer has established a CSIRT, supports RFC abuse addresses, has an
event response policy and capability as audited by us, has the staff and tools
dedicated to providing their own event response capability, exchanged PGP keys with
us, and maintains a high response level (reasonably similar to our own), we agree to
trust them more from a backbone provider's point of view. While our AUP empowers us to
protect the network, it's occasionally awkward to do that from a border router
(especially involving NAT'ed networks, firewalls, proxies and so on). We could provide
more time without taking defensive measures ourselves, with the expectation the
downstream network has the capability to investigate and resolve problems.
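To make the hypothetical clause concrete, here is a rough sketch of how the criteria might be checked. The field names and both time values are placeholders I made up for illustration; nothing here is from an actual SLA.

```python
from dataclasses import dataclass, fields
from datetime import timedelta


@dataclass
class DownstreamSite:
    """Criteria from the hypothetical clause; field names are illustrative."""
    has_csirt: bool
    supports_rfc_abuse_addresses: bool
    audited_response_policy: bool
    dedicated_staff_and_tools: bool
    pgp_keys_exchanged: bool
    comparable_response_level: bool


# Placeholder values only; real numbers would be negotiated per SLA.
STANDARD_GRACE = timedelta(minutes=30)
EXTENDED_GRACE = timedelta(hours=4)


def grace_period(site: DownstreamSite) -> timedelta:
    """Time allowed before the backbone takes defensive measures itself."""
    # The extended period applies only when every criterion is met.
    meets_all = all(getattr(site, f.name) for f in fields(site))
    return EXTENDED_GRACE if meets_all else STANDARD_GRACE
```

A site that misses any single criterion falls back to the standard period in this sketch; a real clause could just as well grade the criteria individually.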
At the other end of the scale, there are downstream networks without trained technical
staff, let alone even a part time event response capability. They lock up the doors at
4:30 PM and go home, and stay there all weekend (and all through summer vacations,
etc.). Often, at these networks, decision making is done by completely non-technical
administrators who have no ability to implement requested defensive measures, let
alone access or read logs. Some of these do not have their own admin passwords, having
completely "contracted out" system administration to community volunteers. Some of
these networks practice security by obscurity, unintentionally hiding accountability
even from themselves. Some of these networks think their Wingates are firewalls.
We're working on these from an educational point of view, but the policies to protect
the network have been in place for four years. In the last quarter, our security team
took more than 1700 events. This is 101% of the total for the entire preceding year. We have low
expectations of additional staff or resources. Our parent organization currently uses
SLAs to define base levels of expectations, with higher response and improved service
after hours for those who choose to participate. These other SLAs require the
downstream site to invest in hardware, software and training. If a site does not have
the capability to react to a request to investigate and protect the network, then that
could be taken into account in requiring a *shorter* time for response from that
network.
Our current practice requires us to make a best effort to contact the customer before
defending the network. This has ranged from calling the campus police to locate a
faculty member during a holiday break, to sending voicemail, e-mail and faxes to each
of four responsible persons in a building we had good reason to believe would be locked
and empty for the next week before defending the network.
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tue 1/22/2002 9:30 AM
To: Martin, James E.
Cc: [EMAIL PROTECTED]
Subject: RE: SLA Security
Not to bandy words, but it sounds like you're more interested in a policy and
procedures document rather than SLAs. The policy and procedures document would say
here's "what" you have to do. The SLA says here's how we measure what you do.
For example, in the case of a report of an outside attack, your contract with
an ISP might say that the ISP will work with your technical people to promptly
investigate the claim and will develop an appropriate response. The policy and
procedures document could discuss types of attacks (DOS vs. crack, etc.) and the steps
you and the ISP will take in the event of such attacks.
The SLA might say that if you report an outside attack, the ISP will respond
to you regarding the report within 1 hour 98% of the time and within 4 hours 99.5% of
the time. You might even split the measurements depending on the type of attack. If
you are experiencing a DOS attack, the response time might be different than if you
are being cracked.
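As a rough illustration of how a measurement like that could be checked, here is a small sketch. The targets are the example figures above (1 hour 98% of the time, 4 hours 99.5% of the time); the function name and the shape of the input are assumptions, not taken from any real SLA tooling.

```python
from datetime import timedelta

# Example targets from the paragraph above: (time limit, required fraction).
TARGETS = [
    (timedelta(hours=1), 0.98),
    (timedelta(hours=4), 0.995),
]


def sla_report(response_times, targets=TARGETS):
    """Return one (limit, required, achieved, met) tuple per target.

    response_times is a list of timedelta values, one per reported attack,
    measuring how long the ISP took to respond.
    """
    if not response_times:
        raise ValueError("no response times to measure")
    total = len(response_times)
    report = []
    for limit, required in targets:
        # Count the reports answered within this target's time limit.
        within = sum(1 for t in response_times if t <= limit)
        achieved = within / total
        report.append((limit, required, achieved, achieved >= required))
    return report
```

Splitting the measurement by attack type would just mean calling this once per type with a different list of targets.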
Regarding downstream maintenance, patches, etc., a policy and procedures
document would identify who is responsible for which activities, and specific
requirements for keeping versions up to date (e.g., for security software you must be
on the most current version (N), for other operational software you must at least be
on N-2, for some non-critical software you must be on N-4, etc.), the process for
implementing patches (e.g. install in DEV or TEST and run specified acceptance testing
before moving to PROD). An SLA, on the other hand, would specify, for example, how
quickly the vendor must install the latest version of something once it is released.
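The version requirement above could be expressed as a simple lookup. This is only a sketch: the category names and the function are made up for illustration, and the limits are the N, N-2 and N-4 figures from the example.

```python
# Maximum number of releases a system may lag behind the current version N,
# per software category (category names are illustrative).
MAX_RELEASES_BEHIND = {
    "security": 0,       # must be on N
    "operational": 2,    # must be at least on N-2
    "non_critical": 4,   # must be at least on N-4
}


def is_current_enough(category: str, releases_behind: int) -> bool:
    """Check whether a system meets the version policy for its category."""
    if releases_behind < 0:
        raise ValueError("releases_behind cannot be negative")
    return releases_behind <= MAX_RELEASES_BEHIND[category]
```

The policy and procedures document would own a table like this; the SLA would then measure how quickly systems are brought back within it.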
Hope this is helpful.
John
In a message dated Tue, 22 Jan 2002 9:04:58 AM Eastern Standard Time,
"Martin, James E." <[EMAIL PROTECTED]> writes:
> I'd be interested in any SLA work done on security event response by an ISP
> covering the following areas:
>
> a. Defense of the network against reported outside attacks
> b. Defense of the network against attacks reported from the site contracting
>    for access
> c. Downstream network/site obligations for maintenance, patches, upkeep in
>    general
>
> I've tried a preliminary draft, based on both upstream and downstream
> obligations to respond to reported security events. The document sets out
> responsibilities and standard responses based on whether a site has any after-hours
> event response capability, and whether a site with the capability refuses action or
> declines to protect the network.
>
> What are others doing in this area?
>
> Thanks!
> Jim
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> Sent: Fri 1/18/2002 1:41 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: SLA Security
>
>
>
> A general SLA on security is kind of difficult. Generally, you want your
> SLAs to be specifically quantifiable and measurable, but it depends on the services
> that you are talking about.
>
> For example, if we were talking about anti-virus protection, you might
> have a service level for how fast the vendor implements the latest set of virus
> definitions.
>
> For security, you might have an SLA for time to implement a patch after
> the patch is made available by a relevant vendor.
>
> If your help desk SLA includes response time and problem correction
> time, then a response and resolution of a security breach or a virus could be subject
> to those SLAs.
>
> For an IDS, you could include a requirement to audit logs every certain
> period.
>
> John