Here was the image I was trying to attach: https://cwiki.apache.org/confluence/display/TC/API+Gateway
Jeremy

On Thu, May 11, 2017 at 2:14 PM, Amir Yeshurun <[email protected]> wrote:

> Hi Jeremy,
> Note that attachments seem to be stripped off on this list and the image
> is unavailable.
>
> Your assumptions are correct. We need to figure out the easiest topology
> for UI routes to bypass the GW. Please reattach the picture so we can get
> more specific.
>
> Thanks
> /amiry
>
> On Thu, May 11, 2017, 20:06 Jeremy Mitchell <[email protected]> wrote:
>
>> What is of utmost importance to me is the ability to ease into this. We
>> have a TO UI right now that needs to be unaffected by the API gateway, in
>> my opinion. Granted, the old UI might go away at some point, but until
>> that time it needs to function as-is.
>>
>> To me, the simplest approach is to key off the request URL: anything that
>> starts with /api gets API gateway treatment, the rest passes on thru.
>> Here's a fancy picture to communicate what I envision...
>>
>> [image: Inline image 1]
>>
>> I'm assuming all requests (endpoints) go thru the API gateway, but maybe
>> I'm wrong in that assumption. Anyhow, I guess my point is the UI should
>> continue to work with the mojo cookie and "api" calls should use the JWT
>> token. However, the UI also uses API endpoints, so I'm not sure how that
>> would work...
>>
>> If it's too difficult for the API gateway to support UI and API routes,
>> we could always wait until the new UI (which leverages the API) is
>> complete...
>>
>> Jeremy
>>
>> On Thu, May 11, 2017 at 10:23 AM, Chris Lemmons <[email protected]>
>> wrote:
>>
>>>> invalidate ALL tokens by changing the token signing key
>>>
>>> Interesting idea. That does mean that the signing key has to be
>>> retrieved every time from the authentication authority, or it'd be
>>> subject to the exact same set of attacks. But a nearly constant, rarely
>>> changing key could be communicated very efficiently, I suspect.
>>> And if the authentication system is a web API, it can even use
>>> If-Modified-Since to return a 304 99% of the time for maximum
>>> efficiency.
>>>
>>> It does have the downside that key-invalidation events are fairly
>>> significant. You'd need to invalidate the keys whenever someone's access
>>> was reduced or removed. As the number of accounts in the system
>>> increases, that might not wind up being as infrequent as one might hope.
>>> It's easy to implement, though.
>>>
>>> On Thu, May 11, 2017 at 10:12 AM Jeremy Mitchell <[email protected]>
>>> wrote:
>>>
>>>> Regarding the TTL on the JWT token: a 5 minute TTL seems silly. What's
>>>> the point? Unless we get into refresh tokens, but that sounds like
>>>> OAuth... blah.
>>>>
>>>> What about this (and maybe I'm oversimplifying): the TTL on the JWT
>>>> token is 24 hours. If we become aware that a token has been
>>>> compromised, invalidate ALL tokens by changing the token signing key.
>>>> Maybe this is a good idea or maybe this is a terrible idea. I have no
>>>> idea. Just a thought...
>>>>
>>>> jeremy
>>>>
>>>> On Wed, May 10, 2017 at 12:23 PM, Chris Lemmons <[email protected]>
>>>> wrote:
>>>>
>>>>> Responding to a few people:
>>>>>
>>>>>> Often times every auth action must be accompanied by DB writes for
>>>>>> audit logs or callback functions.
>>>>>
>>>>> True. But a) if logging is too expensive it should probably be made
>>>>> cheaper and b) the answer to "audits are too expensive" probably isn't
>>>>> "let's just do less authentication". If the audit log is genuinely the
>>>>> bottleneck, it would still be better to re-auth without the audit log.
>>>>>
>>>>>> The API gateway can poll for the latest list of tokens at a regular
>>>>>> interval
>>>>>
>>>>> Yeah, datastore replication for local performance is great.
>>>>> Though if you can reasonably query for a list of all valid tokens
>>>>> every second, it's probably cheaper to just query for the token you
>>>>> need every time you need it. If there are massive batches of queries
>>>>> coming through, it's probably not unreasonable to choose not to
>>>>> re-validate a token that's been validated in the last second.
>>>>>
>>>>>> Regarding maliciously delayed message or such - I don't fully
>>>>>> understand the point; if an attacker has such capabilities she can
>>>>>> simply prevent/delay devop users from updating the auth database
>>>>>> itself thus enabling the attack.
>>>>>
>>>>> In a typical attack, an attacker might gain control of a box on the
>>>>> local network, but not necessarily the Gateway, Traffic Ops, or Auth
>>>>> Server. Those are probably better hardened. But lots of networks have
>>>>> a squishy test box that everyone forgot was there or something. The
>>>>> bad guy wants to use the CDN to DOS someone, or redirect traffic to
>>>>> somewhere malicious, or just cause mayhem. The longer he can keep
>>>>> control, the better for him.
>>>>>
>>>>> So this attacker uses the local box to sniff the token off the
>>>>> network. If the communication with the Gateway is encrypted, he might
>>>>> have to do some ARP poisoning or something else to trick a host into
>>>>> talking to the local box instead. (Properly implemented TLS also
>>>>> mitigates this angle.) He knows that as soon as he starts his
>>>>> nefarious deed, alarms are going to go off, so he also uses this local
>>>>> box to DOS the Auth Server. It's a lot easier to take a box down from
>>>>> the outside than to actually gain control.
>>>>>
>>>>> If the Gateway "fails open" when it can't contact the Auth server, the
>>>>> attacker remains in control. If it "fails closed", the attacker has to
>>>>> actually compromise the auth server (which is harder) to remain in
>>>>> control.
>>>>>
>>>>>> Do we block all API calls if the auth service is temporarily down
>>>>>> (being upgraded, container restarting, etc…)?
>>>>>
>>>>> Yes, I think we have to. Authentication is integral to reliable
>>>>> operation.
>>>>>
>>>>> We've been talking in some fairly wild hypotheticals, though. Is there
>>>>> a specific auth service you're envisioning?
>>>>>
>>>>> On Wed, May 10, 2017 at 12:50 AM Shmulik Asafi <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Regarding the communication issue Chris raised - there is more than
>>>>>> one possible pattern to this, e.g.:
>>>>>>
>>>>>>    - Blacklisted tokens can be communicated via a pub-sub mechanism
>>>>>>    - The API gateway can poll for the latest list of tokens at a
>>>>>>    regular interval (which can be very short ~1sec, much shorter than
>>>>>>    the time it takes devops to detect and react to malign tokens)
>>>>>>
>>>>>> Regarding hitting the blacklist datastore - this only sounds similar
>>>>>> to hitting the auth database; but the simplicity of a blacklist
>>>>>> function allows you to employ more efficient datastores, e.g. Redis
>>>>>> or just a hashmap in the API gateway process memory.
>>>>>>
>>>>>> Regarding maliciously delayed message or such - I don't fully
>>>>>> understand the point; if an attacker has such capabilities she can
>>>>>> simply prevent/delay devop users from updating the auth database
>>>>>> itself thus enabling the attack.
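
For what it's worth, the "hashmap in the gateway process" idea and the "fails closed" requirement can be combined. Below is a rough Python sketch, not anything from the repo: the class name, `fetch_revoked`, and `max_staleness` are all made up for illustration. The gateway polls for the blacklist and refuses every token once its local copy is older than a configured limit, so DOS-ing the auth server does not keep a revoked token alive.

```python
import time
from typing import Callable, Iterable, Optional, Set


class TokenBlacklist:
    """In-process set of revoked token IDs, refreshed by polling (hypothetical)."""

    def __init__(self, fetch_revoked: Callable[[], Iterable[str]],
                 max_staleness: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_revoked          # stand-in for the auth-service call
        self._max_staleness = max_staleness  # seconds a copy may be trusted
        self._clock = clock
        self._revoked: Set[str] = set()
        self._last_refresh: Optional[float] = None

    def refresh(self) -> bool:
        """Poll the source once; return True if the local copy was updated."""
        try:
            revoked = set(self._fetch())
        except Exception:
            return False  # keep the old copy; its age keeps growing
        self._revoked = revoked
        self._last_refresh = self._clock()
        return True

    def allows(self, token_id: str) -> bool:
        """Fail closed: reject everything if the list is missing or stale."""
        if self._last_refresh is None:
            return False
        if self._clock() - self._last_refresh > self._max_staleness:
            return False
        return token_id not in self._revoked
```

A background thread or timer would call `refresh()` at the polling interval; `allows()` is the only thing on the request path.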
>>>>>>
>>>>>> On Wed, May 10, 2017 at 4:25 AM, Eric Friedrich (efriedri)
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Our current management wrapper around Traffic Control (called OMD
>>>>>>> Director, demo’d at last TC summit) uses a very similar approach to
>>>>>>> authentication.
>>>>>>>
>>>>>>> We have an auth service that issues a JWT. The JWT is then provided
>>>>>>> along with all API calls. A few comments on our practical
>>>>>>> experience:
>>>>>>>
>>>>>>> - I am a supporter of validating tokens both in the API gateway and
>>>>>>> in the service. We have several examples of services, Grafana for
>>>>>>> example, that require external authentication. Similarly, we have
>>>>>>> other services that need finer grained authentication than API
>>>>>>> Gateway policy can handle. Specifically, a given user may have
>>>>>>> permissions to view/modify some delivery services but not others.
>>>>>>> The API gateway presumably would not understand the semantics of
>>>>>>> the payload, so this decision would need to be made by auth within
>>>>>>> the service.
>>>>>>>
>>>>>>> - As brought up earlier, auth in the gateway is both a strength and
>>>>>>> a risk. An additional layer of security is also positive, but for
>>>>>>> my case of Grafana above it can present an opportunity to bypass
>>>>>>> authentication. This is a risk, but it can be mitigated by adding
>>>>>>> auth to the service where needed.
>>>>>>>
>>>>>>> - Verifying tokens on every access may potentially be a little more
>>>>>>> expensive than discussed. Often times every auth action must be
>>>>>>> accompanied by DB writes for audit logs or callback functions. Not
>>>>>>> the straw to break the camel’s back, but something to keep in mind.
>>>>>>>
>>>>>>> - There is also the problem of what to do if the underlying auth
>>>>>>> service is temporarily unavailable. Do we block all API calls if
>>>>>>> the auth service is temporarily down (being upgraded, container
>>>>>>> restarting, etc…)?
>>>>>>>
>>>>>>> - I’d like to see what we can do to use a pre-existing package as
>>>>>>> an API Gateway. As we decompose TO into microservices, something
>>>>>>> like nginx can provide additional benefits like TLS termination and
>>>>>>> load balancing between service endpoints. I’d hate to see us have
>>>>>>> to reimplement these functions later.
>>>>>>>
>>>>>>> - I’d also like to see us give some consideration to how an API
>>>>>>> gateway is deployed. We raised the bar for new users by unbundling
>>>>>>> Traffic Ops from the database and it could further complicate the
>>>>>>> installation if we don’t provide enough guidance on how to deploy
>>>>>>> the API gateway in a lab trial, if not best practices for
>>>>>>> production deployment. Should we recommend deploying it as a new
>>>>>>> RPM/systemd service, an immutable container, or as part of the
>>>>>>> existing TO RPM?
>>>>>>>
>>>>>>> -Eric
>>>>>>>
>>>>>>>> On May 9, 2017, at 5:05 PM, Chris Lemmons <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Blacklisting requires proactive communication between the
>>>>>>>> authentication system and the gateway.
>>>>>>>> Furthermore, the client can't be sure that something hasn't been
>>>>>>>> blacklisted recently (and the message lost or perhaps maliciously
>>>>>>>> delayed) unless it checks whatever system it is that does the
>>>>>>>> blacklisting. And if you're checking a datastore of some sort for
>>>>>>>> the validity of the token every time, you might as well just check
>>>>>>>> each time and skip the blacklisting step.
>>>>>>>>
>>>>>>>> On Tue, May 9, 2017 at 1:27 PM Shmulik Asafi <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> Maybe a missing link here is another component in a jwt stateless
>>>>>>>>> architecture which is *blacklisting* malign tokens when
>>>>>>>>> necessary.
>>>>>>>>> This is obviously a sort of state which needs to be handled in a
>>>>>>>>> datastore; but it's quite different and easy to scale and has
>>>>>>>>> less performance impact (I guess especially under DDOS) than
>>>>>>>>> doing full auth queries.
>>>>>>>>> I believe this should be the approach on the API Gateway roadmap
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On 9 May 2017 21:14, "Chris Lemmons" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I'll second the principle behind "start with security, optimize
>>>>>>>>>> when there's a problem".
>>>>>>>>>>
>>>>>>>>>> It seems to me that in order to maintain security, basically
>>>>>>>>>> everyone would need to dial the revalidate time so close to zero
>>>>>>>>>> that it does very little good as a cache on the credentials.
>>>>>>>>>> Otherwise, as Rob has pointed out, the TTL on your credential
>>>>>>>>>> cache is effectively "how long am I ok with hackers in control
>>>>>>>>>> after I find them". Practically, it also means that much lag on
>>>>>>>>>> adding or removing permissions. That effectively means a
>>>>>>>>>> database hit for every query, or near enough to every query as
>>>>>>>>>> not to matter.
>>>>>>>>>>
>>>>>>>>>> That said, you can get the best of multiple worlds, I think. The
>>>>>>>>>> only DB query that really has to be done is "give me the last
>>>>>>>>>> update time for this user". Compare that to the generation time
>>>>>>>>>> in the token and 99% of the time, it's the only query you need.
>>>>>>>>>> With that check, you can even use fairly long-lived tokens. If
>>>>>>>>>> anything about the user has changed, reject the token, generate
>>>>>>>>>> a new one, send that to the user and use it. The regenerate step
>>>>>>>>>> is somewhat expensive, but still well inside reasonable, I
>>>>>>>>>> think.
>>>>>>>>>>
>>>>>>>>>> On Tue, May 9, 2017 at 11:31 AM Robert Butts
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>>> The TO service (and any other service that requires auth) MUST
>>>>>>>>>>>> hit the database (or the auth service, which itself hits the
>>>>>>>>>>>> database) to verify valid tokens' users still have the
>>>>>>>>>>>> permissions they did when the token was created. Otherwise,
>>>>>>>>>>>> it's impossible to revoke tokens, e.g. if an employee quits,
>>>>>>>>>>>> or an attacker gains a token, or a user changes their
>>>>>>>>>>>> password.
>>>>>>>>>>>
>>>>>>>>>>> I'm elaborating on this, and moving a discussion from a PR
>>>>>>>>>>> review here.
>>>>>>>>>>>
>>>>>>>>>>> From the code submissions to the repo, it appears the current
>>>>>>>>>>> plan is for the API Gateway to create a JWT, and then for that
>>>>>>>>>>> JWT to be accepted by all Traffic Ops microservices, with no
>>>>>>>>>>> database authentication.
>>>>>>>>>>>
>>>>>>>>>>> It's a common misconception that JWT allows you to authenticate
>>>>>>>>>>> without hitting the database. This is an exceedingly dangerous
>>>>>>>>>>> misconception. If you don't check the database when every
>>>>>>>>>>> authenticated route is requested, it's impossible to revoke
>>>>>>>>>>> access. In practice, this means the JWT TTL becomes the length
>>>>>>>>>>> of time _after you discover an attacker is manipulating your
>>>>>>>>>>> production system_, before it's _possible_ to evict them.
>>>>>>>>>>>
>>>>>>>>>>> How long do you feel is acceptable to have a hacker in and
>>>>>>>>>>> manipulating your system, after you discover them? A day? An
>>>>>>>>>>> hour? Five minutes? Whatever your TTL, that's the length of
>>>>>>>>>>> time you're willing to allow a hacker to steal and destroy your
>>>>>>>>>>> and your customers' data. Worse, because this is a CDN, it's
>>>>>>>>>>> the length of time you're willing to allow your CDN to be used
>>>>>>>>>>> to DDOS a target.
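
One way to keep the per-request database check cheap is the single lookup Chris suggests above: compare the token's issue time with the user's last-update time. A minimal Python sketch, assuming the JWT signature has already been verified and that the payload carries the standard `sub`, `iat` and `exp` claims. `get_user_updated_at` is a hypothetical stand-in for whatever query the auth service would expose, and the function name is made up too.

```python
import time
from typing import Any, Callable, Mapping, Optional


def token_still_valid(claims: Mapping[str, Any],
                      get_user_updated_at: Callable[[str], Optional[float]],
                      now: Optional[float] = None) -> bool:
    """Return True only if the token is unexpired and no user change postdates it.

    `claims` is the payload of a JWT whose signature was already verified.
    `get_user_updated_at` (hypothetical) returns the Unix time of the user's
    last change (role, password, deactivation), or None if the user is gone.
    """
    now = time.time() if now is None else now
    try:
        user = str(claims["sub"])
        issued_at = float(claims["iat"])
        expires_at = float(claims["exp"])
    except (KeyError, TypeError, ValueError):
        return False  # malformed payload: reject
    if now >= expires_at:
        return False
    updated_at = get_user_updated_at(user)
    if updated_at is None:
        return False  # user was removed
    # Any change made after the token was issued invalidates it.
    return updated_at <= issued_at
```

With this check in place the TTL only bounds how long an untouched account can go without re-login; revocation takes effect on the next request after the user record changes.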
>>>>>>>>>>>
>>>>>>>>>>> Are you going to explain in court that the DDOS your system
>>>>>>>>>>> executed lasted 24 hours, or 1 hour, or 10 minutes after you
>>>>>>>>>>> discovered it, because that's the TTL you hard-coded? Are you
>>>>>>>>>>> going to explain to a judge and prosecuting attorney exactly
>>>>>>>>>>> which sensitive data was stolen in the ten minutes after you
>>>>>>>>>>> discovered the attacker in your system, before their JWT
>>>>>>>>>>> expired?
>>>>>>>>>>>
>>>>>>>>>>> If you're willing to accept the legal consequences, that's your
>>>>>>>>>>> business. Apache Traffic Control should not require users to
>>>>>>>>>>> accept those consequences, and ideally shouldn't make it
>>>>>>>>>>> possible, as many users won't understand the security risks.
>>>>>>>>>>>
>>>>>>>>>>> The argument has been made "authorization does not check the
>>>>>>>>>>> database to avoid congestion" -- Has anyone tested this in
>>>>>>>>>>> practice? The database query itself is 50ms. Assuming your
>>>>>>>>>>> database and service are 2500km apart, that's another 50ms
>>>>>>>>>>> network latency. Traffic Ops has endpoints that take 10s to
>>>>>>>>>>> generate. Worst-case scenario, this will double the time of
>>>>>>>>>>> tiny endpoints to 200ms, and increase large endpoints
>>>>>>>>>>> inconsequentially. It's highly unlikely performance is an issue
>>>>>>>>>>> in practice.
>>>>>>>>>>>
>>>>>>>>>>> As Jan said, we can still have the services check the auth as
>>>>>>>>>>> well after the proxy auth.
>>>>>>>>>>> Moreover, the services don't even have to know about the auth
>>>>>>>>>>> service; they can hit a mapped route on the API Gateway, which
>>>>>>>>>>> gives us better modularisation and separation of concerns.
>>>>>>>>>>>
>>>>>>>>>>> It's not difficult: it can be a trivial endpoint on the auth
>>>>>>>>>>> service, remapped in the API Gateway, which takes the JWT token
>>>>>>>>>>> and returns true if it's still authorized in the database. To
>>>>>>>>>>> be clear, this is not a problem today. Traffic Ops still uses
>>>>>>>>>>> the Mojolicious cookie today, so this would only need to be
>>>>>>>>>>> done if and when we remove that, or if we move authorized
>>>>>>>>>>> endpoints out of Traffic Ops into their own microservices.
>>>>>>>>>>>
>>>>>>>>>>> Considering the significant security and legal risks, we should
>>>>>>>>>>> always hit the database to validate requests of authorized
>>>>>>>>>>> endpoints, and reconsider if and when someone observes
>>>>>>>>>>> performance issues in practice.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 9, 2017 at 6:56 AM, Dewayne Richardson
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If only the API GW authenticates/authorizes we also have a
>>>>>>>>>>>> single point of entry to test for security instead of having
>>>>>>>>>>>> it sprinkled across services in different ways. It also
>>>>>>>>>>>> simplifies the code on the service side and makes them easier
>>>>>>>>>>>> to test with automation.
>>>>>>>>>>>>
>>>>>>>>>>>> -Dew
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, May 8, 2017 at 8:42 AM, Robert Butts
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>> couldn't make nginx or http do what we need.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was suggesting a different architecture. Not making the
>>>>>>>>>>>>> proxy do auth, only standard proxying.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> We can still have the services check the auth as well after
>>>>>>>>>>>>>> the proxy auth
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, May 8, 2017 at 3:36 AM, Amir Yeshurun
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me elaborate some more on the purpose of the API GW. I
>>>>>>>>>>>>>> will put up a wiki page following our discussions here.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Main purpose is to allow innovation by creating new services
>>>>>>>>>>>>>> that handle TO functionality, not as a part of the
>>>>>>>>>>>>>> monolithic Mojo app.
>>>>>>>>>>>>>> The long term vision is to de-compose TO into multiple
>>>>>>>>>>>>>> microservices, allowing new functionality to be easily
>>>>>>>>>>>>>> added.
>>>>>>>>>>>>>> Indeed, the goal is to eventually deprecate the current AAA
>>>>>>>>>>>>>> model, and replace it with the new AAA model currently under
>>>>>>>>>>>>>> work (user-roles, role-capabilities)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think that handling authorization in the API layer is a
>>>>>>>>>>>>>> valid approach.
>>>>>>>>>>>>>> Security wise, I don't see much difference between that, and
>>>>>>>>>>>>>> having each module access the auth service, as long as the
>>>>>>>>>>>>>> auth service is deployed in the backend.
>>>>>>>>>>>>>> Having another proxy (nginx?) fronting the world and
>>>>>>>>>>>>>> forwarding all requests to the backend GW mitigates the risk
>>>>>>>>>>>>>> of compromising the authorization service.
>>>>>>>>>>>>>> However, as mentioned above, we can still have the services
>>>>>>>>>>>>>> check the auth as well after the proxy auth.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is a standalone process, completely optional at this
>>>>>>>>>>>>>> point. One can choose to deploy it in order to allow
>>>>>>>>>>>>>> integration with additional services. Deployment and
>>>>>>>>>>>>>> management are still T.B.D, and feedback on this is most
>>>>>>>>>>>>>> welcome.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regarding token validation and revocation:
>>>>>>>>>>>>>> Tokens have an expiration time. Expired tokens do not pass
>>>>>>>>>>>>>> token validation.
>>>>>>>>>>>>>> In production, expiration should be set to a relatively
>>>>>>>>>>>>>> short time, say 5 minutes.
>>>>>>>>>>>>>> This way revocation is automatic. Re-authentication is
>>>>>>>>>>>>>> handled via refresh tokens (not implemented yet). Hitting
>>>>>>>>>>>>>> the DB upon every API call causes congestion on the users
>>>>>>>>>>>>>> DB.
>>>>>>>>>>>>>> To avoid that, we chose to have all user information
>>>>>>>>>>>>>> self-contained inside the JWT.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> /amiry
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, May 8, 2017 at 5:42 AM Jan van Doorn
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It's the reverse proxy we've discussed for the "micro
>>>>>>>>>>>>>>> services" version for a while now (as in
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/TC/Design+Overview+v3.0
>>>>>>>>>>>>>>> ).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, May 7, 2017 at 7:22 PM Eric Friedrich (efriedri)
>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> From a higher level: what is the purpose of the API
>>>>>>>>>>>>>>>> Gateway? It seems like there may have been some previous
>>>>>>>>>>>>>>>> discussions about API Gateway. Are there any notes or
>>>>>>>>>>>>>>>> description that I can catch up on?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> How will it be deployed? (Is it a standalone service or
>>>>>>>>>>>>>>>> something that runs inside the experimental Traffic Ops)?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is this new component required or optional?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Eric
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On May 7, 2017, at 8:28 PM, Jan van Doorn
>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I looked into this a year or so ago, and I couldn't make
>>>>>>>>>>>>>>>>> nginx or http do what we need.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We can still have the services check the auth as well
>>>>>>>>>>>>>>>>> after the proxy auth, and make things better than today,
>>>>>>>>>>>>>>>>> where we have the same problem that if the TO mojo app is
>>>>>>>>>>>>>>>>> compromised, everything is compromised.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we always route to TO, we don't untangle the mess of
>>>>>>>>>>>>>>>>> being dependent on the monolithic TO for everything. Many
>>>>>>>>>>>>>>>>> services today, and more in the future, really just need
>>>>>>>>>>>>>>>>> a check to see if the user is authorized, and nothing
>>>>>>>>>>>>>>>>> more.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, May 7, 2017 at 11:55 AM Robert Butts
>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What are the advantages of these config files, over an
>>>>>>>>>>>>>>>>>> existing reverse proxy, like Nginx or httpd? It's just
>>>>>>>>>>>>>>>>>> as much work as configuring and deploying an existing
>>>>>>>>>>>>>>>>>> product, but more code we have to write and maintain.
>>>>>>>>>>>>>>>>>> I'm having trouble seeing the advantage.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -1 on auth rules as a part of the proxy. Making a proxy
>>>>>>>>>>>>>>>>>> care about auth violates the Single Responsibility
>>>>>>>>>>>>>>>>>> Principle, and further, is a security risk. It creates
>>>>>>>>>>>>>>>>>> unnecessary attack surface. If your proxy app or server
>>>>>>>>>>>>>>>>>> is compromised, the entire framework is now compromised.
>>>>>>>>>>>>>>>>>> An attacker could simply rewrite the proxy config to
>>>>>>>>>>>>>>>>>> make all routes no-auth.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The simple alternative is for the proxy to always route
>>>>>>>>>>>>>>>>>> to TO, and TO checks the token against the auth service
>>>>>>>>>>>>>>>>>> (which may also be proxied), and redirects unauthorized
>>>>>>>>>>>>>>>>>> requests to a login endpoint (which may also be
>>>>>>>>>>>>>>>>>> proxied).
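
The service-side flow described in that last paragraph (ask the auth service about the token, redirect to login when the answer is no) is small enough to sketch in Python. Everything here is illustrative: `is_authorized` stands in for the call to the auth service, the function name and return shape are made up, and `/login` is borrowed from the rules.json example quoted further down.

```python
from typing import Callable, Optional, Tuple


def authorize_request(path: str,
                      token: Optional[str],
                      is_authorized: Callable[[str, str], bool],
                      login_path: str = "/login") -> Tuple[int, str]:
    """Decide what the service should do with an incoming request.

    Returns (200, path) to serve the request, or (302, login_path) to
    redirect a caller who is missing a token or is not authorized.
    """
    if path == login_path:
        return 200, path  # the login endpoint itself needs no token
    if token is None:
        return 302, login_path
    try:
        allowed = is_authorized(token, path)  # hypothetical auth-service call
    except Exception:
        allowed = False  # auth service unreachable: fail closed
    return (200, path) if allowed else (302, login_path)
```

The proxy in this arrangement does nothing but forward; the decision stays in the service, so rewriting the proxy config cannot turn a protected route into an open one.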
> >> > > > > >>>>>>>>>>> > >> > > > > >>>>>>>>>>> The TO service (and any other service that requires > >> auth) > >> > > > > >>> MUST > >> > > > > >>>>> hit > >> > > > > >>>>>>> the > >> > > > > >>>>>>>>>>> database (or the auth service, which itself hits the > >> > > > > >>> database) > >> > > > > >>>>> to > >> > > > > >>>>>>>> verify > >> > > > > >>>>>>>>>>> valid tokens' users still have the permissions they > >> did > >> > > > > >> when > >> > > > > >>>> the > >> > > > > >>>>>>> token > >> > > > > >>>>>>>>> was > >> > > > > >>>>>>>>>>> created. Otherwise, it's impossible to revoke > tokens, > >> > e.g. > >> > > > > >>> if > >> > > > > >>>> an > >> > > > > >>>>>>>>> employee > >> > > > > >>>>>>>>>>> quits, or an attacker gains a token, or a user > changes > >> > > > > >> their > >> > > > > >>>>>>> password. > >> > > > > >>>>>>>>>>> > >> > > > > >>>>>>>>>>> > >> > > > > >>>>>>>>>>> On Sun, May 7, 2017 at 4:35 AM, Amir Yeshurun < > >> > > > > >>>> [email protected]> > >> > > > > >>>>>>>> wrote: > >> > > > > >>>>>>>>>>> > >> > > > > >>>>>>>>>>>> Seems that attachments are stripped on this list. 
> >> > > > > >> Examples > >> > > > > >>>>> pasted > >> > > > > >>>>>>>> below > >> > > > > >>>>>>>>>>>> > >> > > > > >>>>>>>>>>>> *rules.json* > >> > > > > >>>>>>>>>>>> [ > >> > > > > >>>>>>>>>>>> { "host": "localhost", "path": "/login", > >> > > > > >>>>>>> "forward": > >> > > > > >>>>>>>>>>>> "localhost:9004", "scheme": "https", "auth": false > }, > >> > > > > >>>>>>>>>>>> { "host": "localhost", "path": > >> "/api/1.2/innovation/", > >> > > > > >>>>>>> "forward": > >> > > > > >>>>>>>>>>>> "localhost:8004", "scheme": "http", "auth": true, > >> > > > > >>>>> "routes-file": > >> > > > > >>>>>>>>>>>> "innovation.json" }, > >> > > > > >>>>>>>>>>>> { "host": "localhost", "path": "/api/1.2/", > >> > > > > >>>>>>> "forward": > >> > > > > >>>>>>>>>>>> "localhost:3000", "scheme": "http", "auth": true, > >> > > > > >>>>> "routes-file": > >> > > > > >>>>>>>>>>>> "traffic-ops-routes.json" }, > >> > > > > >>>>>>>>>>>> { "host": "localhost", "path": > >> "/internal/api/1.2/", > >> > > > > >>>>>>> "forward": > >> > > > > >>>>>>>>>>>> "localhost:3000", "scheme": "http", "auth": true, > >> > > > > >>>>> "routes-file": > >> > > > > >>>>>>>>>>>> "internal-routes.json" } > >> > > > > >>>>>>>>>>>> ] > >> > > > > >>>>>>>>>>>> > >> > > > > >>>>>>>>>>>> *traffic-ops-routes.json (partial)* > >> > > > > >>>>>>>>>>>> . > >> > > > > >>>>>>>>>>>> . > >> > > > > >>>>>>>>>>>> . 
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/health", "auth": { "GET": ["cdn-health-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/capacity", "auth": { "GET": ["cdn-health-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/usage/overview", "auth": { "GET": ["cdn-stats-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/name/dnsseckeys/generate", "auth": { "GET": ["cdn-security-keys-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/name/[^\/]+/?", "auth": { "GET": ["cdn-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/name/[^\/]+/sslkeys", "auth": { "GET": ["cdn-security-keys-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/name/[^\/]+/dnsseckeys", "auth": { "GET": ["cdn-security-keys-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/name/[^\/]+/dnsseckeys/delete", "auth": { "GET": ["cdn-security-keys-write"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/[^\/]+/queue_update", "auth": { "POST": ["queue-updates-write"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/[^\/]+/snapshot", "auth": { "PUT": ["cdn-config-snapshot-write"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/[^\/]+/health", "auth": { "GET": ["cdn-health-read"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns/[^\/]+/?", "auth": { "GET": ["cdn-read"], "PUT": ["cdn-write"], "PATCH": ["cdn-write"], "DELETE": ["cdn-write"] }},
> >> > > > > >>>>>>>>>>>>     { "match": "/cdns", "auth": { "GET": ["cdn-read"], "POST": ["cdn-write"] }},
> >> > > > > >>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>> .
> >> > > > > >>>>>>>>>>>> .
> >> > > > > >>>>>>>>>>>> .
> >> > > > > >>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>> On Sun, May 7, 2017 at 12:39 PM Amir Yeshurun <[email protected]> wrote:
> >> > > > > >>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>> Attached please find examples for the forwarding rules file (rules.json)
> >> > > > > >>>>>>>>>>>>> and the authorization rules file (traffic-ops-routes.json)
> >> > > > > >>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>> On Sun, May 7, 2017 at 10:39 AM Amir Yeshurun <[email protected]> wrote:
> >> > > > > >>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> Hi all,
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> I am about to submit a PR with a first operational version of the API GW,
> >> > > > > >>>>>>>>>>>>>> to the "experimental" code base.
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> The API GW forwarding logic is as follows:
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>>    1. Find host to forward the request: Prefix match on the request path
> >> > > > > >>>>>>>>>>>>>>    against a list of forwarding rules. The matched forwarding rule defines
> >> > > > > >>>>>>>>>>>>>>    the target's host, and the target's *authorization rules*.
> >> > > > > >>>>>>>>>>>>>>    2. Authorization: Regex match on the request path against a list of
> >> > > > > >>>>>>>>>>>>>>    *authorization rules*. The matched rule defines the required capabilities
> >> > > > > >>>>>>>>>>>>>>    to perform the HTTP method on the route. These capabilities are compared
> >> > > > > >>>>>>>>>>>>>>    against the user's capabilities in the user's JWT.
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> At this moment, the 2 sets of rules are hard-coded in JSON files. The
> >> > > > > >>>>>>>>>>>>>> files are provided with the API GW distribution and contain definitions
> >> > > > > >>>>>>>>>>>>>> for TC 2.0 API routes. I have tested parts of the API, however, there
> >> > > > > >>>>>>>>>>>>>> might be mistakes in some of the routes. Please be warned.
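
A minimal sketch of the two-step logic described above, not the actual API GW code. The rules are trimmed from the rules.json and traffic-ops-routes.json examples in this thread. Several details are assumptions the thread does not confirm: the first matching rule wins in both steps, the authorization regex is applied to the path with the forwarding prefix stripped and must match the whole remainder, and the user's capabilities have already been extracted from a verified JWT. The names `find_forwarding_rule` and `authorize` are hypothetical.

```python
import re

# Forwarding rules, trimmed from the rules.json example in this thread.
FORWARDING_RULES = [
    {"path": "/login", "forward": "localhost:9004", "scheme": "https", "auth": False},
    {"path": "/api/1.2/", "forward": "localhost:3000", "scheme": "http", "auth": True,
     "routes-file": "traffic-ops-routes.json"},
]

# Authorization rules per routes file, trimmed from traffic-ops-routes.json.
ROUTES = {
    "traffic-ops-routes.json": [
        {"match": r"/cdns/[^/]+/health", "auth": {"GET": ["cdn-health-read"]}},
        {"match": r"/cdns/[^/]+/?", "auth": {"GET": ["cdn-read"], "PUT": ["cdn-write"]}},
        {"match": r"/cdns", "auth": {"GET": ["cdn-read"], "POST": ["cdn-write"]}},
    ],
}


def find_forwarding_rule(path):
    """Step 1: return the first forwarding rule whose path prefixes the request path."""
    for rule in FORWARDING_RULES:
        if path.startswith(rule["path"]):
            return rule
    return None


def authorize(method, path, user_capabilities):
    """Return the forward target URL if the request is allowed, else None.

    user_capabilities is assumed to come from an already verified JWT.
    """
    rule = find_forwarding_rule(path)
    if rule is None:
        return None  # no forwarding rule: the gateway does not serve this path
    target = f'{rule["scheme"]}://{rule["forward"]}'
    if not rule["auth"]:
        return target  # e.g. /login needs no token
    # Step 2 (assumption): match the path relative to the forwarding prefix,
    # anchored at both ends, first matching route wins.
    remainder = "/" + path[len(rule["path"]):]
    for route in ROUTES[rule["routes-file"]]:
        if re.fullmatch(route["match"], remainder):
            required = route["auth"].get(method)
            if required is None:
                return None  # HTTP method not defined for this route
            # Every required capability must be present in the user's set.
            return target if set(required) <= set(user_capabilities) else None
    return None  # no authorization rule matched
```

With these rules, `authorize("GET", "/api/1.2/cdns/foo/health", ["cdn-health-read"])` yields the Traffic Ops target, while the same request with an empty capability list is rejected. Because the first match wins in this sketch, more specific routes have to be listed before general ones, as they are in the example files.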
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> Considering manageability and high availability, I am aware that using
> >> > > > > >>>>>>>>>>>>>> local files for storing the set of authorization rules is inferior to
> >> > > > > >>>>>>>>>>>>>> centralized configuration.
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> We are considering different approaches for centralized configuration,
> >> > > > > >>>>>>>>>>>>>> having the following points in mind:
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>>    - Microservice world: API GW will front multiple services, not only
> >> > > > > >>>>>>>>>>>>>>    Mojo. It can also front other TC components like Traffic Stats and
> >> > > > > >>>>>>>>>>>>>>    Traffic Monitor. Each service defines its own routes and capabilities.
> >> > > > > >>>>>>>>>>>>>>    Here comes the question of what is the "source of truth" for the route
> >> > > > > >>>>>>>>>>>>>>    definitions.
> >> > > > > >>>>>>>>>>>>>>    - Handling private routes. API GW may front non-TC services.
> >> > > > > >>>>>>>>>>>>>>    - User changes to the AAA scheme. The ability for an admin user to make
> >> > > > > >>>>>>>>>>>>>>    changes in the required capabilities of a route, maybe even define new
> >> > > > > >>>>>>>>>>>>>>    capability names, was raised in the past as a use case that should be
> >> > > > > >>>>>>>>>>>>>>    supported.
> >> > > > > >>>>>>>>>>>>>>    - Easy development and deployment of new services.
> >> > > > > >>>>>>>>>>>>>>    - Using TO DB for expediency.
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> I would appreciate any feedback and views on your approach to manage
> >> > > > > >>>>>>>>>>>>>> route definitions.
> >> > > > > >>>>>>>>>>>>>>
> >> > > > > >>>>>>>>>>>>>> Thanks
> >> > > > > >>>>>>>>>>>>>> /amiry
> >> > > > > >>>>>>>>>>>>>>
> >> > > > --
> >> > > > *Shmulik Asafi*
> >> > > > Qwilt | Work: +972-72-2221692 | Mobile: +972-54-6581595 | [email protected]
