All -
I have some really interesting work going on in a feature branch that
enables the creation of a command line session for interacting with Knox
through KnoxShell.
It is similar to Kerberos in that we have init and destroy commands for
managing the cached token and a list command to show the details of the
cached token.
The token is cached in a hidden file, protected by file permissions, that
contains a JSON string that looks like:
bash-3.2$ cat ~/.knoxtokencache
{"access_token":"eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJndWVzdCIsImF1ZCI6InRva2VuYmFzZWQiLCJpc3MiOiJLTk9YU1NPIiwiZXhwIjoxNDg2NTM4MTgzfQ.GKHedGqBOuX3SgEFhOm00M8p3JyIOug8Jup4g4duzgB71dWp2lHYK1I0Q_LdnQUYOE0vZ4hoRoC9xHFKzxLHqBsptqeziGHbxHVhBuvGuFLAuB8HlXblvrAJe52vtHJx-9FxDoEgeJodlyFgHku_K2HOPuFkVeTsg9Wqx0j2V38","token_type":"Bearer ","expires_in":1486538183183}
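To make the shape of that file concrete, here is a rough sketch of how a script could check whether a cached token has expired. It assumes that expires_in is an absolute expiry time in epoch milliseconds, which is what the sample value suggests (it lines up with the exp claim inside the token, which is in seconds). The isExpired helper and the shortened token are made up for illustration and are not part of KnoxShell.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: report whether a cached token has expired.
// Assumption: expires_in holds an absolute expiry time in epoch milliseconds.
def isExpired(String cacheJson, long nowMillis) {
    def cache = new JsonSlurper().parseText(cacheJson)
    return nowMillis >= (cache.expires_in as long)
}

// Same layout as ~/.knoxtokencache, with a shortened placeholder token.
def sample = '{"access_token":"abc","token_type":"Bearer ","expires_in":1486538183183}'

println isExpired(sample, 1486538183182L)              // one millisecond before expiry
println isExpired(sample, System.currentTimeMillis())  // checked against the current time
```

In a real script the JSON would be read from ~/.knoxtokencache rather than from an inline string.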
The programming model that this enables is pretty nice. For instance, the
following script uses the token credential collector and a new header-based
transport to provide the bearer token.
import groovy.json.JsonSlurper
import java.util.HashMap
import java.util.Map
import org.apache.hadoop.gateway.shell.Credentials
import org.apache.hadoop.gateway.shell.Hadoop
import org.apache.hadoop.gateway.shell.hdfs.Hdfs

gateway = "https://localhost:8443/gateway/tokenbased"

// Collect the cached token through the KnoxToken credential collector
credentials = new Credentials()
credentials.add("KnoxToken", "none: ", "token")
credentials.collect()
token = credentials.get("token").string()

// Pass the token as a bearer token using the header-based transport
headers = new HashMap()
headers.put("Authorization", "Bearer " + token)
session = Hadoop.login( gateway, headers )

// List the directory given as the first argument, defaulting to the root
if (args.length > 0) {
  dir = args[0]
} else {
  dir = "/"
}

text = Hdfs.ls( session ).dir( dir ).now().string
json = (new JsonSlurper()).parseText( text )
println json.FileStatuses.FileStatus.pathSuffix
//println json

session.shutdown()
It has occurred to me that we could have the result of the token request
include the URL that the token can be used with, by configuring that URL in
the KnoxToken service that provides the token. If we do that, then there is
no reason to hardcode the URL or require it to be passed via args or an
environment variable. You would simply have an active command line session
for using a particular URL.
Question is...
Should we require the user to explicitly provide the URL to avoid
inadvertently affecting the wrong cluster?
I can argue either side of it.
If you keep the token session lifetime reasonably short, then the token will
have expired by the time you mentally switch context.
If the URL were hardcoded in the script instead, you could end up with the
same wrong-cluster problem, with no expiry to protect you.
I am leaning toward adding the URL to the config of the KnoxToken service
and then to the JSON file that is stored in ~/.knoxtokencache, so that the
script can pull the URL out of the credential collector along with the
token itself.
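As a rough sketch of how a script might use this while still letting the user be explicit: an explicitly supplied URL wins, and the cached one is only a fallback. The field name "target_url", the resolveGateway helper and the otherhost URL are all made up for illustration. Nothing here is an existing KnoxShell API, and the real version would presumably go through the credential collector rather than parse the JSON directly.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: choose the gateway URL for the session.
// An explicit URL takes precedence; otherwise use the URL cached with the token.
// "target_url" is an assumed field name, not one the KnoxToken service emits today.
def resolveGateway(String cacheJson, String explicitUrl) {
    if (explicitUrl) {
        return explicitUrl
    }
    def cache = new JsonSlurper().parseText(cacheJson)
    if (!cache.target_url) {
        throw new IllegalStateException("no gateway URL given and none cached")
    }
    return cache.target_url
}

// Cache contents with the proposed extra field and a shortened placeholder token.
def cached = '{"access_token":"abc","token_type":"Bearer ","expires_in":1486538183183,' +
             '"target_url":"https://localhost:8443/gateway/tokenbased"}'

println resolveGateway(cached, null)                                      // falls back to the cached URL
println resolveGateway(cached, "https://otherhost:8443/gateway/sandbox")  // explicit URL wins
```

Failing loudly when neither source provides a URL keeps a script from silently talking to an unintended cluster.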
Thoughts?
--larry