That parameter controls encryption of the RPC traffic when a client connects.
I have never checked, but if you use it without Kerberos you should get
encryption without authentication. So, for example, if you do:
hdfs dfs -touch /tmp/myfile
the file will be owned by the user running the hdfs command; that
user must exist on all machines.
The only difference compared with a non-secure configuration will be
that the RPC traffic between client and server will be encrypted using
symmetric encryption.
I'm not a security expert, but I think (*I'm theorizing*) that in this
case there is a security risk in the handshake of the initial
connection, so you should run the command this way only from a
trusted host.
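
For reference, a minimal sketch of where the setting would go, assuming a
stock core-site.xml and simple (non-Kerberos) authentication. The file name
core-site-fragment.xml is only a placeholder for illustration, and I have
not verified that Hadoop actually applies the protection level without
Kerberos, which is exactly the open question here:

```shell
# Write a property fragment to a scratch file (hypothetical name); the two
# <property> elements would be merged by hand into core-site.xml on the
# client and on the servers.
cat > core-site-fragment.xml <<'EOF'
<!-- Assumption: no Kerberos, so authentication stays at "simple" -->
<property>
  <name>hadoop.security.authentication</name>
  <value>simple</value>
</property>
<!-- Accepted values: authentication, integrity, privacy -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>
EOF

# Show the fragment for review before merging it.
cat core-site-fragment.xml
```

Once merged, `hdfs getconf -confKey hadoop.rpc.protection` should print the
value the configuration resolves to. That only confirms the setting is
read, not that the traffic is really encrypted.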
--
Antonio
On 10/02/20 23:52, Daniel Howard wrote:
On Wed, Feb 5, 2020 at 11:29 PM Antonio A. Rendina
<arendin...@gmail.com> wrote:
I never configured the access token, so I don't know how it works,
but I think that you should also set:
hadoop.rpc.protection=privacy
Question: can one use *hadoop.rpc.protection=privacy* /without/ Kerberos?
In theory, SASL can be used independently of Kerberos, but I haven't
found an example for doing this with Hadoop, yet.
As for my original question ... after I enabled encryption for
block data transfer I did some filesystem benchmarks and
there was enough of a performance impact to let me know that /something/
was afoot. :)
-danny
--
http://dannyman.toldme.com