[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418810#comment-13418810
 ] 

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

bq. The javadoc style for 'Returns BLAH' and then '@return BLAH' is Sun javadoc 
style.
Ew. That's disgusting. Oh well.

bq. the ReloadingX509TrustManager will work with an empty keystore if the 
keystore file is not available at initialization time, and if the keystore file 
becomes available later on, it will be loaded. WARNs are logged while the file 
is not present, so it won't go unnoticed.

WARNs in the logs are often not noticed. Don't you think it's simpler to just 
fail if the configured file is not present? If someone configures this and doesn't create 
the file (or the file is unreadable due to a permissions error), I think it's 
friendlier to fail fast. Otherwise they'll just end up seeing strange 
downstream issues like client certs not being properly trusted, which will be 
more difficult to root-cause back to the trust store configuration without log 
spelunking.
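
Roughly what I have in mind, as an untested sketch (a standalone stand-in, not 
the patch's ReloadingX509TrustManager):

{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

/**
 * Untested sketch of the fail-fast behavior, not the patch's
 * ReloadingX509TrustManager: refuse to start if the configured trust store
 * is missing, unreadable, or corrupt.
 */
class FailFastTrustStoreLoader {

  static KeyStore load(String type, String location, char[] password)
      throws IOException, GeneralSecurityException {
    File file = new File(location);
    if (!file.isFile() || !file.canRead()) {
      throw new FileNotFoundException(
          "Configured trust store " + location + " is missing or unreadable");
    }
    KeyStore ks = KeyStore.getInstance(type);
    FileInputStream in = new FileInputStream(file);
    try {
      ks.load(in, password);  // also fails fast on a corrupt store
    } finally {
      in.close();
    }
    return ks;
  }
}
{code}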

bq. If reload() fails to reload the new keystore, it assumes there are no 
certs and runs empty until the next reload attempt. That seems a safer 
assumption than continuing to run with obsolete keys.

My worry here is that people might be using a conf management system to push 
out the key store files. If the reload happens to trigger right in the middle 
of a conf mgmt update, and the update is non-atomic, it will see an invalid 
keystore. I wouldn't want the TT to revert to an empty key store until the next 
reload interval in that case.
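
In other words, on a bad reload I'd rather keep serving the last good trust 
store than drop to an empty one. Untested sketch of that behavior (my names, 
not the patch's):

{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.security.KeyStore;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

/**
 * Untested sketch (my names, not the patch's): on a failed reload, keep
 * serving the last good trust manager rather than falling back to an
 * empty one.
 */
class KeepLastGoodTrustManager {
  private volatile X509TrustManager current;

  void reload(File trustStore, String type, char[] password) {
    try {
      KeyStore ks = KeyStore.getInstance(type);
      FileInputStream in = new FileInputStream(trustStore);
      try {
        ks.load(in, password);
      } finally {
        in.close();
      }
      TrustManagerFactory tmf = TrustManagerFactory.getInstance(
          TrustManagerFactory.getDefaultAlgorithm());
      tmf.init(ks);
      current = (X509TrustManager) tmf.getTrustManagers()[0];
    } catch (Exception e) {
      // A half-written file from a non-atomic conf push lands here: a real
      // implementation would WARN, but 'current' keeps pointing at the last
      // good trust store until the next reload interval.
    }
  }

  X509TrustManager getTrustManager() {
    return current;
  }
}
{code}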

bq. While hadoop.ssl.enabled only applies to shuffle, the intention is to use 
it for the rest of the HTTP endpoints. Thus, a single knob would enable SSL. 
Hence the name of the property and its location (in core-default.xml).

Given that it doesn't currently affect the other HTTP endpoints, I find this 
very confusing. Why not make a separate config for now, and then, once it 
affects more than just the shuffle, change the default for 
{{mapred.shuffle.use.ssl}} to {{${hadoop.use.ssl}}} so it picks up the 
system-wide default?
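
For illustration only (neither property is in the current patch; they're just 
the hypothetical names from the suggestion above), Configuration's ${...} 
expansion already does the wiring:

{code:java}
import org.apache.hadoop.conf.Configuration;

/**
 * Illustration only; both property names are hypothetical. Configuration
 * expands ${...} at read time, so the shuffle-specific key would track the
 * cluster-wide flag unless a job explicitly overrides it.
 */
public class ShuffleSslFlagExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("hadoop.use.ssl", "true");                      // system-wide knob
    conf.set("mapred.shuffle.use.ssl", "${hadoop.use.ssl}"); // future default value
    // prints "true": the variable is resolved against hadoop.use.ssl
    System.out.println(conf.getBoolean("mapred.shuffle.use.ssl", false));
  }
}
{code}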

bq. In TestSSLFactory, the Assert.fail() statements are in sections the test 
should not reach; they are used for negative tests.
I get that. But if the test breaks, you'll end up with a meaningless failure 
instead of a message explaining why it failed. If you let the exception fall 
through, then the failed unit test would actually have a stack trace that 
explains why it failed, which aids in debugging.
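
To make that concrete, here's a generic JUnit illustration, not the actual 
TestSSLFactory code:

{code:java}
import org.junit.Assert;
import org.junit.Test;

/** Generic JUnit illustration of the point above, not the actual TestSSLFactory. */
public class FailureReportingTest {

  // Loses information: any exception becomes a bare, message-less failure and
  // the original stack trace is thrown away.
  @Test
  public void opaqueFailure() {
    try {
      codeUnderTest();
    } catch (Exception ex) {
      Assert.fail();
    }
  }

  // Lets the exception fall through: the test report carries the real
  // exception and its stack trace, which says why it failed.
  @Test
  public void transparentFailure() throws Exception {
    codeUnderTest();
  }

  // Stand-in for the real SSLFactory call being exercised.
  private void codeUnderTest() throws Exception {
  }
}
{code}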

bq. Client certs are disabled by default. If they are per job, yes, they could 
be shipped via DC. This would require an alternate implementation of the 
KeyStoresFactory, and the mechanism for that is already in place.
Does it need an alternate implementation? The distributed cache files can be 
put on the classpath already, in which case the existing keystore-loading code 
should be able to find them. The only change would be in the documentation -- 
explaining that the client should ship the files via distributed cache rather 
than putting them in HADOOP_CONF_DIR. Why wouldn't that be enough?
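
Something along these lines on the submit side (paths made up; assuming the 
keystore location in the SSL config can be resolved against the task 
classpath / working directory rather than HADOOP_CONF_DIR):

{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

/**
 * Sketch of the submit side only; the paths are made up, and this assumes the
 * keystore location in the SSL config resolves against the task classpath /
 * working directory rather than HADOOP_CONF_DIR.
 */
public class ShipKeystoresViaDistributedCache {
  public static void configure(Job job) throws Exception {
    // Ship per-job stores through the distributed cache and put them on the
    // task classpath; no alternate KeyStoresFactory needed.
    job.addFileToClassPath(new Path("/user/alice/ssl/client-keystore.jks"));
    job.addFileToClassPath(new Path("/user/alice/ssl/client-truststore.jks"));
  }
}
{code}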
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, 
> MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, 
> MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, 
> MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, 
> MAPREDUCE-4417.patch
>
>
> Currently, shuffle fetches go over the wire in the clear. While Kerberos provides 
> comprehensive authentication for the cluster, it does not provide 
> confidentiality. 
> When processing sensitive data, confidentiality may be desired (at the expense 
> of job performance and resource utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
