Re: Question regarding Spark's new Internal authentication mechanism

2017-07-20 Thread Udit Mehrotra
Hi Marcelo,

Thanks for looking into it. I have opened a JIRA for this:

https://issues.apache.org/jira/browse/SPARK-21494

And yes, it works fine with the internal shuffle service. But our system has
the external shuffle service and dynamic allocation configured by default,
and we wanted to try switching from the standard SASL/3DES authentication to
the new AES-based mechanism.

Thanks!



Re: Question regarding Spark's new Internal authentication mechanism

2017-07-20 Thread Marcelo Vanzin
Also, things seem to work with all your settings if you disable use of
the shuffle service (which also means no dynamic allocation), if that
helps you make progress in what you wanted to do.
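
A minimal spark-defaults.conf sketch of that workaround (standard Spark 2.x property names; the values are illustrative, not a verified configuration):

```properties
# Turn off the external shuffle service, which also requires
# disabling dynamic allocation:
spark.shuffle.service.enabled     false
spark.dynamicAllocation.enabled   false
# The new AES-based auth still applies to Spark's internal connections:
spark.authenticate                true
spark.network.crypto.enabled      true
spark.network.crypto.saslFallback false
```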




-- 
Marcelo

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Question regarding Spark's new Internal authentication mechanism

2017-07-20 Thread Marcelo Vanzin
Hmm... I tried this with the new shuffle service (I generally have an
old one running) and also see failures. I also noticed some odd things
in your logs that I'm also seeing in mine, but it's better to track
these in a bug instead of e-mail.

Please file a bug and attach your logs there, I'll take a look at this.




-- 
Marcelo




Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Marcelo Vanzin
Hmm... that's not enough info and logs are intentionally kept silent
to avoid flooding, but if you enable DEBUG level logging for
org.apache.spark.network.crypto in both YARN and the Spark app, that
might provide more info.




-- 
Marcelo




Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Udit Mehrotra
So I added these settings in yarn-site.xml as well. Now I get a completely
different error, but at least it seems like it is using the crypto library:

ExecutorLostFailure (executor 1 exited caused by one of the running tasks)
Reason: Unable to create executor due to Unable to register with external
shuffle server due to : java.lang.IllegalArgumentException: Authentication
failed.
at
org.apache.spark.network.crypto.AuthRpcHandler.receive(AuthRpcHandler.java:125)
at
org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:157)
at
org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:105)
at
org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)

Any clue about this?




Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Udit Mehrotra
So I am using the standard documentation here for configuring the external
shuffle service:

https://spark.apache.org/docs/latest/running-on-yarn.html#configuring-the-external-shuffle-service

I can see that the following jar is present on the cluster as well:
/usr/lib/spark/yarn/spark-2.2.0-yarn-shuffle.jar

Is there any additional configuration I need for external shuffle besides
setting the following:
spark.network.crypto.enabled true
spark.network.crypto.saslFallback false
spark.authenticate   true
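
For reference, the registration described on that docs page amounts to a yarn-site.xml fragment along these lines (a sketch based on the linked documentation; the set of existing aux-services may differ per cluster):

```xml
<!-- yarn-site.xml on each NodeManager: register Spark's shuffle service -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```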



Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Marcelo Vanzin
On Wed, Jul 19, 2017 at 1:10 PM, Udit Mehrotra
 wrote:
> Is there any additional configuration I need for external shuffle besides
> setting the following:
> spark.network.crypto.enabled true
> spark.network.crypto.saslFallback false
> spark.authenticate   true

Have you set these options on the shuffle service configuration too
(which is the YARN xml config file, not spark-defaults.conf)?

If you have there might be an issue, and you should probably file a
bug and include your NM's log file.
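
Concretely, a sketch of what that looks like: the YARN shuffle service reads its settings from the NodeManager's Hadoop configuration, so the same Spark property names go into yarn-site.xml (the exact placement is assumed from this discussion; verify against your distribution):

```xml
<!-- yarn-site.xml: picked up by the external shuffle service -->
<property>
  <name>spark.authenticate</name>
  <value>true</value>
</property>
<property>
  <name>spark.network.crypto.enabled</name>
  <value>true</value>
</property>
<property>
  <name>spark.network.crypto.saslFallback</name>
  <value>false</value>
</property>
```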

-- 
Marcelo




Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Marcelo Vanzin
Well, how did you install the Spark shuffle service on YARN? It's not
part of YARN.

If you really have the Spark 2.2 shuffle service jar deployed in your
YARN service, then perhaps you didn't configure it correctly to use
the new auth mechanism.




-- 
Marcelo




Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Udit Mehrotra
Sorry about that. Will keep the list in my replies.

So, just to clarify, I am not using an older version of Spark's shuffle
service. This is a brand new cluster with just Spark 2.2.0 installed
alongside Hadoop 2.7.3. Could there be anything else I am missing, or
anything I can try differently?


Thanks!




Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Marcelo Vanzin
Please include the list on your replies, so others can benefit from
the discussion too.

On Wed, Jul 19, 2017 at 11:43 AM, Udit Mehrotra
 wrote:
> Hi Marcelo,
>
> Thanks a lot for confirming that. Can you explain what you mean by upgrading
> the version of shuffle service ? Wont it automatically use the corresponding
> class from spark 2.2.0 to start the external shuffle service ?

That depends on how you deploy your shuffle service. Normally YARN
will have no idea that your application is using a new Spark - it will
still have the old version of the service jar in its classpath.


-- 
Marcelo




Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Udit Mehrotra
Hi Spark devs,

I am trying out Spark’s new internal authentication mechanism based on AES
encryption, https://issues.apache.org/jira/browse/SPARK-19139, which was
introduced in Spark 2.2.0.

I set the following properties in my spark-defaults:
spark.network.crypto.enabled true
spark.network.crypto.saslFallback false
spark.authenticate   true

This seems to work fine with Spark’s internal shuffle service. However, when
I try it with YARN’s external shuffle service, the executors are unable to
register with the shuffle service, as it still expects SASL authentication.
Here is the error I get:

ExecutorLostFailure (executor 42 exited caused by one of the running tasks)
Reason: Unable to create executor due to Unable to register with external
shuffle server due to : java.lang.IllegalStateException: Expected
SaslMessage, received something else (maybe your client does not have SASL
enabled?)
at
org.apache.spark.network.sasl.SaslMessage.decode(SaslMessage.java:69)
at
org.apache.spark.network.sasl.SaslRpcHandler.receive(SaslRpcHandler.java:89)
at
org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:157)
at
org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:105)
at
org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118)

Can someone confirm that this is expected behavior, or provide some guidance
on how I can make it work with the external shuffle service?

Note: If I set ‘spark.network.crypto.saslFallback’ to true, the job runs
fine with the external shuffle service as well, since it falls back to SASL
authentication.

Thank you for your help.


Re: Question regarding Spark's new Internal authentication mechanism

2017-07-19 Thread Marcelo Vanzin
On Wed, Jul 19, 2017 at 11:19 AM, Udit Mehrotra
 wrote:
> spark.network.crypto.saslFallback false
> spark.authenticate   true
>
> This seems to work fine with internal shuffle service of Spark. However,
> when in I try it with Yarn’s external shuffle service the executors are
> unable to register with the shuffle service as it still expects SASL
> authentication. Here is the error I get:
>
> Can someone confirm that this is expected behavior? Or provide some
> guidance, on how I can make it work with external shuffle service ?

Yes, that's the expected behavior, since you disabled SASL fallback in
your configuration. If you set it back on, then you can talk to the
old shuffle service.

Or you could upgrade the version of the shuffle service running on
your YARN cluster so that it also supports the new auth mechanism.

-- 
Marcelo
