On 22 Oct 2015, at 19:32, Chester Chen
<[email protected]> wrote:
Steven
Your summary is mostly correct, but there are a couple of points I want to
emphasize.
Not every cluster has the Hive service enabled, so the YARN client
shouldn't try to get the Hive delegation token just because security mode is
enabled.
I agree, but it shouldn't be failing with a stack trace. Log, yes; fail, no.
The YARN client code can check whether the service is enabled (possibly by
checking whether the Hive metastore URI or other hive-site.xml elements are
present), as sketched below. If the Hive service is not enabled, then we don't
need to get a Hive delegation token, and hence we avoid the exception.
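Roughly, the check could look like this (a sketch only; treating a missing
hive.metastore.uris as "Hive not enabled" is my assumption, not existing Spark
behavior):

  import org.apache.hadoop.conf.Configuration

  // Sketch: skip the delegation-token fetch entirely when hive-site.xml
  // doesn't point at a metastore.
  def hiveServiceEnabled(conf: Configuration): Boolean =
    conf.getTrimmed("hive.metastore.uris", "").nonEmpty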
If we still try to get the Hive delegation token regardless of whether the
Hive service is enabled (as the current code does), then the code should still
launch the YARN container and Spark job, as the user could simply be running a
job against HDFS without touching Hive. Of course, accessing Hive would then
fail.
That's exactly what should be happening: the token is only needed if the code
tries to talk to Hive. The problem is that the YARN client doesn't know whether
that's the case, so it tries every time. It shouldn't be failing, though.
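Something like the sketch below, where fetchHiveToken stands in for the
existing reflection-based code (and println for Spark's own logWarning):

  import org.apache.hadoop.security.Credentials
  import scala.util.control.NonFatal

  // Hypothetical wrapper: a failure to obtain the token is logged and
  // swallowed, so an HDFS-only job still launches; only an actual Hive
  // access would fail later.
  def addHiveTokenIfPossible(creds: Credentials)(fetchHiveToken: Credentials => Unit): Unit =
    try fetchHiveToken(creds) catch {
      case NonFatal(e) =>
        println(s"WARN: could not obtain Hive delegation token, continuing: $e")
    }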
Created an issue to cover this; I'll see what reflection it takes. I'll also
pull the code out into a method that can be tested standalone: we shouldn't
have to wait for a run in UGI.isSecure() mode.
https://issues.apache.org/jira/browse/SPARK-11265
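For the testability point, the shape would be roughly the following; the
method name and behavior are illustrative, not the actual patch:

  import org.apache.hadoop.conf.Configuration

  // Illustrative: return None rather than throwing when Hive is absent or
  // unconfigured, so the logic is testable on an insecure, Hive-free
  // classpath.
  def obtainTokenForHiveMetastore(conf: Configuration): Option[String] = {
    if (conf.getTrimmed("hive.metastore.uris", "").isEmpty) {
      None
    } else {
      None // the reflection against the Hive classes would go here
    }
  }

  // e.g. in a plain unit test, no kerberized cluster required:
  // assert(obtainTokenForHiveMetastore(new Configuration(false)).isEmpty)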
Meanwhile, for the curious, these slides include an animation of what goes on
when a YARN app is launched in a secure cluster, to help explain why things
seem a bit complicated:
http://people.apache.org/~stevel/kerberos/2015-09-kerberos-the-madness.pptx
The third point is that I'm not sure why org.spark-project.hive's hive-exec
and org.apache.hadoop.hive's hive-exec behave differently for the same method.
Chester
On Thu, Oct 22, 2015 at 10:18 AM, Charmee Patel
<[email protected]> wrote:
A similar issue occurs when interacting with Hive secured by Sentry.
https://issues.apache.org/jira/browse/SPARK-9042
By changing how the HiveContext instance is created, this issue might also be
resolved.
On Thu, Oct 22, 2015 at 11:33 AM Steve Loughran
<[email protected]> wrote:
On 22 Oct 2015, at 08:25, Chester Chen
<[email protected]> wrote:
Doug
We are not trying to compile against a different version of Hive. The
1.2.1.spark hive-exec is specified in the Spark 1.5.2 POM file. We are moving
from Spark 1.3.1 to 1.5.1 and simply trying to supply the needed dependency.
The rest of the application (besides Spark) uses Hive 0.13.1.
Yes, we are using the YARN client directly; many of the functions we need and
have modified are not provided by it. The Spark launcher in its current form
does not satisfy our requirements (at least the last time I looked at it);
there is a discussion thread from several months ago.
From Spark 1.x to 1.3.1, we forked the YARN client to achieve these goals
(YARN listener callbacks, killApplication, YARN capacity callbacks, etc.). In
the current integration for 1.5.1, to avoid forking Spark, we simply subclass
the YARN client and override a few methods, roughly as sketched below. But we
lost the resource-capacity callback and estimation by doing this.
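The constructor and method shapes below are from memory of the 1.5.x Client
and may not be exact; since Client is private[spark], the subclass has to
live under a matching package:

  package org.apache.spark.deploy.yarn

  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.yarn.api.records.ApplicationId
  import org.apache.spark.SparkConf

  // Sketch: subclass instead of fork, overriding just enough to get an
  // application-id callback for our own monitoring/kill handling.
  class CallbackYarnClient(
      args: ClientArguments,
      hadoopConf: Configuration,
      sparkConf: SparkConf,
      onSubmit: ApplicationId => Unit)
    extends Client(args, hadoopConf, sparkConf) {

    override def submitApplication(): ApplicationId = {
      val appId = super.submitApplication()
      onSubmit(appId) // hook for listener callbacks / killApplication
      appId
    }
  }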
This is a bit off the original topic.
I still think there is a bug in the Spark YARN client in the case of
Kerberos + the Spark hive-exec dependency.
Chester
I think I understand what's being implied here.
1. In a secure cluster, a Spark app needs a Hive delegation token to talk
to Hive.
2. The Spark YARN client (org.apache.spark.deploy.yarn.Client) uses reflection
to get the delegation token.
3. The reflection doesn't work, and a ClassNotFoundException is logged.
4. The app should still launch, but it will be without a Hive token, so
attempting to work with Hive will fail.
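For reference, once a token string does come back (step 2), wiring it into the
launch context looks roughly like this; the alias and identifier class are my
assumptions from the usual delegation-token plumbing:

  import org.apache.hadoop.io.Text
  import org.apache.hadoop.security.Credentials
  import org.apache.hadoop.security.token.Token
  import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier

  // Sketch: decode the metastore's url-safe token string and add it to the
  // Credentials shipped with the YARN container launch context.
  def addHiveToken(tokenStr: String, creds: Credentials): Unit = {
    val token = new Token[AbstractDelegationTokenIdentifier]()
    token.decodeFromUrlString(tokenStr)
    creds.addToken(new Text("hive.server2.delegation.token"), token)
  }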
I haven't seen this because, while I do test runs against a Kerberos cluster,
I wasn't talking to Hive from the deployed app.
It sounds like this workaround works because the Hive RPC protocol is
compatible enough with 0.13 that a 0.13 client can ask Hive for the token,
though your remote classpath is then stuck on 0.13.
Looking at the Hive class, the metastore has now made the Hive constructor
private and moved to a factory method (public static Hive get(HiveConf c)
throws HiveException) to get an instance. The reflection code would need to be
updated, roughly as below.
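i.e. something like the following; the getDelegationToken(owner, renewer)
signature is the 0.13/1.2-era metastore API, so treat that as an assumption:

  import org.apache.hadoop.security.UserGroupInformation

  // Sketch: load Hive reflectively so the YARN client builds without
  // hive-exec on its compile classpath, and go through the static factory
  // rather than the now-private constructor.
  val loader = Thread.currentThread().getContextClassLoader
  val hiveClass = loader.loadClass("org.apache.hadoop.hive.ql.metadata.Hive")
  val hiveConfClass = loader.loadClass("org.apache.hadoop.hive.conf.HiveConf")
  val hiveConf = hiveConfClass.newInstance().asInstanceOf[Object]
  val hive = hiveClass.getMethod("get", hiveConfClass).invoke(null, hiveConf)
  val user = UserGroupInformation.getCurrentUser.getUserName
  val tokenStr = hiveClass
    .getMethod("getDelegationToken", classOf[String], classOf[String])
    .invoke(hive, user, user)
    .asInstanceOf[String]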
I'll file a bug with my name next to it.