[ 
https://issues.apache.org/jira/browse/SPARK-12312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176342#comment-17176342
 ] 

John Lonergan commented on SPARK-12312:
---------------------------------------

Was this Oracle?

We experienced this problem when running our jobs on YARN but not locally.
The problem stems from the fact that Spark (and also Flink) runs the
application code in a context where a Kerberos principal has already been
set up to talk to Hadoop.
When the Oracle JDBC driver comes along, it has what is in my opinion a bug:
instead of creating a new principal for the database connection, it
reuses the ambient principal used for Hadoop.
If you turn on Kerberos/ojdbc tracing you can see the Hadoop principal being
used against the database.

I believe Microsoft's driver also used to suffer from this, but it was fixed.

There is worse to come, however. If you use Oracle JDBC with Kerberos, you
may notice that once Oracle has made its connection, the JAAS config in
your app is trashed. Another bug in Oracle, imho. Again, I believe Microsoft
used to do the same.
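
As a rough sketch of guarding against the trashed JAAS config, the snippet
below snapshots the JVM-wide JAAS Configuration before running an action and
puts it back afterwards. The class name JaasConfigGuard and the method
preserving are hypothetical names made up for illustration; only the
javax.security.auth.login.Configuration calls are standard JDK API. It
assumes the corruption happens via the process-wide Configuration, which may
not cover everything a given driver version does.

```java
import java.util.concurrent.Callable;
import javax.security.auth.login.Configuration;

// Hypothetical helper: restores the JAAS Configuration after an action runs.
final class JaasConfigGuard {
    private JaasConfigGuard() {}

    static <T> T preserving(Callable<T> action) throws Exception {
        Configuration before;
        try {
            // Snapshot whatever configuration is currently installed.
            before = Configuration.getConfiguration();
        } catch (SecurityException e) {
            // No default configuration could be loaded; restoring null
            // lets the JVM reload its default lazily on the next lookup.
            before = null;
        }
        try {
            // The action would typically open the JDBC connection.
            return action.call();
        } finally {
            // Undo any replacement made while the action was running.
            Configuration.setConfiguration(before);
        }
    }
}
```

Usage would look something like
JaasConfigGuard.preserving(() -> DriverManager.getConnection(url, props)).
The Configuration is global to the JVM, so other threads can still observe
the replaced config while the action is in flight.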

One solution is to use a single principal for both Hadoop and Oracle. If
you can't do that, you may need to create your own Oracle driver wrapper
that compensates for the ambient Hadoop principal, allowing Oracle to
proceed with the intended identity. If you do the latter, you will likely
also end up remediating the JAAS corruption issue.
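
To give a feel for what such a wrapper might do, here is a minimal sketch
that logs in a separate database principal from a keytab and opens the JDBC
connection under that Subject rather than the ambient Hadoop one. The class
DedicatedPrincipalConnector, its method names, and the "db-login" entry name
are hypothetical; the JAAS classes and the Krb5LoginModule options are
standard JDK ones. Whether a particular ojdbc version actually honours the
Subject supplied this way is an assumption you would need to verify with
tracing. Note that Subject.doAs is deprecated in recent JDKs.

```java
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import javax.security.auth.Subject;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

// Hypothetical wrapper: connects with a dedicated database principal.
final class DedicatedPrincipalConnector {
    private DedicatedPrincipalConnector() {}

    // Builds an in-memory JAAS config so the global one is never touched.
    static Configuration keytabConfig(String principal, String keytabPath) {
        Map<String, String> opts = new HashMap<>();
        opts.put("useKeyTab", "true");
        opts.put("keyTab", keytabPath);
        opts.put("principal", principal);
        opts.put("storeKey", "true");
        opts.put("doNotPrompt", "true");
        opts.put("useTicketCache", "false"); // ignore the ambient ticket cache
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED, opts);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] { entry };
            }
        };
    }

    static Connection connect(String url, Properties props,
                              String principal, String keytabPath)
            throws LoginException, SQLException {
        // Log in the database principal using the private configuration.
        LoginContext login = new LoginContext(
                "db-login", null, null, keytabConfig(principal, keytabPath));
        login.login();
        Subject dbSubject = login.getSubject();
        try {
            // Open the connection with the database Subject in scope.
            return Subject.doAs(dbSubject,
                    (PrivilegedExceptionAction<Connection>)
                            () -> DriverManager.getConnection(url, props));
        } catch (PrivilegedActionException e) {
            Exception cause = e.getException();
            if (cause instanceof SQLException) {
                throw (SQLException) cause;
            }
            throw new SQLException("connection failed", cause);
        }
    }
}
```

Passing the Configuration straight to LoginContext avoids editing the
process-wide JAAS config, which is why a wrapper along these lines would
probably sidestep the corruption issue as well.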

I expect many folks never encounter these issues unless they are in
corporate environments where a different principal for each resource is a
common pattern.

Cheers, John




> JDBC connection to Kerberos secured databases fails on remote executors
> -----------------------------------------------------------------------
>
>                 Key: SPARK-12312
>                 URL: https://issues.apache.org/jira/browse/SPARK-12312
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.2, 2.4.2
>            Reporter: nabacg
>            Priority: Minor
>
> When loading DataFrames from JDBC datasource with Kerberos authentication, 
> remote executors (yarn-client/cluster etc. modes) fail to establish a 
> connection due to lack of Kerberos ticket or ability to generate it. 
> This is a real issue when trying to ingest data from kerberized data sources 
> (SQL Server, Oracle) in enterprise environment where exposing simple 
> authentication access is not an option due to IT policy issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
