[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2022-09-26 Thread forrest lv (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17609802#comment-17609802
 ] 

forrest lv commented on SPARK-26254:


nice job

> Move delegation token providers into a separate project
> ---
>
> Key: SPARK-26254
> URL: https://issues.apache.org/jira/browse/SPARK-26254
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.0.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Fix For: 3.0.0
>
>
> There was a discussion in 
> [PR#22598|https://github.com/apache/spark/pull/22598] that there are several 
> provided dependencies inside core project which shouldn't be there (for ex. 
> hive and kafka). This jira is to solve this problem.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2019-01-04 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734393#comment-16734393
 ] 

Gabor Somogyi commented on SPARK-26254:
---

[~hyukjin.kwon] In my last comment right before the ping almost everything is 
clear; the one thing I still don't know is what the suggestion is for Kafka.




[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2019-01-04 Thread Hyukjin Kwon (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734004#comment-16734004
 ] 

Hyukjin Kwon commented on SPARK-26254:
--

[~gsomogyi], can you point out exactly which comment the discussion is about?




[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2019-01-04 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733956#comment-16733956
 ] 

Gabor Somogyi commented on SPARK-26254:
---

ping [~vanzin]




[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-13 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720633#comment-16720633
 ] 

Steve Loughran commented on SPARK-26254:


bq. There was concern about using ServiceLoader before, but if the interface 
being loaded is private to Spark, it's fine with me.

See HADOOP-15808 for that. If you have any class which declares a delegation 
token, but that class doesn't actually load (missing, transitive CNFE, etc.), 
and the jar containing that META-INF manifest gets onto the classpath of your 
resource manager, there goes your cluster as soon as the first job is 
submitted. Traumatic.
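
A minimal sketch of the defensive loading this warning argues for, under stated assumptions: `TokenProvider` is a hypothetical stand-in for a Spark-private provider interface, and the temporary `META-INF/services` file and the class name `com.example.MissingProvider` exist only to reproduce a registration that cannot be loaded. It is not the code from HADOOP-15808 or from Spark. The idea is to drive the `ServiceLoader` iterator by hand and skip any entry that fails, so one bad registration does not abort the whole lookup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

public class SafeProviderLoading {
    /** Hypothetical stand-in for a Spark-private provider interface. */
    public interface TokenProvider {
        String serviceName();
    }

    /** A provider that loads without problems. */
    public static class GoodProvider implements TokenProvider {
        @Override
        public String serviceName() {
            return "good";
        }
    }

    /** Returns every provider that can be instantiated and skips the ones that cannot. */
    public static List<TokenProvider> loadProviders(ClassLoader loader) {
        List<TokenProvider> providers = new ArrayList<>();
        Iterator<TokenProvider> it = ServiceLoader.load(TokenProvider.class, loader).iterator();
        while (true) {
            try {
                if (!it.hasNext()) {
                    break;
                }
                providers.add(it.next());
            } catch (ServiceConfigurationError | LinkageError e) {
                // A missing class or a broken transitive dependency ends up here.
                System.err.println("Skipping unloadable provider: " + e.getMessage());
            }
        }
        return providers;
    }

    /** Builds a class loader whose services file lists one missing and one valid provider. */
    public static ClassLoader loaderWithBrokenEntry() {
        try {
            Path root = Files.createTempDirectory("providers");
            Path services = Files.createDirectories(root.resolve("META-INF/services"));
            String entries = "com.example.MissingProvider\n" + GoodProvider.class.getName() + "\n";
            Files.write(services.resolve(TokenProvider.class.getName()),
                    entries.getBytes(StandardCharsets.UTF_8));
            return new URLClassLoader(new URL[] {root.toUri().toURL()},
                    SafeProviderLoading.class.getClassLoader());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (TokenProvider provider : loadProviders(loaderWithBrokenEntry())) {
            System.out.println("Loaded provider: " + provider.serviceName());
        }
    }
}
```

The `ServiceLoader` documentation only promises a best effort to continue after an error, so a loop like this reduces the blast radius of a bad jar but does not guarantee recovery.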




[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-13 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720241#comment-16720241
 ] 

Gabor Somogyi commented on SPARK-26254:
---

{quote}There was concern about using ServiceLoader before, but if the interface 
being loaded is private to Spark, it's fine with me.
{quote}
org.apache.spark.deploy.security.HadoopDelegationTokenProvider satisfies that 
condition.
{quote}Keep HDFS and HBase in core
{quote}
Clear, and I had the same idea.
{quote}move the Kafka one to some Kafka package
{quote}
You mean module, don't you?
 * If we move it inside core, the Kafka deps remain in core
 * If we move it to kafka-sql, DStreams will not be able to reach it

My suggestion is to create a module, something like kafka-token-provider, 
which kafka-sql (and later DStreams) can depend on.
{quote}the Hive one to the Hive module
{quote}
Clear, and I had the same idea. For example, a hive-token-provider module 
which extracts the ugly dependencies from core.




[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-12 Thread Marcelo Vanzin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719574#comment-16719574
 ] 

Marcelo Vanzin commented on SPARK-26254:


bq. loaded the providers with ServiceLoader 

If you're going to use that, then you probably don't need a new module. Keep 
HDFS and HBase in core, move the Kafka one to some Kafka package (which one 
TBD, especially if you want to support both dstreams and structured streaming), 
and the Hive one to the Hive module.

There was concern about using ServiceLoader before, but if the interface being 
loaded is private to Spark, it's fine with me.

My original idea was to move everything (renewer code et al) to a new module, 
and make core not have this feature at all; yarn, mesos and others would depend 
on this new module. But the above change might be simpler / better.




[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-07 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712843#comment-16712843
 ] 

Steve Loughran commented on SPARK-26254:


Maybe ask the Kafka people for opinions; [~jkreps] can probably nominate 
someone.

bq. token-providers provided dependency to kafka-sql project => It's kinda' 
weird but at the moment looks the least problematic

Probably makes sense then.




[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-06 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711516#comment-16711516
 ] 

Gabor Somogyi commented on SPARK-26254:
---

I've reached a state where a tradeoff has to be made, so I'm interested in 
opinions [~vanzin] [~ste...@apache.org]

I've created a project named token-providers which depends on core. With this 
I successfully extracted all the nasty hive + kafka dependencies, and all token 
providers are there. Then I loaded the providers with ServiceLoader, which also 
works fine. Finally I reached a point where the kafka-sql project expects a 
couple of things from KafkaTokenUtil, which is in token-providers now. Here is 
the list of problems:
{noformat}
[error] /Users/gaborsomogyi/spark/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala:31: object KafkaTokenUtil is not a member of package org.apache.spark.deploy.security
[error] import org.apache.spark.deploy.security.KafkaTokenUtil
[error]^
[error] /Users/gaborsomogyi/spark/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSecurityHelper.scala:25: object KafkaTokenUtil is not a member of package org.apache.spark.deploy.security
[error] import org.apache.spark.deploy.security.KafkaTokenUtil
[error]^
[error] /Users/gaborsomogyi/spark/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSecurityHelper.scala:32: not found: value KafkaTokenUtil
[error]   KafkaTokenUtil.TOKEN_SERVICE) != null
[error]   ^
[error] /Users/gaborsomogyi/spark/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSecurityHelper.scala:37: not found: value KafkaTokenUtil
[error]   KafkaTokenUtil.TOKEN_SERVICE)
[error]   ^
[error] /Users/gaborsomogyi/spark/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala:566: not found: value KafkaTokenUtil
[error]   if (KafkaTokenUtil.isGlobalJaasConfigurationProvided) {
[error]   ^

+ all isTokenAvailable tests in KafkaSecurityHelperSuite expect TOKEN_KIND, 
TOKEN_SERVICE and KafkaDelegationTokenIdentifier
{noformat}

Here I see these possibilities:
* Hardcode TOKEN_KIND + TOKEN_SERVICE and duplicate 
isGlobalJaasConfigurationProvided => The drawback here is we can't really test 
whether the token created by the provider can be read in kafka-sql (we actually 
can, but only with hardcoded strings on both sides, which makes it brittle)
* As we're loading providers with ServiceLoader, the Kafka-related one can be 
moved to kafka-sql => The drawback is that providers are spread around and this 
code can't really be reused in DStreams.
* Add token-providers provided dependency to kafka-sql project => It's kinda' 
weird but at the moment looks the least problematic

Waiting on opinions...
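
The brittleness of the first option can be sketched as follows. This is a minimal illustration, not Spark code: `Credentials` is a hypothetical stand-in for a credentials store, `TokenNames` is a hypothetical shared holder, and the constant value is a placeholder rather than the real `KafkaTokenUtil.TOKEN_SERVICE`. When producer and consumer share one constant the lookup is tied to the write; when the consumer carries its own hardcoded copy, a single typo makes the token invisible without any compile error.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedTokenConstants {
    /** Hypothetical shared constants; the value is a placeholder, not Spark's real one. */
    public static final class TokenNames {
        public static final String TOKEN_SERVICE = "example.kafka.token.service";

        private TokenNames() {
        }
    }

    /** Hypothetical stand-in for a credentials store keyed by service name. */
    public static final class Credentials {
        private final Map<String, byte[]> tokens = new HashMap<>();

        public void addToken(String service, byte[] token) {
            tokens.put(service, token);
        }

        public byte[] getToken(String service) {
            return tokens.get(service);
        }
    }

    /** Provider side: stores the obtained token under the shared service name. */
    public static void obtainToken(Credentials creds) {
        creds.addToken(TokenNames.TOKEN_SERVICE, new byte[] {1, 2, 3});
    }

    /** Consumer side: looks the token up through the same shared constant. */
    public static boolean isTokenAvailable(Credentials creds) {
        return creds.getToken(TokenNames.TOKEN_SERVICE) != null;
    }

    /** Consumer side with its own hardcoded copy of the name, deliberately mistyped. */
    public static boolean isTokenAvailableHardcoded(Credentials creds) {
        return creds.getToken("example.kafka.token.servce") != null;
    }

    public static void main(String[] args) {
        Credentials creds = new Credentials();
        obtainToken(creds);
        System.out.println("shared constant finds token: " + isTokenAvailable(creds));
        System.out.println("hardcoded copy finds token: " + isTokenAvailableHardcoded(creds));
    }
}
```

Sharing the constant requires the consumer to depend on whatever module defines it, which is the dependency question the second and third options are weighing.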





[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-04 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708542#comment-16708542
 ] 

Gabor Somogyi commented on SPARK-26254:
---

The HBase libs are not in provided scope in core's pom.xml, but the provider 
should move.





[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-03 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707679#comment-16707679
 ] 

Steve Loughran commented on SPARK-26254:


+HBase

I don't have any opinions on the best place; people who know the Spark 
packaging are the ones to ask. And people deploying to infrastructures other 
than YARN will have their opinions too.

Token loading can be fairly brittle to classpath problems (HADOOP-15808); it's 
good not to trust everything to be well-configured.

> Move delegation token providers into a separate project
> ---
>
> Key: SPARK-26254
> URL: https://issues.apache.org/jira/browse/SPARK-26254
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Gabor Somogyi
>Priority: Major
>
> There was a discussion in 
> [PR#22598|https://github.com/apache/spark/pull/22598] that there are several 
> provided dependencies inside core project which shouldn't be there (for ex. 
> hive and kafka). This jira is to solve this problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-03 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706970#comment-16706970
 ] 

Gabor Somogyi commented on SPARK-26254:
---

I've created this jira to discuss the details.
cc [~vanzin] [~steveloughran]

I've taken a look at the code and I see mainly two problematic library sets: 
hive and kafka.
So these should be moved, along with all the delegation token providers. What 
do you think, guys?
Any thoughts welcome.

