[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-20 Thread Kostas Sakellis (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200236#comment-15200236
 ] 

Kostas Sakellis commented on SPARK-13877:
-

Why have we not more seriously considered Spark subprojects? I think it makes 
more sense than pull this functionality out of Spark completely. It will give 
these modules their own release trains and have a good separation from core 
spark while under a software governance model that we all understand. 

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Kostas Sakellis (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200252#comment-15200252
 ] 

Kostas Sakellis commented on SPARK-13877:
-

How is this any different than creating a random repo anywhere else? 

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200306#comment-15200306
 ] 

Hari Shreedharan commented on SPARK-13877:
--

You could have separate repos and separate releases, and keep the same package 
names simply by doing sub-projects. Can you explain what the overhead is and 
what tools you are concerned about? 

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200249#comment-15200249
 ] 

Reynold Xin commented on SPARK-13877:
-

Seems really high overhead. Might as well just keep it in that case.


> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200253#comment-15200253
 ] 

Reynold Xin commented on SPARK-13877:
-

"Overhead". Tools are much better outside the ASF for maintaining a lot of the 
smaller projects.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200088#comment-15200088
 ] 

Reynold Xin commented on SPARK-13877:
-

I'm not 100% sure it is a good idea to move it out (it might be), but I'm 
definitely against renaming the package name.

Given Kafka is one of the most widely used streaming sources in 1.x, moving 
this out and changing the package name means we are breaking almost every spark 
streaming app out there when they upgrade to 2.x. It seems like we are doing 
all these extra work just to break things.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200188#comment-15200188
 ] 

Mark Grover commented on SPARK-13877:
-

Yeah, that totally makes sense. I agree that it's a big change but I also think 
we can't really keep the same package name if this code moves out of Apache 
Spark.

So should we mark this as Won't Fix then?

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200152#comment-15200152
 ] 

Cody Koeninger commented on SPARK-13877:


Thumbs down on renaming the package name as well... from a practical point of 
view, we may need things to be in the same package hierarchy because of access 
modifiers.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-18 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200079#comment-15200079
 ] 

Mark Grover commented on SPARK-13877:
-

I am guessing this needs to be done before Spark 2.0 code freeze. Also, if we 
are moving this to be outside of Spark, it's not a part of Apache Spark project 
any more, so in my opinion, we should be updating the maven coordinates and 
package names to be something like {{org.spark-packages.*}}. I am happy to 
volunteer to make those changes, unless someone has objection.

But, I think it's a big enough change for our end users that we should have a 
dev@ vote thread on this. Also, we need to come up with a who can commit code 
to this external repo. All Spark committers seems like a safe choice to begin 
with but could be expand later on. Thoughts?

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196805#comment-15196805
 ] 

Hari Shreedharan commented on SPARK-13877:
--

[~c...@koeninger.org] - Sure. I agree with having one or more repos - each 
building against a set of compatible APIs. My point is whatever the case be - 
it is more flexible to do that outside Spark than have multiple codebases 
inside.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195973#comment-15195973
 ] 

Cody Koeninger commented on SPARK-13877:


[~hshreedharan] They aren't compatible from an api / feature point of view.  
Reason I'm bringing it up now is because if we're talking about making new 
repos under https://github.com/spark-packages we need a kafka-08 and a 
kafka-beta (or whatever names) repo, not just one.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195897#comment-15195897
 ] 

Hari Shreedharan commented on SPARK-13877:
--

[~c...@koeninger.org] - As long as they are code-compatible, you could keep the 
same codebase. Of course, if they are not compatible from an API point of view 
or if new features require non-backward compatible code, it makes sense to keep 
different codebases.

But that is beyond the point. What I am trying to say is that moving Kafka out 
of Spark does allow the flexibility of having whatever Kafka versions that the 
"new" project wants and does not have to have the kafka and spark versions 
tightly coupled with each other.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195873#comment-15195873
 ] 

Mark Grover commented on SPARK-13877:
-

I am in support of taking the kafka integration out as well. However, in my 
mind, we should figure out the answer to the following questions before we do 
(some of these have already been aptly pointed out by Cody and Sean):
* Where will the code repo be located?
* Who would have access to commit code?
* How do we track issues there? Github Issues/PRs?
* Whose infrastructure would the test jobs run on?
* Where would the artifacts be released? Probably not on apache.org/dist. If 
not there, then where?

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195870#comment-15195870
 ] 

Cody Koeninger commented on SPARK-13877:


I don't think it makes sense to put kafka 0.8 and kafka 0.10 in the same 
project.  Better to just have 2 different codebases, with obviously different 
artifact names.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Hari Shreedharan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195843#comment-15195843
 ] 

Hari Shreedharan commented on SPARK-13877:
--

I think we could easily make this a sub-project and not necessarily release it 
with Spark itself. Moving all of these sub-projects out makes sense from a 
development point of view, but the biggest benefit (since the 
MQTT/Twitter/Flume modules see relatively few changes anyway) we get is if we 
can move the Kafka module out so it is independent of the Spark version and can 
be built against a matrix of spark and kafka (spark 1.6 x kafka 0.8, spark 2 x 
kafka 0.8, spark 2 x kafka 0.9 etc) versions without having to further 
complicate the Spark build.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195764#comment-15195764
 ] 

Sean Owen commented on SPARK-13877:
---

(Copying my comment from the other JIRA as further discussion really concerns 
this change.)

[Moving out the flume, MQTT modules, etc] is a little different in that it 
started as part of the ASF project. It has been removed from the project (OK), 
it's been forked and maintained by others outside the project (OK), and nobody 
has now less access to it (i.e. I assume any committer would be added as a 
project member if they wanted to). We have to make sure it's not presented from 
official docs as still part of Spark, and can't release it together in a way 
that suggests it's official (I assume we won't).

We also have to be careful this doesn't add up to appearing to take a part of a 
community project "private". Those modules are so ancillary that I can't 
imagine it's controversial. It's going to be more of an issue for the Kafka 
integration (the topic here).

It's worth asking: do these concerns (which are real and painful) mean a 
separate project is the answer, or just several modules?

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195481#comment-15195481
 ] 

Sean Owen commented on SPARK-13877:
---

Yes that's a key question. If the subprojects are moving out of ASF territory, 
they become not part of the project (and in fact can't be presented as official 
project artifacts). They can be managed by whoever wants to self-organize to 
tend them.

That makes sense for the fairly unused modules like MQTT for example. Kafka 
integration is a bigger deal. [~zsxwing] what was your intent?

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-15 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195435#comment-15195435
 ] 

Cody Koeninger commented on SPARK-13877:


I agree that it's a good idea to move everything that's currently in /external 
out to separate repos.  

I'm not clear on how management of those repos is going to work - is it still 
going through spark jira, spark committers etc?

Also, the linked jira SPARK-13843 doesn't seem to have made any changes to 
docs.  Seems like that would be necessary.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-14 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194653#comment-15194653
 ] 

Saisai Shao commented on SPARK-13877:
-

I agreed to move this out, so it could easily support different versions. 
Currently if we want to introduce Kafka 0.9 supports, either dropping the 0.8 
supports or maintaining two modules, either way is not  so elegant. Maintaining 
it out of Spark would be a good choice.

> Consider removing Kafka modules from Spark / Spark Streaming
> 
>
> Key: SPARK-13877
> URL: https://issues.apache.org/jira/browse/SPARK-13877
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Streaming
>Affects Versions: 1.6.1
>Reporter: Hari Shreedharan
>
> Based on the discussion the PR for SPARK-13843 
> ([here|https://github.com/apache/spark/pull/11672#issuecomment-196553283]), 
> we should consider moving the Kafka modules out of Spark as well. 
> Providing newer functionality (like security) has become painful while 
> maintaining compatibility with older versions of Kafka. Moving this out 
> allows more flexibility, allowing users to mix and match Kafka and Spark 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org