Re: Discourse: A proposed alternative to the Spark User list

2015-01-23 Thread Nicholas Chammas
That sounds good to me. Shall I open a JIRA / PR about updating the site
community page?
On Fri, Jan 23, 2015 at 4:37 AM, Patrick Wendell patr...@databricks.com
wrote:

 Hey Nick,

 So I think what we can do is encourage people to participate on the
 Stack Overflow topic, and this I think we can do on the Spark website
 as a first-class community resource for Spark. We should probably be
 spending more time on that site given its popularity.

 In terms of encouraging this explicitly *to replace* the ASF mailing
 list, that I think is harder to do. The ASF makes a lot of effort to
 host its own infrastructure that is neutral and not associated with
 any corporation. And by and large the ASF policy is to consider that
 as the de-facto forum of communication for any project.

 Personally, I wish the ASF would update this policy - for instance, by
 allowing the use of third party lists or communication fora - provided
 that they allow exporting the conversation if those sites were to
 change course. However, the state of the art stands as such.

 - Patrick

 On Wed, Jan 21, 2015 at 8:43 AM, Nicholas Chammas
 nicholas.cham...@gmail.com wrote:
  Josh / Patrick,
 
  What do y’all think of the idea of promoting Stack Overflow as a place to
  ask questions over this list, as long as the questions fit SO’s guidelines
  (how-to-ask, dont-ask)?
 
  The apache-spark tag is very active on there.
 
  Discussions of all types are still on-topic here, but when possible we want
  to encourage people to use SO.
 
  Nick
 
  On Wed Jan 21 2015 at 8:37:05 AM Jay Vyas jayunit100.apa...@gmail.com wrote:
 
  It's a very valid idea indeed, but... it's a tricky subject since the
  entire ASF is run on mailing lists, hence there are so many different but
  equally sound ways of looking at this idea, which conflict with one another.
 
   On Jan 21, 2015, at 7:03 AM, btiernay btier...@hotmail.com wrote:
  
   I think this is a really great idea for really opening up the discussions
   that happen here. Also, it would be nice to know why there doesn't seem to
   be much interest. Maybe I'm misunderstanding some nuance of Apache projects.
  
   Cheers
  
  
  
   --
   View this message in context:
   http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21288.html
   Sent from the Apache Spark User List mailing list archive at Nabble.com.
  
   -
   To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
   For additional commands, e-mail: user-h...@spark.apache.org
  
 
 
 



Re: Discourse: A proposed alternative to the Spark User list

2015-01-23 Thread Gerard Maas
+1

On Fri, Jan 23, 2015 at 5:58 PM, Nicholas Chammas 
nicholas.cham...@gmail.com wrote:

 That sounds good to me. Shall I open a JIRA / PR about updating the site
 community page?




Re: Discourse: A proposed alternative to the Spark User list

2015-01-23 Thread Nicholas Chammas
https://issues.apache.org/jira/browse/SPARK-5390

On Fri Jan 23 2015 at 12:05:00 PM Gerard Maas gerard.m...@gmail.com wrote:

 +1

 
 





Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Petar Zecevic


Ok, thanks for the clarifications. I didn't know this list has to remain 
as the only official list.


Nabble is really not the best solution in the world, but we're stuck 
with it, I guess.


That's it from me on this subject.

Petar


On 22.1.2015. 3:55, Nicholas Chammas wrote:


I think a few things need to be laid out clearly:

 1. This mailing list is the “official” user discussion platform. That
is, it is sponsored and managed by the ASF.
 2. Users are free to organize independent discussion platforms
focusing on Spark, and there is already one such platform in Stack
Overflow under the apache-spark and related tags. Stack Overflow
works quite well.
 3. The ASF will not agree to deprecating or migrating this user list
to a platform that they do not control.
 4. This mailing list has grown to an unwieldy size and discussions
are hard to find or follow; discussion tooling is also lacking. We
want to improve the utility and user experience of this mailing list.
 5. We don’t want to fragment this “official” discussion community.
 6. Nabble is an independent product not affiliated with the ASF. It
offers a slightly better interface to the Apache mailing list
archives.

So to respond to some of your points, pzecevic:

Apache user group could be frozen (not accepting new questions, if
that’s possible) and redirect users to Stack Overflow (automatic
reply?).

From what I understand of the ASF’s policies, this is not possible. :( 
This mailing list must remain the official Spark user discussion platform.


Other thing, about new Stack Exchange site I proposed earlier. If
a new site is created, there is no problem with guidelines, I
think, because Spark community can apply different guidelines for
the new site.

I think Stack Overflow and the various Spark tags are working fine. I 
don’t see a compelling need for a Stack Exchange dedicated to Spark, 
either now or in the near future. Also, I doubt a Spark-specific site 
can pass the 4 tests in the Area 51 FAQ 
http://area51.stackexchange.com/faq:


  * Almost all Spark questions are on-topic for Stack Overflow
  * Stack Overflow already exists, it already has a tag for Spark, and
nobody is complaining
  * You’re not creating such a big group that you don’t have enough
experts to answer all possible questions
  * There’s a high probability that users of Stack Overflow would
enjoy seeing the occasional question about Spark

I think complaining won’t be sufficient. :)

Someone expressed a concern that they won’t allow creating a
project-specific site, but there already exist some
project-specific sites, like Tor, Drupal, Ubuntu…

The communities for these projects are many, many times larger than 
the Spark community is or likely ever will be, simply due to the 
nature of the problems they are solving.


What we need is an improvement to this mailing list. We need better 
tooling than Nabble to sit on top of the Apache archives, and we also 
need some way to control the volume and quality of mail on the list so 
that it remains a useful resource for the majority of users.


Nick

On Wed Jan 21 2015 at 3:13:21 PM pzecevic petar.zece...@gmail.com wrote:


Hi,
I tried to find the last reply by Nick Chammas (that I received in the
digest) using the Nabble web interface, but I cannot find it (perhaps he
didn't reply directly to the user list?). That's one example of Nabble's
usability.

Anyhow, I wanted to add my two cents...

Apache user group could be frozen (not accepting new questions, if that's
possible) and redirect users to Stack Overflow (automatic reply?). Old
questions remain (and are searchable) on Nabble, new questions go to Stack
Exchange, so no need for migration. That's the idea, at least, as I'm not
sure if that's technically doable... Is it?
dev mailing list could perhaps stay on Nabble (it's not that busy), or have
a special tag on Stack Exchange.

Other thing, about new Stack Exchange site I proposed earlier. If a new site
is created, there is no problem with guidelines, I think, because Spark
community can apply different guidelines for the new site.

There is a FAQ about creating new sites:
http://area51.stackexchange.com/faq
It says: Stack Exchange sites are free to create and free to use. All we
ask is that you have an enthusiastic, committed group of expert users who
check in regularly, asking and answering questions.
I think this requirement is satisfied...
Someone expressed a concern that they won't allow creating a
project-specific site, but there already exist some project-specific sites,
like Tor, Drupal, Ubuntu...

Later, though, the FAQ also says:
If Y already exists, it already has a tag for X, and nobody is complaining
(then you should not create a new 

Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Petar Zecevic


But voting is done on dev list, right? That could stay there...

Overlay might be a fine solution, too, but that still gives two user 
lists (SO and Nabble+overlay).



On 22.1.2015. 10:42, Sean Owen wrote:


Yes, there is some project business, like votes of record on releases, 
that needs to be carried on in a standard, simple, accessible place, and 
SO is not at all suitable.


Nobody is stuck with Nabble. The suggestion is to enable a different 
overlay on the existing list. SO remains a place you can ask questions 
too. So I agree with Nick's take.


BTW are there perhaps plans to split this mailing list into 
subproject-specific lists? That might also help tune in/out the subset 
of conversations of interest.



Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Sean Owen
Yes, there is some project business, like votes of record on releases, that
needs to be carried on in a standard, simple, accessible place, and SO is not
at all suitable.

Nobody is stuck with Nabble. The suggestion is to enable a different
overlay on the existing list. SO remains a place you can ask questions too.
So I agree with Nick's take.

BTW are there perhaps plans to split this mailing list into
subproject-specific lists? That might also help tune in/out the subset of
conversations of interest.
On Jan 22, 2015 10:30 AM, Petar Zecevic petar.zece...@gmail.com wrote:


 Ok, thanks for the clarifications. I didn't know this list has to remain
 as the only official list.

 Nabble is really not the best solution in the world, but we're stuck with
 it, I guess.

 That's it from me on this subject.

 Petar



Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Gerard Maas
I've been contributing to SO for a while now. Here are a few
observations I'd like to contribute to the discussion:

The level of questions on SO is often more entry-level. Harder
questions (that require expertise in a certain area) remain unanswered for
a while. The same questions here on the list (as they are often cross-posted)
receive faster turnaround.
Roughly speaking, there are two groups of questions: "Implementing things on
Spark" and "Running Spark". The second one is borderline under SO guidelines,
as such questions often involve cluster setups, long logs and little idea of
what's going on (mind you, often those questions come from people starting
with Spark).

In my opinion, Stack Overflow offers a better Q/A experience; in
particular, it has tooling in place to reduce duplicates, something that
often overloads this list (the same getting-started issues, or how to map,
filter, flatmap, over and over again). That said, this list offers a
richer forum, where the expertise pool is a lot deeper.
Also, while SO is fairly strict in requiring posters to show a minimal
amount of effort in the question being asked, this list is quite tolerant
of questions that don't. This is probably one element that makes the list
'lower impedance'.
One additional thing on SO is that the [apache-spark] tag is a 'low rep'
tag. Neither questions nor answers get significant voting, reducing the
'rep gaming' factor (discouraging participation?).

Thinking about how to improve both platforms, SO [apache-spark] and this ML,
and bring the list back to manageable message volumes, we could
implement some 'load balancing' policies:
- encourage new users to use Stack Overflow; in particular, redirect newbie
questions to SO the friendly way: "did you search SO already?" or a link to
an existing question.
  - most "how to map, flatmap, filter, aggregate, reduce, ..." questions
would fall under this category
- encourage domain experts to hang out on SO more often (my impression is
that MLlib and GraphX are fairly underserved)
- have an 'escalation process' in place, where we could post
'interesting/hard/bug' questions from SO back to the list (or encourage the
poster to do so)
- update our community guidelines on
[http://spark.apache.org/community.html] to implement such policies.

Those are just some ideas on how to improve the community and better serve
the newcomers while avoiding overload of our existing expertise pool.

kr, Gerard.


On Thu, Jan 22, 2015 at 10:42 AM, Sean Owen so...@cloudera.com wrote:

 Yes, there is some project business, like votes of record on releases, that
 needs to be carried on in a standard, simple, accessible place, and SO is not
 at all suitable.

 Nobody is stuck with Nabble. The suggestion is to enable a different
 overlay on the existing list. SO remains a place you can ask questions too.
 So I agree with Nick's take.

 BTW are there perhaps plans to split this mailing list into
 subproject-specific lists? That might also help tune in/out the subset of
 conversations of interest.

Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Nicholas Chammas
I agree with Sean that a Spark-specific Stack Exchange likely won't help
and almost certainly won't make it out of Area 51. The idea certainly
sounds nice from our perspective as Spark users, but it doesn't mesh with
the structure of Stack Exchange or the criteria for creating new sites.

On Thu Jan 22 2015 at 1:23:14 PM Sean Owen so...@cloudera.com wrote:

 FWIW I am a moderator for datascience.stackexchange.com, and even that
 hasn't really achieved the critical mass that SE sites are supposed
 to: http://area51.stackexchange.com/proposals/55053/data-science

 I think a Spark site would have a lot less traffic. One annoyance is
 that people can't figure out when to post on SO vs Data Science vs
 Cross Validated. A Spark site would have the same problem,
 fragmentation and cross posting with SO. I don't think this would be
 accepted as a StackExchange site and don't think it helps.

 On Thu, Jan 22, 2015 at 6:16 PM, pierred pie...@demartines.com wrote:
 
  A dedicated stackexchange site for Apache Spark sounds to me like the
  logical solution.  Less trolling, more enthusiasm, and with the
  participation of the people on this list, I think it would very quickly
  become the reference for many technical questions, as well as a great
  vehicle to promote the awesomeness of Spark.





Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Nicholas Chammas
we could implement some ‘load balancing’ policies:

I think Gerard’s suggestions are good. We need some “official” buy-in from
the project’s maintainers and heavy contributors and we should move forward
with them.

I know that at least Josh Rosen, Sean Owen, and Tathagata Das, who are
active on this list, are also active on SO
http://stackoverflow.com/tags/apache-spark/topusers. So perhaps we’re
already part of the way there.

Nick

On Thu Jan 22 2015 at 5:32:40 AM Gerard Maas gerard.m...@gmail.com wrote:

 I've have been contributing to SO for a while now.  Here're few
 observations I'd like to contribute to the discussion:

 The level of questions on SO is often of more entry-level. Harder
 questions (that require expertise in a certain area) remain unanswered for
 a while. Same questions here on the list (as they are often cross-posted)
 receive faster turnaround.
 Roughly speaking, there're two groups of questions: Implementing things on
 Spark and Running Spark.  The second one is borderline on SO guidelines as
 they often involve cluster setups, long logs and little idea of what's
 going on (mind you, often those questions come from people starting with
 Spark)

 In my opinion, Stack Overflow offers a better Q/A experience, in
 particular, they have tooling in place to reduce duplicates, something that
 often overloads this list (same getting started issues or how to map,
 filter, flatmap over and over again).  That said, this list offers a
 richer forum, where the expertise pool is a lot deeper.
 Also, while SO is fairly strict in requiring posters from showing a
 minimal amount of effort in the question being asked, this list is quite
 friendly to the same behavior. This could be probably an element that makes
 the list 'lower impedance'.
 One additional thing on SO is that the [apache-spark] tag is a 'low rep'
 tag. Neither questions nor answers get significant voting, reducing the
 'rep gaming' factor  (discouraging participation?)

 Thinking about how to improve both platforms: SO[apache-spark] and this
 ML, and get back the list to not overwhelming message volumes, we could
 implement some 'load balancing' policies:
 - encourage new users to use Stack Overflow, in particular, redirect
 newbie questions to SO the friendly way: did you search SO already? or
 link to an existing question.
   - most how to map, flatmap, filter, aggregate, reduce, ... would fall
 under  this category
 - encourage domain experts to hang out on SO more often (my impression is
 that MLlib and GraphX are fairly underserved)
 - have an 'escalation process' in place, where we could post
 'interesting/hard/bug' questions from SO back to the list (or encourage the
 poster to do so)
 - update our community guidelines on
 [http://spark.apache.org/community.html] to implement such policies.

 Those are just some ideas on how to improve the community and better serve
 the newcomers while avoiding overload of our existing expertise pool.

 kr, Gerard.


 On Thu, Jan 22, 2015 at 10:42 AM, Sean Owen so...@cloudera.com wrote:

 Yes, there is some project business, like votes of record on releases, that
 needs to be carried on in a standard, simple, accessible place, and SO is not
 at all suitable.

 Nobody is stuck with Nabble. The suggestion is to enable a different
 overlay on the existing list. SO remains a place you can ask questions too.
 So I agree with Nick's take.

 BTW are there perhaps plans to split this mailing list into
 subproject-specific lists? That might also help tune in/out the subset of
 conversations of interest.
 On Jan 22, 2015 10:30 AM, Petar Zecevic petar.zece...@gmail.com
 wrote:


 Ok, thanks for the clarifications. I didn't know this list has to remain
 as the only official list.

 Nabble is really not the best solution in the world, but we're stuck
 with it, I guess.

 That's it from me on this subject.

 Petar



Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Marcelo Vanzin
On Thu, Jan 22, 2015 at 10:21 AM, Sean Owen so...@cloudera.com wrote:
 I think a Spark site would have a lot less traffic. One annoyance is
 that people can't figure out when to post on SO vs Data Science vs
 Cross Validated.

Another is that a lot of the discussions we see on the Spark users
list would be closed very quickly at Stack Overflow. Long and abstract
discussions are generally a good recipe to get your question closed.
Which is an argument for why Discourse would be more appropriate, I
guess.

Finally, maybe I'm showing my age, but I really dislike having to
follow lots of different places. What would happen is that,
personally, I'd end up either ignoring any new discussion forum, or
just treating it like a mailing list and doing everything by e-mail.
Now get off my lawn.

-- 
Marcelo

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread pierred
Love it!

There is a reason why SO is so effective and popular.  Search is excellent,
you can quickly find very thoughtful answers about sometimes thorny
problems, and it is easy to contribute, format code, etc.  Perhaps the most
useful feature is that the best answers naturally bubble up to the top, so
these are the ones you see first.

One annoyance is the troll phenomenon, see e.g.
http://michael.richter.name/blogs/why-i-no-longer-contribute-to-stackoverflow
(which also mentions other pet peeves about SO).  That phenomenon is, IMHO,
most prevalent on Stack Overflow itself, perhaps less so on other
Stack Exchange sites.

At the same time, I do appreciate the pressure to provide well-written,
concise questions and answers that will hold up for posterity.  That peer
pressure is what, to a good extent, makes the material on SO so valuable and
useful.  It is probably a tricky balance to strike.

A dedicated stackexchange site for Apache Spark sounds to me like the
logical solution.  Less trolling, more enthusiasm, and with the
participation of the people on this list, I think it would very quickly
become the reference for many technical questions, as well as a great
vehicle to promote the awesomeness of Spark.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21321.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Discourse: A proposed alternative to the Spark User list

2015-01-22 Thread Sean Owen
FWIW I am a moderator for datascience.stackexchange.com, and even that
hasn't really achieved the critical mass that SE sites are supposed
to: http://area51.stackexchange.com/proposals/55053/data-science

I think a Spark site would have a lot less traffic. One annoyance is
that people can't figure out when to post on SO vs Data Science vs
Cross Validated. A Spark site would have the same problem,
fragmentation and cross posting with SO. I don't think this would be
accepted as a StackExchange site and don't think it helps.

On Thu, Jan 22, 2015 at 6:16 PM, pierred pie...@demartines.com wrote:

 A dedicated stackexchange site for Apache Spark sounds to me like the
 logical solution.  Less trolling, more enthusiasm, and with the
 participation of the people on this list, I think it would very quickly
 become the reference for many technical questions, as well as a great
 vehicle to promote the awesomeness of Spark.




Re: Discourse: A proposed alternative to the Spark User list

2015-01-21 Thread Nicholas Chammas
Josh / Patrick,

What do y’all think of the idea of promoting Stack Overflow as a place to
ask questions over this list, as long as the questions fit SO’s guidelines (
how-to-ask http://stackoverflow.com/help/how-to-ask, dont-ask
http://stackoverflow.com/help/dont-ask)?

The apache-spark http://stackoverflow.com/questions/tagged/apache-spark
tag is very active on there.

Discussions of all types are still on-topic here, but when possible we want
to encourage people to use SO.

Nick

On Wed Jan 21 2015 at 8:37:05 AM Jay Vyas jayunit100.apa...@gmail.com wrote:

 It's a very valid idea indeed, but... it's a tricky subject since the
 entire ASF is run on mailing lists, hence there are so many different but
 equally sound ways of looking at this idea, which conflict with one another.

  On Jan 21, 2015, at 7:03 AM, btiernay btier...@hotmail.com wrote:
 
  I think this is a really great idea for really opening up the discussions
  that happen here. Also, it would be nice to know why there doesn't seem
  to be much interest. Maybe I'm misunderstanding some nuance of Apache
  projects.
 
  Cheers
 
 
 


Re: Discourse: A proposed alternative to the Spark User list

2015-01-21 Thread pzecevic
Hi,
I tried to find the last reply by Nick Chammas (that I received in the
digest) using the Nabble web interface, but I cannot find it (perhaps he
didn't reply directly to the user list?). That's one example of Nabble's
usability.

Anyhow, I wanted to add my two cents...

The Apache user group could be frozen (not accepting new questions, if that's
possible) and users redirected to Stack Overflow (automatic reply?). Old
questions would remain (and be searchable) on Nabble, and new questions would
go to Stack Exchange, so there is no need for migration. That's the idea, at
least, as I'm not sure if that's technically doable... Is it?
The dev mailing list could perhaps stay on Nabble (it's not that busy), or have
a special tag on Stack Exchange.

Another thing, about the new Stack Exchange site I proposed earlier: if a new
site is created, there is no problem with guidelines, I think, because the
Spark community can apply different guidelines for the new site.

There is a FAQ about creating new sites: http://area51.stackexchange.com/faq
It says: Stack Exchange sites are free to create and free to use. All we
ask is that you have an enthusiastic, committed group of expert users who
check in regularly, asking and answering questions.
I think this requirement is satisfied...
Someone expressed a concern that they won't allow creating a
project-specific site, but there already exist some project-specific sites,
like Tor, Drupal, Ubuntu...

Later, though, the FAQ also says:
If Y already exists, it already has a tag for X, and nobody is complaining
(then you should not create a new site). But we could complain :)

The advantage of having a separate site is that users, who should have more
privileges, would need to earn them through Spark questions and answers
only. The other thing, already mentioned, is that the community could create
Spark specific guidelines. There are also  'meta' sites for asking questions
like this one, etc.

There is a process for starting a site; it's not instantaneous. A new site
needs to go through private beta and public beta, so that could be a
drawback.


Like btiernay, I must say: there might be something about Apache projects
and mailing lists that I do not know, so excuse me if that is the case...




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21299.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Discourse: A proposed alternative to the Spark User list

2015-01-21 Thread Nicholas Chammas
I think a few things need to be laid out clearly:

   1. This mailing list is the “official” user discussion platform. That
   is, it is sponsored and managed by the ASF.
   2. Users are free to organize independent discussion platforms focusing
   on Spark, and there is already one such platform in Stack Overflow under
   the apache-spark and related tags. Stack Overflow works quite well.
   3. The ASF will not agree to deprecating or migrating this user list to
   a platform that they do not control.
   4. This mailing list has grown to an unwieldy size and discussions are
   hard to find or follow; discussion tooling is also lacking. We want to
   improve the utility and user experience of this mailing list.
   5. We don’t want to fragment this “official” discussion community.
   6. Nabble is an independent product not affiliated with the ASF. It
   offers a slightly better interface to the Apache mailing list archives.

So to respond to some of your points, pzecevic:

Apache user group could be frozen (not accepting new questions, if that’s
possible) and redirect users to Stack Overflow (automatic reply?).

From what I understand of the ASF’s policies, this is not possible. :( This
mailing list must remain the official Spark user discussion platform.

Other thing, about new Stack Exchange site I proposed earlier. If a new
site is created, there is no problem with guidelines, I think, because
Spark community can apply different guidelines for the new site.

I think Stack Overflow and the various Spark tags are working fine. I don’t
see a compelling need for a Stack Exchange dedicated to Spark, either now
or in the near future. Also, I doubt a Spark-specific site can pass the 4
tests in the Area 51 FAQ http://area51.stackexchange.com/faq:

   - Almost all Spark questions are on-topic for Stack Overflow
   - Stack Overflow already exists, it already has a tag for Spark, and
   nobody is complaining
   - You’re not creating such a big group that you don’t have enough
   experts to answer all possible questions
   - There’s a high probability that users of Stack Overflow would enjoy
   seeing the occasional question about Spark

I think complaining won’t be sufficient. :)

Someone expressed a concern that they won’t allow creating a
project-specific site, but there already exist some project-specific sites,
like Tor, Drupal, Ubuntu…

The communities for these projects are many, many times larger than the
Spark community is or likely ever will be, simply due to the nature of the
problems they are solving.

What we need is an improvement to this mailing list. We need better tooling
than Nabble to sit on top of the Apache archives, and we also need some way
to control the volume and quality of mail on the list so that it remains a
useful resource for the majority of users.

Nick

RE: Discourse: A proposed alternative to the Spark User list

2015-01-21 Thread Bob Tiernay
Very well stated. Thanks for putting in the effort to formalize your thoughts,
with which I agree entirely.
How are these types of decisions traditionally made in the Spark community? Is
there a formal process? What's the next step?
Thanks again


Re: Discourse: A proposed alternative to the Spark User list

2015-01-21 Thread btiernay
I think this is a really great idea for really opening up the discussions
that happen here. Also, it would be nice to know why there doesn't seem to
be much interest. Maybe I'm misunderstanding some nuance of Apache projects.

Cheers



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21288.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Discourse: A proposed alternative to the Spark User list

2015-01-21 Thread Jay Vyas
It's a very valid idea indeed, but... it's a tricky subject since the entire
ASF is run on mailing lists, hence there are so many different but equally
sound ways of looking at this idea, which conflict with one another.

 On Jan 21, 2015, at 7:03 AM, btiernay btier...@hotmail.com wrote:
 
 I think this is a really great idea for really opening up the discussions
 that happen here. Also, it would be nice to know why there doesn't seem to
 be much interest. Maybe I'm misunderstanding some nuance of Apache projects.
 
 Cheers
 
 
 



Re: Discourse: A proposed alternative to the Spark User list

2015-01-17 Thread pzecevic
Hi, guys!

I'm reviving this old question from Nick Chammas with a new proposal: what
do you think about creating a separate Stack Exchange 'Apache Spark' site
(like 'philosophy' and 'English' etc.)?

I'm not sure what would be the best way to deal with the user and dev lists,
though - to merge them into one or create two separate sites...

And I don't know if it's at all possible to migrate the current lists to Stack
Exchange, but I believe it would be an improvement over the current
situation. People are used to Stack Exchange, it's easy to use and search,
topics (Spark SQL, Streaming, GraphX) could be marked with tags for easy
filtering, code formatting is super easy, etc.

What do you all think?



Nick Chammas wrote
 When people have questions about Spark, there are 2 main places (as far as
 I can tell) where they ask them:
 
 - Stack Overflow, under the apache-spark tag
   http://stackoverflow.com/questions/tagged/apache-spark
 - This mailing list
 
 The mailing list is valuable as an independent place for discussion that
 is part of the Spark project itself. Furthermore, it allows for a broader
 range of discussions than would be allowed on Stack Overflow
 http://stackoverflow.com/help/dont-ask.
 
 As the Spark project has grown in popularity, I see that a few problems
 have emerged with this mailing list:
 
 - It’s hard to follow topics (e.g. Streaming vs. SQL) that you’re
   interested in, and it’s hard to know when someone has mentioned you
   specifically.
 - It’s hard to search for existing threads and link information across
   disparate threads.
 - It’s hard to format code and log snippets nicely, and by extension,
   hard to read other people’s posts with this kind of information.
 
 There are existing solutions to all these (and other) problems based
 around straight-up discipline or client-side tooling, which users have to
 conjure up for themselves.
 
 I’d like us as a community to consider using Discourse
 http://www.discourse.org/ as an alternative to, or overlay on top of,
 this mailing list, that provides better out-of-the-box solutions to these
 problems.
 
 Discourse is a modern discussion platform built by some of the same people
 who created Stack Overflow. It has many neat features
 http://v1.discourse.org/about/ that I believe this community would
 benefit from.
 
 For example:
 
 - When a user starts typing up a new post, they get a panel *showing
   existing conversations that look similar*, just like on Stack Overflow.
 - It’s easy to search for posts and link between them.
 - *Markdown support* is built into the composer.
 - You can *specifically mention people* and they will be notified.
 - Posts can be categorized (e.g. Streaming, SQL, etc.).
 - There is a built-in option for mailing list support which forwards all
   activity on the forum to a user’s email address and which allows for
   creation of new posts via email.
 
 What do you think of Discourse as an alternative, more manageable way to
 discuss Spark?
 
 There are a few options we can consider:
 
 1. Work with the ASF as well as the Discourse team to allow Discourse to
    act as an overlay on top of this mailing list
    https://meta.discourse.org/t/discourse-as-a-front-end-for-existing-asf-mailing-lists/23167?u=nicholaschammas,
    allowing people to continue to use the mailing list as-is if they want.
    (This is the toughest but perhaps most attractive option.)
 2. Create a new Discourse forum for Spark that is not bound to this user
    list. This is relatively easy but will effectively fork the community on
    this list. (We cannot shut down this mailing list in favor of one managed
    by Discourse.)
 3. Don’t use Discourse. Just encourage people on this list to post
    instead on Stack Overflow whenever possible.
 4. Something else.
 
 What does everyone think?
 
 Nick





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21203.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Discourse: A proposed alternative to the Spark User list

2015-01-17 Thread Andrew Ash
People can continue using the stack exchange sites as is with no additional
work from the Spark team.  I would not support migrating our mailing lists
yet again to another system like Discourse because I fear fragmentation of
the community between the many sites.



Re: Discourse: A proposed alternative to the Spark User list

2015-01-17 Thread Nicholas Chammas
The Stack Exchange community will not support creating a whole new site
just for Spark (otherwise you’d see dedicated sites for much larger topics
like “Python”). Their tagging system works well enough to separate
questions about different topics, and the apache-spark
http://stackoverflow.com/questions/tagged/apache-spark tag on Stack
Overflow is already doing pretty well.

The ASF as well as this community will also not support any migration of
the mailing list to another system due to ASF rules
http://www.apache.org/foundation/how-it-works.html and community
fragmentation.

Realistically, the only options available to us that I see are options 1
and 3 from my original email (which can be used together).

Option 3: Change the culture around the user list. Encourage people to use
Stack Overflow whenever possible, and this list only when their question
doesn’t fit SO’s strict rules.

Option 1: Work with the ASF and the Discourse teams to allow Discourse to
be deployed as an overlay on top of this existing mailing list. (e.g. Like
a new UI on top of an old database.)

The goal of both changes would be to make the user list more usable.

Nick

On Sat, Jan 17, 2015 at 8:51 AM, Andrew Ash and...@andrewash.com wrote:

People can continue using the stack exchange sites as is with no additional
 work from the Spark team.  I would not support migrating our mailing lists
 yet again to another system like Discourse because I fear fragmentation of
 the community between the many sites.

 On Sat, Jan 17, 2015 at 6:24 AM, pzecevic petar.zece...@gmail.com wrote:

 Hi, guys!

 I'm reviving this old question from Nick Chammas with a new proposal: what
 do you think about creating a separate Stack Exchange 'Apache Spark' site
 (like 'philosophy' and 'English' etc.)?

 I'm not sure what would be the best way to deal with user and dev lists,
 though - to merge them into one or create two separate sites...

 And I don't know if it's at all possible to migrate current lists to Stack
 Exchange, but I believe it would be an improvement over the current
 situation. People are used to Stack Exchange, it's easy to use and search,
 topics (Spark SQL, Streaming, GraphX) could be marked with tags for easy
 filtering, code formatting is super easy, etc.

 What do you all think?



 Nick Chammas wrote
  When people have questions about Spark, there are 2 main places (as far as
  I can tell) where they ask them:

 - Stack Overflow, under the apache-spark tag
 <http://stackoverflow.com/questions/tagged/apache-spark>
 - This mailing list
 
  The mailing list is valuable as an independent place for discussion that is
  part of the Spark project itself. Furthermore, it allows for a broader
  range of discussions than would be allowed on Stack Overflow
  <http://stackoverflow.com/help/dont-ask>.
 
  As the Spark project has grown in popularity, I see that a few problems
  have emerged with this mailing list:
 
 - It’s hard to follow topics (e.g. Streaming vs. SQL) that you’re
 interested in, and it’s hard to know when someone has mentioned you
 specifically.
 - It’s hard to search for existing threads and link information across
 disparate threads.
 - It’s hard to format code and log snippets nicely, and by extension,
 hard to read other people’s posts with this kind of information.
 
  There are existing solutions to all these (and other) problems based around
  straight-up discipline or client-side tooling, which users have to conjure
  up for themselves.

  I’d like us as a community to consider using Discourse
  <http://www.discourse.org/> as an alternative to, or overlay on top of,
  this mailing list, that provides better out-of-the-box solutions to these
  problems.
 
  Discourse is a modern discussion platform built by some of the same people
  who created Stack Overflow. It has many neat features
  <http://v1.discourse.org/about/> that I believe this community would
  benefit from.
 
  For example:
 
 - When a user starts typing up a new post, they get a panel *showing
 existing conversations that look similar*, just like on Stack
 Overflow.
 - It’s easy to search for posts and link between them.
 - *Markdown support* is built into the composer.
 - You can *specifically mention people* and they will be notified.
 - Posts can be categorized (e.g. Streaming, SQL, etc.).
 - There is a built-in option for mailing list support which forwards all
 activity on the forum to a user’s email address and which allows for
 creation of new posts via email.

  What do you think of Discourse as an alternative, more manageable way to
  discuss Spark?
 
  There are a few options we can consider:
 
 1. Work with the ASF as well as the Discourse team to allow Discourse to
 act as an overlay on top of this mailing list
 <https://meta.discourse.org/t/discourse-as-a-front-end-for-existing-asf-mailing-lists/23167?u=nicholaschammas>,
 

Re: Discourse: A proposed alternative to the Spark User list

2014-12-26 Thread Nicholas Chammas
Thanks for providing that additional background, Josh.

It looks like many people on that Google Groups thread wanted a better
interface than is offered by the Apache mailing lists. Some even raised the
idea of a bi-directional bridge
https://groups.google.com/d/msg/spark-users/vtg-5db8JWY/Z37CbNJSvQAJ,
like I proposed on the Discourse Meta forum
https://meta.discourse.org/t/discourse-as-a-front-end-for-existing-asf-mailing-lists/23167?u=nicholaschammas
.

Tobias wrote:

uh, I would have expected a rather heated discussion, but the opposite
seems to be the case ;-)

Well, it’s the holidays, so people may just be away. :)

On the other hand, I suspect that part of the lack of responses comes from
1) people not finding this thread due to the volume of mail on the list and
the poor interface, and 2) people having long ago given up on the list and
moved on—specifically, people who cared about the discussion platform we
used and knew there were better options out there.

That’s the scary and disappointing aspect to this. We’ll likely never hear
from the people who long ago found this mailing list hard to use.
Increasingly, we will be left with 1) a dwindling group of hardcore
devotees who are willing to put up with the list and somehow still squeeze
value out of it, and 2) waves of new users who come, throw a few new posts
into the chaos, and then leave, disappointed.

Option 3 from my initial email (keep the list as-is and just encourage
people to post on Stack Overflow when possible) does currently seem like
the most realistic option we have for making this list more usable. Though I
look forward to hearing what Josh and others have to say about the matter.

Nick

On Fri Dec 26 2014 at 2:11:56 AM Josh Rosen rosenvi...@gmail.com wrote:

 We have a mirror of the user and developer mailing lists on Nabble, but
 unfortunately this has led to significant usability issues because users
 may attempt to post messages through Nabble which silently fail to get
 posted to the actual Apache list and thus are never read by most
 subscribers:
 http://apache-spark-developers-list.1001551.n3.nabble.com/Nabble-mailing-list-mirror-errors-quot-This-post-has-NOT-been-accepted-by-the-mailing-list-yet-quot-td9772.html.
 In fact, there are replies to this very thread that were not properly
 mirrored from the Apache list to Nabble.

 Before Spark moved to Apache, our mailing list was hosted on Google
 Groups.  Several community members were in favor of keeping the discussion
 list on Google Groups, since its interface is a bit more user-friendly:
 https://groups.google.com/forum/#!topic/spark-users/vtg-5db8JWY

 See also:
 https://mail-archives.apache.org/mod_mbox/spark-dev/201308.mbox/%3cce3c361b.fc97f%25chris.a.mattm...@jpl.nasa.gov%3E

 Since Andy mentioned IRC, there's actually an #apache-spark channel on
 Freenode (I idle there sometimes).

 I'll comment more on the actual proposals here in a separate followup
 email, but I just wanted to add a bit of additional context in the meantime.

 - Josh

 On Thu, Dec 25, 2014 at 5:36 PM, Tobias Pfeiffer t...@preferred.jp wrote:

 Nick,

 uh, I would have expected a rather heated discussion, but the opposite
 seems to be the case ;-)

 Independent of my personal preferences w.r.t. usability, habits etc., I
 think it is not good for a software/tool/framework if questions and
 discussions are spread over too many places. I guess every one of us knows
 an example of where this makes/has made it very hard for newcomers to get
 started ;-)

 As it is now, I think the mailing list has somewhat of an official
 touch, while Stack Overflow is, well, Stack Overflow ;-) To introduce
 another discussion platform next to the mailing list (your proposal (2.))
 would increase confusion, the number of double-postings and, as you said,
 effectively fork the community. Your proposal (1.) sounds attractive, but I
 highly doubt that the user experience can match people's expectations
 towards the pure solution on either the mailing list or Discourse, given
 the rather different discussion styles.

 Having said that, I totally agree with the points you mentioned; even just
 linking to a thread where a question has been discussed before is very
 time-consuming and I would be happy to use a platform where all those
 points are addressed. Stack Overflow seems to provide that, too, and except
 for the broader range of discussions you mentioned, I don't see the
 benefit of using Discourse over Stack Overflow. So personally, I would
 suggest going with (3.): encourage SO as a platform for questions that
 are ok to be asked there and try to reduce/focus mailing list communication
 for everything else. (Note that this is pretty much the same state as now
 plus encouraging people in an unspecified way, which means that maybe
 nothing changes at all.)

 Just my 2 cents,
 Tobias


 On Wed Dec 24 2014 at 21:50:48 Nick Chammas nicholas.cham...@gmail.com
 wrote:

 When people have questions about Spark, there are 2 main 

Re: Discourse: A proposed alternative to the Spark User list

2014-12-26 Thread Sean Owen
I like the idea and the hope that it turns 2+ places for discussions into
1, but in practice I think it will just turn it into 3+. The only thing I
can imagine is making a tool like this an overlay. Does that require much
integration work and does it affect anyone who can't use it?

People won't stop asking on SO and I don't imagine you can drop an official
project mailing list. ASF mailing lists are janky (ezmlm? Really?) and I'd
much prefer anything else but maybe simplest to leave that and try to
overlay.
On Dec 24, 2014 8:50 PM, Nick Chammas nicholas.cham...@gmail.com wrote:

 When people have questions about Spark, there are 2 main places (as far as
 I can tell) where they ask them:

- Stack Overflow, under the apache-spark tag
http://stackoverflow.com/questions/tagged/apache-spark
- This mailing list

 The mailing list is valuable as an independent place for discussion that
 is part of the Spark project itself. Furthermore, it allows for a broader
 range of discussions than would be allowed on Stack Overflow
 http://stackoverflow.com/help/dont-ask.

 As the Spark project has grown in popularity, I see that a few problems
 have emerged with this mailing list:

- It’s hard to follow topics (e.g. Streaming vs. SQL) that you’re
interested in, and it’s hard to know when someone has mentioned you
specifically.
- It’s hard to search for existing threads and link information across
disparate threads.
- It’s hard to format code and log snippets nicely, and by extension,
hard to read other people’s posts with this kind of information.

 There are existing solutions to all these (and other) problems based
 around straight-up discipline or client-side tooling, which users have to
 conjure up for themselves.

 I’d like us as a community to consider using Discourse
 http://www.discourse.org/ as an alternative to, or overlay on top of,
 this mailing list, that provides better out-of-the-box solutions to these
 problems.

 Discourse is a modern discussion platform built by some of the same people
 who created Stack Overflow. It has many neat features
 http://v1.discourse.org/about/ that I believe this community would
 benefit from.

 For example:

- When a user starts typing up a new post, they get a panel *showing
existing conversations that look similar*, just like on Stack Overflow.
- It’s easy to search for posts and link between them.
- *Markdown support* is built into the composer.
- You can *specifically mention people* and they will be notified.
- Posts can be categorized (e.g. Streaming, SQL, etc.).
- There is a built-in option for mailing list support which forwards
all activity on the forum to a user’s email address and which allows for
creation of new posts via email.

 What do you think of Discourse as an alternative, more manageable way to
 discuss Spark?

 There are a few options we can consider:

1. Work with the ASF as well as the Discourse team to allow Discourse
to act as an overlay on top of this mailing list
https://meta.discourse.org/t/discourse-as-a-front-end-for-existing-asf-mailing-lists/23167?u=nicholaschammas,
allowing people to continue to use the mailing list as-is if they want.
(This is the toughest but perhaps most attractive option.)
2. Create a new Discourse forum for Spark that is not bound to this
user list. This is relatively easy but will effectively fork the community
on this list. (We cannot shut down this mailing list in favor of one
managed by Discourse.)
3. Don’t use Discourse. Just encourage people on this list to post
instead on Stack Overflow whenever possible.
4. Something else.

 What does everyone think?

 Nick

 --
 View this message in context: Discourse: A proposed alternative to the
 Spark User list
 http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851.html
 Sent from the Apache Spark User List mailing list archive
 http://apache-spark-user-list.1001560.n3.nabble.com/ at Nabble.com.



Re: Discourse: A proposed alternative to the Spark User list

2014-12-25 Thread andy petrella
Nice idea, although it needs either a plan on their hosting or Spark to host
it, if I'm not wrong.

I've been using Slack for discussions. It's not exactly the same as
Discourse, the ML, or SO, but it offers interesting features.
It's more in the spirit of IRC, integrated with external services.

my2c

On Wed Dec 24 2014 at 21:50:48 Nick Chammas nicholas.cham...@gmail.com
wrote:

 When people have questions about Spark, there are 2 main places (as far as
 I can tell) where they ask them:

- Stack Overflow, under the apache-spark tag
http://stackoverflow.com/questions/tagged/apache-spark
- This mailing list

 The mailing list is valuable as an independent place for discussion that
 is part of the Spark project itself. Furthermore, it allows for a broader
 range of discussions than would be allowed on Stack Overflow
 http://stackoverflow.com/help/dont-ask.

 As the Spark project has grown in popularity, I see that a few problems
 have emerged with this mailing list:

- It’s hard to follow topics (e.g. Streaming vs. SQL) that you’re
interested in, and it’s hard to know when someone has mentioned you
specifically.
- It’s hard to search for existing threads and link information across
disparate threads.
- It’s hard to format code and log snippets nicely, and by extension,
hard to read other people’s posts with this kind of information.

 There are existing solutions to all these (and other) problems based
 around straight-up discipline or client-side tooling, which users have to
 conjure up for themselves.

 I’d like us as a community to consider using Discourse
 http://www.discourse.org/ as an alternative to, or overlay on top of,
 this mailing list, that provides better out-of-the-box solutions to these
 problems.

 Discourse is a modern discussion platform built by some of the same people
 who created Stack Overflow. It has many neat features
 http://v1.discourse.org/about/ that I believe this community would
 benefit from.

 For example:

- When a user starts typing up a new post, they get a panel *showing
existing conversations that look similar*, just like on Stack Overflow.
- It’s easy to search for posts and link between them.
- *Markdown support* is built into the composer.
- You can *specifically mention people* and they will be notified.
- Posts can be categorized (e.g. Streaming, SQL, etc.).
- There is a built-in option for mailing list support which forwards
all activity on the forum to a user’s email address and which allows for
creation of new posts via email.

 What do you think of Discourse as an alternative, more manageable way to
 discuss Spark?

 There are a few options we can consider:

1. Work with the ASF as well as the Discourse team to allow Discourse
to act as an overlay on top of this mailing list
https://meta.discourse.org/t/discourse-as-a-front-end-for-existing-asf-mailing-lists/23167?u=nicholaschammas,
allowing people to continue to use the mailing list as-is if they want.
(This is the toughest but perhaps most attractive option.)
2. Create a new Discourse forum for Spark that is not bound to this
user list. This is relatively easy but will effectively fork the community
on this list. (We cannot shut down this mailing list in favor of one
managed by Discourse.)
3. Don’t use Discourse. Just encourage people on this list to post
instead on Stack Overflow whenever possible.
4. Something else.

 What does everyone think?

 Nick




Re: Discourse: A proposed alternative to the Spark User list

2014-12-25 Thread Tobias Pfeiffer
Nick,

uh, I would have expected a rather heated discussion, but the opposite
seems to be the case ;-)

Independent of my personal preferences w.r.t. usability, habits etc., I
think it is not good for a software/tool/framework if questions and
discussions are spread over too many places. I guess every one of us knows
an example of where this makes/has made it very hard for newcomers to get
started ;-)

As it is now, I think the mailing list has somewhat of an official touch,
while Stack Overflow is, well, Stack Overflow ;-) To introduce another
discussion platform next to the mailing list (your proposal (2.)) would
increase confusion, the number of double-postings and, as you said,
effectively fork the community. Your proposal (1.) sounds attractive, but I
highly doubt that the user experience can match people's expectations
towards the pure solution on either the mailing list or Discourse, given
the rather different discussion styles.

Having said that, I totally agree with the points you mentioned; even just
linking to a thread where a question has been discussed before is very
time-consuming and I would be happy to use a platform where all those
points are addressed. Stack Overflow seems to provide that, too, and except
for the broader range of discussions you mentioned, I don't see the
benefit of using Discourse over Stack Overflow. So personally, I would
suggest going with (3.): encourage SO as a platform for questions that
are ok to be asked there and try to reduce/focus mailing list communication
for everything else. (Note that this is pretty much the same state as now
plus encouraging people in an unspecified way, which means that maybe
nothing changes at all.)

Just my 2 cents,
Tobias


On Wed Dec 24 2014 at 21:50:48 Nick Chammas nicholas.cham...@gmail.com
 wrote:

 When people have questions about Spark, there are 2 main places (as far
 as I can tell) where they ask them:

- Stack Overflow, under the apache-spark tag
http://stackoverflow.com/questions/tagged/apache-spark
- This mailing list

 The mailing list is valuable as an independent place for discussion that
 is part of the Spark project itself. Furthermore, it allows for a broader
 range of discussions than would be allowed on Stack Overflow
 http://stackoverflow.com/help/dont-ask.

 As the Spark project has grown in popularity, I see that a few problems
 have emerged with this mailing list:

- It’s hard to follow topics (e.g. Streaming vs. SQL) that you’re
interested in, and it’s hard to know when someone has mentioned you
specifically.
- It’s hard to search for existing threads and link information
across disparate threads.
- It’s hard to format code and log snippets nicely, and by extension,
hard to read other people’s posts with this kind of information.

 There are existing solutions to all these (and other) problems based
 around straight-up discipline or client-side tooling, which users have to
 conjure up for themselves.

 I’d like us as a community to consider using Discourse
 http://www.discourse.org/ as an alternative to, or overlay on top of,
 this mailing list, that provides better out-of-the-box solutions to these
 problems.

 Discourse is a modern discussion platform built by some of the same
 people who created Stack Overflow. It has many neat features
 http://v1.discourse.org/about/ that I believe this community would
 benefit from.

 For example:

- When a user starts typing up a new post, they get a panel *showing
existing conversations that look similar*, just like on Stack
Overflow.
- It’s easy to search for posts and link between them.
- *Markdown support* is built into the composer.
- You can *specifically mention people* and they will be notified.
- Posts can be categorized (e.g. Streaming, SQL, etc.).
- There is a built-in option for mailing list support which forwards
all activity on the forum to a user’s email address and which allows for
creation of new posts via email.

 What do you think of Discourse as an alternative, more manageable way to
 discuss Spark?

 There are a few options we can consider:

1. Work with the ASF as well as the Discourse team to allow Discourse
to act as an overlay on top of this mailing list
https://meta.discourse.org/t/discourse-as-a-front-end-for-existing-asf-mailing-lists/23167?u=nicholaschammas,
allowing people to continue to use the mailing list as-is if they want.
(This is the toughest but perhaps most attractive option.)
2. Create a new Discourse forum for Spark that is not bound to this
user list. This is relatively easy but will effectively fork the community
on this list. (We cannot shut down this mailing list in favor of one
managed by Discourse.)
3. Don’t use Discourse. Just encourage people on this list to post
instead on Stack Overflow whenever possible.
4. Something else.

 What does everyone think?

 Nick




Re: Discourse: A proposed alternative to the Spark User list

2014-12-25 Thread Josh Rosen
We have a mirror of the user and developer mailing lists on Nabble, but
unfortunately this has led to significant usability issues because users
may attempt to post messages through Nabble which silently fail to get
posted to the actual Apache list and thus are never read by most
subscribers:
http://apache-spark-developers-list.1001551.n3.nabble.com/Nabble-mailing-list-mirror-errors-quot-This-post-has-NOT-been-accepted-by-the-mailing-list-yet-quot-td9772.html.
In fact, there are replies to this very thread that were not properly
mirrored from the Apache list to Nabble.

Before Spark moved to Apache, our mailing list was hosted on Google
Groups.  Several community members were in favor of keeping the discussion
list on Google Groups, since its interface is a bit more user-friendly:
https://groups.google.com/forum/#!topic/spark-users/vtg-5db8JWY

See also:
https://mail-archives.apache.org/mod_mbox/spark-dev/201308.mbox/%3cce3c361b.fc97f%25chris.a.mattm...@jpl.nasa.gov%3E

Since Andy mentioned IRC, there's actually an #apache-spark channel on
Freenode (I idle there sometimes).

I'll comment more on the actual proposals here in a separate followup
email, but I just wanted to add a bit of additional context in the meantime.

- Josh

On Thu, Dec 25, 2014 at 5:36 PM, Tobias Pfeiffer t...@preferred.jp wrote:

 Nick,

 uh, I would have expected a rather heated discussion, but the opposite
 seems to be the case ;-)

 Independent of my personal preferences w.r.t. usability, habits etc., I
 think it is not good for a software/tool/framework if questions and
 discussions are spread over too many places. I guess every one of us knows
 an example of where this makes/has made it very hard for newcomers to get
 started ;-)

 As it is now, I think the mailing list has somewhat of an official
 touch, while Stack Overflow is, well, Stack Overflow ;-) To introduce
 another discussion platform next to the mailing list (your proposal (2.))
 would increase confusion, the number of double-postings and, as you said,
 effectively fork the community. Your proposal (1.) sounds attractive, but I
 highly doubt that the user experience can match people's expectations
 towards the pure solution on either the mailing list or Discourse, given
 the rather different discussion styles.

 Having said that, I totally agree with the points you mentioned; even just
 linking to a thread where a question has been discussed before is very
 time-consuming and I would be happy to use a platform where all those
 points are addressed. Stack Overflow seems to provide that, too, and except
 for the broader range of discussions you mentioned, I don't see the
 benefit of using Discourse over Stack Overflow. So personally, I would
 suggest going with (3.): encourage SO as a platform for questions that
 are ok to be asked there and try to reduce/focus mailing list communication
 for everything else. (Note that this is pretty much the same state as now
 plus encouraging people in an unspecified way, which means that maybe
 nothing changes at all.)

 Just my 2 cents,
 Tobias


 On Wed Dec 24 2014 at 21:50:48 Nick Chammas nicholas.cham...@gmail.com
 wrote:

 When people have questions about Spark, there are 2 main places (as far
 as I can tell) where they ask them:

- Stack Overflow, under the apache-spark tag
http://stackoverflow.com/questions/tagged/apache-spark
- This mailing list

 The mailing list is valuable as an independent place for discussion that
 is part of the Spark project itself. Furthermore, it allows for a broader
 range of discussions than would be allowed on Stack Overflow
 http://stackoverflow.com/help/dont-ask.

 As the Spark project has grown in popularity, I see that a few problems
 have emerged with this mailing list:

- It’s hard to follow topics (e.g. Streaming vs. SQL) that you’re
interested in, and it’s hard to know when someone has mentioned you
specifically.
- It’s hard to search for existing threads and link information
across disparate threads.
- It’s hard to format code and log snippets nicely, and by extension,
hard to read other people’s posts with this kind of information.

 There are existing solutions to all these (and other) problems based
 around straight-up discipline or client-side tooling, which users have to
 conjure up for themselves.

 I’d like us as a community to consider using Discourse
 http://www.discourse.org/ as an alternative to, or overlay on top of,
 this mailing list, that provides better out-of-the-box solutions to these
 problems.

 Discourse is a modern discussion platform built by some of the same
 people who created Stack Overflow. It has many neat features
 http://v1.discourse.org/about/ that I believe this community would
 benefit from.

 For example:

- When a user starts typing up a new post, they get a panel *showing
existing conversations that look similar*, just like on Stack
Overflow.
- It’s easy to search for posts and link between