> we could implement some ‘load balancing’ policies:

I think Gerard’s suggestions are good. We need some “official” buy-in from the project’s maintainers and heavy contributors, and we should move forward with them.
I know that at least Josh Rosen, Sean Owen, and Tathagata Das, who are active on this list, are also active on SO <http://stackoverflow.com/tags/apache-spark/topusers>. So perhaps we’re already part of the way there.

Nick

On Thu Jan 22 2015 at 5:32:40 AM Gerard Maas <gerard.m...@gmail.com> wrote:

> I've been contributing to SO for a while now. Here are a few
> observations I'd like to contribute to the discussion:
>
> The level of questions on SO is often more entry-level. "Harder"
> questions (that require expertise in a certain area) remain unanswered for
> a while. The same questions here on the list (as they are often
> cross-posted) receive faster turnaround.
> Roughly speaking, there are two groups of questions: Implementing things on
> Spark and Running Spark. The second one is borderline under SO guidelines,
> as such questions often involve cluster setups, long logs and little idea
> of what's going on (mind you, often those questions come from people
> starting with Spark).
>
> In my opinion, Stack Overflow offers a better Q/A experience; in
> particular, they have tooling in place to reduce duplicates, something that
> often overloads this list (same "getting started issues" or "how to map,
> filter, flatmap" over and over again). That said, this list offers a
> richer forum, where the expertise pool is a lot deeper.
> Also, while SO is fairly strict in requiring posters to show a
> minimal amount of effort in the question being asked, this list is quite
> tolerant of low-effort questions. This could probably be an element that
> makes the list 'lower impedance'.
> One additional thing on SO is that the [apache-spark] tag is a 'low rep'
> tag. Neither questions nor answers get significant voting, reducing the
> 'rep gaming' factor (discouraging participation?)
>
> Thinking about how to improve both platforms, SO [apache-spark] and this
> ML, and get the list back to "not overwhelming" message volumes, we could
> implement some 'load balancing' policies:
> - encourage new users to use Stack Overflow; in particular, redirect
> newbie questions to SO the friendly way: "did you search SO already?" or
> link to an existing question.
> - most how to "map, flatmap, filter, aggregate, reduce, ..." questions
> would fall under this category
> - encourage domain experts to hang out on SO more often (my impression is
> that MLlib and GraphX are fairly underserved)
> - have an 'escalation process' in place, where we could post
> 'interesting/hard/bug' questions from SO back to the list (or encourage the
> poster to do so)
> - update our "community guidelines" on
> [http://spark.apache.org/community.html] to implement such policies.
>
> Those are just some ideas on how to improve the community and better serve
> the newcomers while avoiding overload of our existing expertise pool.
>
> kr, Gerard.
>
>
> On Thu, Jan 22, 2015 at 10:42 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> Yes, there is some project business, like votes of record on releases,
>> that needs to be carried on in a standard, simple, accessible place, and
>> SO is not at all suitable.
>>
>> Nobody is stuck with Nabble. The suggestion is to enable a different
>> overlay on the existing list. SO remains a place you can ask questions too.
>> So I agree with Nick's take.
>>
>> BTW are there perhaps plans to split this mailing list into
>> subproject-specific lists? That might also help tune in/out the subset of
>> conversations of interest.
>> On Jan 22, 2015 10:30 AM, "Petar Zecevic" <petar.zece...@gmail.com>
>> wrote:
>>
>>>
>>> Ok, thanks for the clarifications. I didn't know this list has to remain
>>> as the only official list.
>>>
>>> Nabble is really not the best solution in the world, but we're stuck
>>> with it, I guess.
>>>
>>> That's it from me on this subject.
>>>
>>> Petar
>>>
>>>
>>> On 22.1.2015. 3:55, Nicholas Chammas wrote:
>>>
>>> I think a few things need to be laid out clearly:
>>>
>>> 1. This mailing list is the “official” user discussion platform.
>>> That is, it is sponsored and managed by the ASF.
>>> 2. Users are free to organize independent discussion platforms
>>> focusing on Spark, and there is already one such platform in Stack
>>> Overflow under the apache-spark and related tags. Stack Overflow works
>>> quite well.
>>> 3. The ASF will not agree to deprecating or migrating this user list
>>> to a platform that they do not control.
>>> 4. This mailing list has grown to an unwieldy size and discussions
>>> are hard to find or follow; discussion tooling is also lacking. We want
>>> to improve the utility and user experience of this mailing list.
>>> 5. We don’t want to fragment this “official” discussion community.
>>> 6. Nabble is an independent product not affiliated with the ASF. It
>>> offers a slightly better interface to the Apache mailing list archives.
>>>
>>> So to respond to some of your points, pzecevic:
>>>
>>> > Apache user group could be frozen (not accepting new questions, if
>>> > that’s possible) and redirect users to Stack Overflow (automatic reply?).
>>>
>>> From what I understand of the ASF’s policies, this is not possible. :(
>>> This mailing list must remain the official Spark user discussion platform.
>>>
>>> > Other thing, about new Stack Exchange site I proposed earlier. If a new
>>> > site is created, there is no problem with guidelines, I think, because
>>> > Spark community can apply different guidelines for the new site.
>>>
>>> I think Stack Overflow and the various Spark tags are working fine. I
>>> don’t see a compelling need for a Stack Exchange dedicated to Spark, either
>>> now or in the near future.
>>> Also, I doubt a Spark-specific site can pass the
>>> 4 tests in the Area 51 FAQ <http://area51.stackexchange.com/faq>:
>>>
>>> - Almost all Spark questions are on-topic for Stack Overflow
>>> - Stack Overflow already exists, it already has a tag for Spark, and
>>> nobody is complaining
>>> - You’re not creating such a big group that you don’t have enough
>>> experts to answer all possible questions
>>> - There’s a high probability that users of Stack Overflow would
>>> enjoy seeing the occasional question about Spark
>>>
>>> I think complaining won’t be sufficient. :)
>>>
>>> > Someone expressed a concern that they won’t allow creating a
>>> > project-specific site, but there already exist some project-specific
>>> > sites, like Tor, Drupal, Ubuntu…
>>>
>>> The communities for these projects are many, many times larger than the
>>> Spark community is or likely ever will be, simply due to the nature of the
>>> problems they are solving.
>>>
>>> What we need is an improvement to this mailing list. We need better
>>> tooling than Nabble to sit on top of the Apache archives, and we also need
>>> some way to control the volume and quality of mail on the list so that it
>>> remains a useful resource for the majority of users.
>>>
>>> Nick
>>>
>>>
>>> On Wed Jan 21 2015 at 3:13:21 PM pzecevic <petar.zece...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> I tried to find the last reply by Nick Chammas (that I received in the
>>>> digest) using the Nabble web interface, but I cannot find it (perhaps he
>>>> didn't reply directly to the user list?). That's one example of Nabble's
>>>> usability.
>>>>
>>>> Anyhow, I wanted to add my two cents...
>>>>
>>>> Apache user group could be frozen (not accepting new questions, if
>>>> that's possible) and redirect users to Stack Overflow (automatic reply?).
>>>> Old questions remain (and are searchable) on Nabble, new questions go to
>>>> Stack Exchange, so no need for migration.
>>>> That's the idea, at least, as I'm not sure if that's technically
>>>> doable... Is it?
>>>> dev mailing list could perhaps stay on Nabble (it's not that busy), or
>>>> have a special tag on Stack Exchange.
>>>>
>>>> Other thing, about new Stack Exchange site I proposed earlier. If a new
>>>> site is created, there is no problem with guidelines, I think, because
>>>> Spark community can apply different guidelines for the new site.
>>>>
>>>> There is a FAQ about creating new sites:
>>>> http://area51.stackexchange.com/faq
>>>> It says: "Stack Exchange sites are free to create and free to use. All
>>>> we ask is that you have an enthusiastic, committed group of expert users
>>>> who check in regularly, asking and answering questions."
>>>> I think this requirement is satisfied...
>>>> Someone expressed a concern that they won't allow creating a
>>>> project-specific site, but there already exist some project-specific
>>>> sites, like Tor, Drupal, Ubuntu...
>>>>
>>>> Later, though, the FAQ also says:
>>>> "If Y already exists, it already has a tag for X, and nobody is
>>>> complaining"
>>>> (then you should not create a new site). But we could complain :)
>>>>
>>>> The advantage of having a separate site is that users, who should have
>>>> more privileges, would need to earn them through Spark questions and
>>>> answers only. The other thing, already mentioned, is that the community
>>>> could create Spark specific guidelines. There are also 'meta' sites for
>>>> asking questions like this one, etc.
>>>>
>>>> There is a process for starting a site - it's not instantaneous. New
>>>> site needs to go through private beta and public beta, so that could be
>>>> a drawback.
>>>>
>>>>
>>>> Like btiernay, I must say: there might be something about Apache
>>>> projects and mailing lists that I do not know, so excuse me if that is
>>>> the case...
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Discourse-A-proposed-alternative-to-the-Spark-User-list-tp20851p21299.html