To help track and jump-start the wording for the Spark community page and welcome email, here's a working document for us to use: https://docs.google.com/document/d/1N0pKatcM15cqBPqFWCqIy6jdgNzIoacZlYDCjufBh2s/edit#
Hope this will help us collaborate on this stuff a little faster.

On Mon, Nov 7, 2016 at 2:25 PM Maciej Szymkiewicz <mszymkiew...@gmail.com> wrote:

> Just a couple of random thoughts regarding Stack Overflow...
>
> - If we are thinking about shifting focus towards SO, all attempts at micromanaging should be discarded right at the beginning. Especially things like meta tags, which are discouraged and "burninated" (https://meta.stackoverflow.com/tags/burninate-request/info), or thread bumping. Depending on the context these won't be manageable, will go against community guidelines, or are simply obsolete.
> - Lack of expertise is unlikely to be an issue. Even now there are a number of advanced Spark users on SO. Of course the more the merrier.
>
> Things that can be easily improved:
>
> - Identifying, improving and promoting canonical questions and answers. This means closing duplicates, suggesting edits to improve existing answers, and providing alternative solutions. This can also be used to identify gaps in the documentation.
> - Providing a set of clear posting guidelines to reduce the effort required to identify the problem (think about http://stackoverflow.com/q/5963269, a.k.a. "How to make a great R reproducible example?").
> - Helping users decide if a question is a good fit for SO (see below). API questions are a great fit; debugging problems like "my cluster is slow" are not.
> - Actively cleaning up (closing, deleting) off-topic and low-quality questions. The less junk to sieve through, the better the chance of good questions being answered.
> - Repurposing and actively moderating SO docs (https://stackoverflow.com/documentation/apache-spark/topics). Right now most of the stuff that goes there is useless, duplicated or plagiarized, or borderline spam.
> - Encouraging the community to monitor featured (https://stackoverflow.com/questions/tagged/apache-spark?sort=featured) and active & upvoted & unanswered (https://stackoverflow.com/unanswered/tagged/apache-spark) questions.
> - Implementing some procedure to identify questions which are likely to be bugs or material for feature requests. Personally I am quite often tempted to simply send a link to the dev list, but I don't think it is really acceptable.
> - Animating a Spark-related chat room. I tried this a couple of times but to no avail. Without a certain critical mass of users it just won't work.
>
> On 11/07/2016 07:32 AM, Reynold Xin wrote:
>
> This is an excellent point. If we do go ahead and feature SO as a way for users to ask questions more prominently, as someone who knows SO very well, would you be willing to help write a short guideline (ideally the shorter the better, which makes it hard) to direct what goes to user@ and what goes to SO?
>
> Sure, I'll be happy to help if I can.
>
> On Sun, Nov 6, 2016 at 9:54 PM, Maciej Szymkiewicz <mszymkiew...@gmail.com> wrote:
>
> Damn, I always thought that the mailing list is only for nice and welcoming people and there is nothing to do for me here >:)
>
> To be serious though, there are many questions on the users list which would fit just fine on SO, but that is not true in general. There are dozens of questions which are too broad, opinion-based, ask for external resources and so on. If you want to direct users to SO you have to help them decide if it is the right channel. Otherwise it will just create a really bad experience for both those seeking help and active answerers. The former will be downvoted and bashed; the latter will have to deal with all the junk, and the number of active Spark users with moderation privileges is really low (with only Massg and me being able to directly close duplicates).
>
> Believe me, I've seen this before.
>
> On 11/07/2016 05:08 AM, Reynold Xin wrote:
>
> You have substantially underestimated how opinionated people can be on mailing lists too :)
>
> On Sunday, November 6, 2016, Maciej Szymkiewicz <mszymkiew...@gmail.com> wrote:
>
> You have to remember that the Stack Overflow crowd (like me) is highly opinionated, so many questions which could be just fine on the mailing list will be quickly downvoted and/or closed as off-topic. Just saying...
>
> --
> Best,
> Maciej
>
> On 11/07/2016 04:03 AM, Reynold Xin wrote:
>
> OK I've checked on the ASF member list (which is private so there is no public archive).
>
> It is not against any ASF rule to recommend StackOverflow as a place for users to ask questions. I don't think we can or should delete the existing user@spark list either, but we can certainly make SO more visible than it is.
>
> On Wed, Nov 2, 2016 at 10:21 AM, Reynold Xin <r...@databricks.com> wrote:
>
> Actually after talking with more ASF members, I believe the only policy is that development decisions have to be made and announced on ASF properties (dev list or jira), but user questions don't have to.
>
> I'm going to double check this. If it is true, I would actually recommend us moving the Q&A part of the user list entirely over to stackoverflow, or at least making that the recommended way rather than the existing user list, which is not very scalable.
>
> On Wednesday, November 2, 2016, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
> We’ve discussed several times upgrading our communication tools, as far back as 2014 and maybe even before that too. The bottom line is that we can’t due to ASF rules requiring the use of ASF-managed mailing lists.
>
> For some history, see this discussion:
>
> - https://mail-archives.apache.org/mod_mbox/spark-user/201412.mbox/%3CCAOhmDzfL2COdysV8r5hZN8f=NqXM=f=oy5no2dhwj_kveop...@mail.gmail.com%3E
> - https://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAOhmDzec1JdsXQq3dDwAv7eLnzRidSkrsKKG0xKw=tktxy_...@mail.gmail.com%3E
>
> (It’s ironic that it’s difficult to follow the past discussion on why we can’t change our official communication tools due to those very tools…)
>
> Nick
>
> On Wed, Nov 2, 2016 at 12:24 PM Ricardo Almeida <ricardo.alme...@actnowib.com> wrote:
>
> I feel Assaf's point is quite relevant if we want to move this project forward from the Spark user perspective (as I do). In fact, we're still using 20th century tools (mailing lists) with some add-ons (like Stack Overflow).
>
> As usual, Sean and Cody's contributions are very much to the point.
> I feel it is indeed a matter of culture (hard to enforce) and tools (much easier). Isn't it?
>
> On 2 November 2016 at 16:36, Cody Koeninger <c...@koeninger.org> wrote:
>
> So concrete things people could do
>
> - users could tag subject lines appropriately to the component they're asking about
>
> - contributors could monitor user@ for tags relating to components they've worked on.
> I'd be surprised if my miss rate for any mailing list questions well-labeled as Kafka was higher than 5%
>
> - committers could be more aggressive about soliciting and merging PRs to improve documentation.
> It's a lot easier to answer even poorly-asked questions with a link to relevant docs.
>
> On Wed, Nov 2, 2016 at 7:39 AM, Sean Owen <so...@cloudera.com> wrote:
> > There's already reviews@ and issues@. dev@ is for project development itself and I think is OK. You're suggesting splitting up user@ and I sympathize with the motivation.
> > Experience tells me that we'll have a beginner@ that's then totally ignored, and people will quickly learn to post to advanced@ to get attention, and we'll be back where we started. Putting it in JIRA doesn't help. I don't think this is a problem that is merely down to lack of process. It actually requires cultivating a culture change on the community list.
> >
> > On Wed, Nov 2, 2016 at 12:11 PM Mendelson, Assaf <assaf.mendel...@rsa.com> wrote:
> >>
> >> What I am suggesting is basically to fix that.
> >>
> >> For example, we might say that mailing list A is only for voting, mailing list B is only for PR and have something like stack overflow for developer questions (I would even go as far as to have beginner, intermediate and advanced mailing list for users and beginner/advanced for dev).
> >>
> >> This can easily be done using stack overflow tags, however, that would probably be harder to manage.
> >>
> >> Maybe using special jira tags and manage it in jira?
> >>
> >> Anyway as I said, the main issue is not user questions (except maybe advanced ones) but more for dev questions. It is so easy to get lost in the chatter that it makes it very hard for people to learn spark internals…
> >>
> >> Assaf.
> >>
> >> From: Sean Owen [mailto:so...@cloudera.com]
> >> Sent: Wednesday, November 02, 2016 2:07 PM
> >> To: Mendelson, Assaf; dev@spark.apache.org
> >> Subject: Re: Handling questions in the mailing lists
> >>
> >> I think that unfortunately mailing lists don't scale well. This one has thousands of subscribers with different interests and levels of experience. For any given person, most messages will be irrelevant. I also find that a lot of questions on user@ are not well-asked, aren't an SSCCE (http://sscce.org/), not something most people are going to bother replying to even if they could answer.
> >> I almost entirely ignore user@ because there are higher-priority channels like PRs to deal with, that already have hundreds of messages per day. This is why little of it gets an answer -- too noisy.
> >>
> >> We have to have official mailing lists, in any event, to have some official channel for things like votes and announcements. It's not wrong to ask questions on user@ of course, but a lot of the questions I see could have been answered with research of existing docs or looking at the code. I think that given the scale of the list, it's not wrong to assert that this is sort of a prerequisite for asking thousands of people to answer one's question. But we can't enforce that.
> >>
> >> The situation will get better to the extent people ask better questions, help other people ask better questions, and answer good questions. I'd encourage anyone feeling this way to try to help along those dimensions.
> >>
> >> On Wed, Nov 2, 2016 at 11:32 AM assaf.mendelson <assaf.mendel...@rsa.com> wrote:
> >>
> >> Hi,
> >>
> >> I know this is a little off topic but I wanted to raise an issue about handling questions in the mailing list (this is true both for the user mailing list and the dev but since there are other options such as stack overflow for user questions, this is more problematic in dev).
> >>
> >> Let’s say I ask a question (as I recently did). Unfortunately this was during spark summit in Europe so probably people were busy. In any case no one answered.
> >>
> >> The problem is, that if no one answers very soon, the question will almost certainly remain unanswered because new messages will simply drown it.
> >>
> >> This is a common issue not just for questions but for any comment or idea which is not immediately picked up.
> >>
> >> I believe we should have a method of handling this.
> >>
> >> Generally, I would say these types of things belong in stack overflow, after all, the way it is built is perfect for this. More seasoned spark contributors and committers can periodically check out unanswered questions and answer them.
> >>
> >> The problem is that stack overflow (as well as other targets such as the databricks forums) tends to have a more user based orientation. This means that any spark internal question will almost certainly remain unanswered.
> >>
> >> I was wondering if we could come up with a solution for this.
> >>
> >> Assaf.
> >>
> >> ________________________________
> >>
> >> View this message in context: Handling questions in the mailing lists
> >> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
> --
> Maciej Szymkiewicz