Re: [discuss] ending support for Java 6?

2015-04-30 Thread Vinod Kumar Vavilapalli
FYI, after due consideration, we in the Hadoop community dropped support for 
JDK 6 starting with the Apache Hadoop 2.7.x releases.

Thanks
+Vinod

On Apr 30, 2015, at 12:02 PM, Reynold Xin r...@databricks.com wrote:

 This has been discussed a few times in the past, but now that Oracle ended
 support for Java 6 over a year ago, I wonder if we should just drop Java 6
 support.
 
 There is one outstanding issue Tom has brought to my attention: PySpark on
 YARN doesn't work well with Java 7/8, but we have an outstanding pull
 request to fix that.
 
 https://issues.apache.org/jira/browse/SPARK-6869
 https://issues.apache.org/jira/browse/SPARK-1920


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli
 
 On 4/22/15, 1:25 PM, Mark Hamstra m...@clearstorydata.com wrote:
 
 Agreed.  The Spark project and community that Vinod describes do not
 resemble the ones with which I am familiar.
 
 On Wed, Apr 22, 2015 at 1:20 PM, Patrick Wendell pwend...@gmail.com
 wrote:
 
 Hi Vinod,
 
 Thanks for your thoughts. However, I do not agree with your sentiment
 and implications. Spark is broadly quite an inclusive project and we
 spend a lot of effort culturally to help make newcomers feel welcome.
 
 - Patrick
 
 On Wed, Apr 22, 2015 at 1:11 PM, Vinod Kumar Vavilapalli
 vino...@hortonworks.com wrote:
 Actually what this community got away with is pretty much an
 anti-pattern compared to every other Apache project I have seen. And
 may I say in a not so Apache way.
 
 Waiting for a committer to assign a patch to someone leaves it as a
 privilege to a committer. Not alluding to anything fishy in practice,
 but this also leaves a lot of open ground for self-interest. Having
 committers define notions of good fit / level of experience does not
 work; it is highly subjective and leads to group control.
 
 In terms of semantics, here is what most other projects (dare I say
 every Apache project?) that I have seen do:
 - A new contributor comes in who is not yet added to the JIRA
 project. He/she requests one of the project's JIRA admins to add
 him/her.
 - After that, he or she is free to assign tickets to themselves.
 - What this means
   -- Assigning a ticket to oneself is a signal to the rest of the
 community that he/she is actively working on the said patch.
   -- If multiple contributors want to work on the same patch, it
 needs to be resolved amicably through open communication, on JIRA or
 on mailing lists. Not by the whim of a committer.
 - Common issues
   -- Land grabbing: Other contributors can nudge him/her in case of
 inactivity and take the ticket over. Again, amicably instead of a
 committer making subjective decisions.
   -- Progress stalling: One contributor assigns the ticket to
 himself/herself and is actively debating, but with no real code/docs
 contribution or any real intention of making progress. Here
 workable, reviewable code usually wins.
 
 Assigning patches is not a privilege. Contributors at Apache are a
 bunch of volunteers; the PMC should let volunteers contribute as they
 see fit. We do not assign work at Apache.
 
 +Vinod
 
 On Apr 22, 2015, at 12:32 PM, Patrick Wendell pwend...@gmail.com
 wrote:
 
 One overarching issue is that it's pretty unclear what "Assigned to
 X" in JIRA means from a process perspective. Personally I actually
 feel it's better for this to be more historical - i.e. who ended up
 submitting a patch for this feature that was merged - rather than
 creating an exclusive reservation for a particular user to work on
 something.
 
 If an issue is assigned to person X, but some other person Y submits
 a great patch for it, I think we have some obligation to Spark users
 and to the community to merge the better patch. So the idea of
 reserving the right to add a feature just seems overall off to me.
 IMO, it's fine if multiple people want to submit competing patches for
 something, provided everyone comments on JIRA saying they are
 intending to submit a patch, and everyone understands there is
 duplicate effort. So commenting with an intention to submit a patch,
 IMO seems like the healthiest workflow since it is non exclusive.
 
 To me the main benefit of assigning something ahead of time is if
 you have a committer that really wants to see someone specific work on
 a patch, it just acts as a strong signal that there is someone
 endorsed to work on that patch. That doesn't mean no one else can
 submit a patch, but it is IMO more of a warning that there may be
 existing work which is likely to be high quality, to avoid duplicated
 effort.
 
 When it was really easy for people to assign features to themselves, I
 saw a lot of anti-patterns in the community that seemed unhealthy,
 specifically:
 
 - It was really unclear what it means semantically if someone is
 assigned to a JIRA.
 - People assign JIRAs to themselves that aren't a good fit, given the
 author's level of experience.
 - People expect if they assign JIRAs to themselves that others won't
 submit patches, and become upset if they do.
 - People are discouraged from working on a patch because someone else
 was officially assigned.
 
 - Patrick
 
 On Wed, Apr 22, 2015 at 11:13 AM, Sean Owen so...@cloudera.com
 wrote:
 Anecdotally, there are a number of people asking to set the Assignee
 field. This is currently restricted to Committers in JIRA. I know the
 logic was to prevent people from Assigning a JIRA and then leaving it;
 it also matters a bit for questions of credit.
 
 Still I wonder if it's best to just let people go ahead and set it, as
 the lesser evil. People can already do a lot like resolve JIRAs and
 set shepherd and critical priority and all that.
 
 I think the intent was to let Developers set

Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli

I watch these lists, so I have a fair understanding of how things work around 
here. I don't give direct input in the day to day activities though, like Greg 
Stein on the other thread, so I can understand if it looks like it came from up 
above. Apache Members come around and give opinions from time to time; you don't 
need to take it as somebody up above forcing things down.

Thanks
+Vinod

On Apr 22, 2015, at 2:33 PM, Nicholas Chammas 
nicholas.cham...@gmail.com wrote:

I want to take this opportunity to call out the approach to communication you 
took here.

As a random contributor to Spark and active participant on this list, my 
reaction when I read your email was this:

  *   You do not know how the Spark community actually works.
  *   You read a thread that contains some trigger phrases.
  *   You wrote a lengthy response as a knee-jerk reaction.

I’m not trying to mock, but I want to be direct and honest about how you came 
off in this thread to me and probably many others.

Why not ask questions first—many questions? Why not make doubly sure that you 
understand the situation correctly before responding?

In many ways this is much like filing a bug report. “I’m seeing this. It seems 
wrong to me. Is this expected?” I think we all know from experience that this 
kind of bug report is polite and will likely lead to a productive discussion. 
On the other hand: “You’re returning a -1 here? This is obviously wrong! And, 
boy, lemme tell you how wrong you are!!!” No-one likes to deal with bug reports 
like this. More importantly, they get in the way of fixing the actual problem, 
if there is one.

This is not about the Apache Way or not. It’s about basic etiquette and 
effective communication.

I understand that there are legitimate potential concerns here, and it’s 
important that, as an Apache project, Spark work according to Apache 
principles. But when some person who has never participated on this list pops 
up out of nowhere with a lengthy lecture on the Apache Way and whatnot, I have 
to say that that is not an effective way to communicate. Pretty much the same 
thing happened with Greg Stein on an earlier thread some months ago about 
designating maintainers for components.

The concerns are legitimate, I’m sure, and we want to keep Spark in line with 
the Apache Way. And certainly, there have been many times when a project veered 
off course and needed to be corrected.

But when we want to make things right, I hope we can do it in a way that 
respectfully and tactfully engages the community. These “lectures delivered 
from above” — which is how they come off — are not helpful.

Nick


Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli
Actually what this community got away with is pretty much an anti-pattern 
compared to every other Apache project I have seen. And may I say in a not so 
Apache way.

Waiting for a committer to assign a patch to someone leaves it as a privilege 
to a committer. Not alluding to anything fishy in practice, but this also 
leaves a lot of open ground for self-interest. Having committers define notions 
of good fit / level of experience does not work; it is highly subjective and 
leads to group control.

In terms of semantics, here is what most other projects (dare I say every 
Apache project?) that I have seen do:
 - A new contributor comes in who is not yet added to the JIRA project. He/she 
requests one of the project's JIRA admins to add him/her.
 - After that, he or she is free to assign tickets to themselves.
 - What this means
   -- Assigning a ticket to oneself is a signal to the rest of the community 
that he/she is actively working on the said patch.
   -- If multiple contributors want to work on the same patch, it needs to be 
resolved amicably through open communication, on JIRA or on mailing lists. Not 
by the whim of a committer.
 - Common issues
   -- Land grabbing: Other contributors can nudge him/her in case of 
inactivity and take the ticket over. Again, amicably instead of a committer 
making subjective decisions.
   -- Progress stalling: One contributor assigns the ticket to himself/herself 
and is actively debating, but with no real code/docs contribution or any real 
intention of making progress. Here workable, reviewable code usually wins.

Assigning patches is not a privilege. Contributors at Apache are a bunch of 
volunteers; the PMC should let volunteers contribute as they see fit. We do not 
assign work at Apache.

+Vinod

On Apr 22, 2015, at 12:32 PM, Patrick Wendell pwend...@gmail.com wrote:

 One overarching issue is that it's pretty unclear what "Assigned to
 X" in JIRA means from a process perspective. Personally I actually
 feel it's better for this to be more historical - i.e. who ended up
 submitting a patch for this feature that was merged - rather than
 creating an exclusive reservation for a particular user to work on
 something.
 
 If an issue is assigned to person X, but some other person Y submits
 a great patch for it, I think we have some obligation to Spark users
 and to the community to merge the better patch. So the idea of
 reserving the right to add a feature just seems overall off to me.
 IMO, it's fine if multiple people want to submit competing patches for
 something, provided everyone comments on JIRA saying they are
 intending to submit a patch, and everyone understands there is
 duplicate effort. So commenting with an intention to submit a patch,
 IMO seems like the healthiest workflow since it is non exclusive.
 
 To me the main benefit of assigning something ahead of time is if
 you have a committer that really wants to see someone specific work on
 a patch, it just acts as a strong signal that there is someone
 endorsed to work on that patch. That doesn't mean no one else can
 submit a patch, but it is IMO more of a warning that there may be
 existing work which is likely to be high quality, to avoid duplicated
 effort.
 
 When it was really easy for people to assign features to themselves, I
 saw a lot of anti-patterns in the community that seemed unhealthy,
 specifically:
 
 - It was really unclear what it means semantically if someone is
 assigned to a JIRA.
 - People assign JIRAs to themselves that aren't a good fit, given the
 author's level of experience.
 - People expect if they assign JIRAs to themselves that others won't
 submit patches, and become upset if they do.
 - People are discouraged from working on a patch because someone else
 was officially assigned.
 
 - Patrick
 
 On Wed, Apr 22, 2015 at 11:13 AM, Sean Owen so...@cloudera.com wrote:
 Anecdotally, there are a number of people asking to set the Assignee
 field. This is currently restricted to Committers in JIRA. I know the
 logic was to prevent people from Assigning a JIRA and then leaving it;
 it also matters a bit for questions of credit.
 
 Still I wonder if it's best to just let people go ahead and set it, as
 the lesser evil. People can already do a lot like resolve JIRAs and
 set shepherd and critical priority and all that.
 
 I think the intent was to let Developers set this, but maybe due to
 an error, that's not how the current JIRA permission is implemented.
 
 I ask because I'm about to ping INFRA to update our scheme.
 

Re: Should we let everyone set Assignee?

2015-04-22 Thread Vinod Kumar Vavilapalli

If what you say is true, what is the reason for this 
committer-only-assigns-JIRA-tickets policy? If anyone can send a pull request, 
anyone should be able to assign tickets to himself/herself too.

+Vinod

On Apr 22, 2015, at 1:18 PM, Reynold Xin 
r...@databricks.com wrote:

Whoa, hold on a minute.

Spark has been among the projects that are the most welcoming to new 
contributors. And thanks to this, the sheer volume of activity in Spark is 
much larger than in other projects, and our workflow has to accommodate this fact.

In practice, people just create pull requests on GitHub, which is a newer, 
friendlier, and better model given the constraints. We even have tools that 
automatically tag a ticket with a link to the pull requests.


On Wed, Apr 22, 2015 at 1:11 PM, Vinod Kumar Vavilapalli 
vino...@hortonworks.com wrote:
Actually what this community got away with is pretty much an anti-pattern 
compared to every other Apache project I have seen. And may I say in a not so 
Apache way.

Waiting for a committer to assign a patch to someone leaves it as a privilege 
to a committer. Not alluding to anything fishy in practice, but this also 
leaves a lot of open ground for self-interest. Having committers define notions 
of good fit / level of experience does not work; it is highly subjective and 
leads to group control.

In terms of semantics, here is what most other projects (dare I say every 
Apache project?) that I have seen do:
 - A new contributor comes in who is not yet added to the JIRA project. He/she 
requests one of the project's JIRA admins to add him/her.
 - After that, he or she is free to assign tickets to themselves.
 - What this means
   -- Assigning a ticket to oneself is a signal to the rest of the community 
that he/she is actively working on the said patch.
   -- If multiple contributors want to work on the same patch, it needs to be 
resolved amicably through open communication, on JIRA or on mailing lists. Not 
by the whim of a committer.
 - Common issues
   -- Land grabbing: Other contributors can nudge him/her in case of 
inactivity and take the ticket over. Again, amicably instead of a committer 
making subjective decisions.
   -- Progress stalling: One contributor assigns the ticket to himself/herself 
and is actively debating, but with no real code/docs contribution or any real 
intention of making progress. Here workable, reviewable code usually wins.

Assigning patches is not a privilege. Contributors at Apache are a bunch of 
volunteers; the PMC should let volunteers contribute as they see fit. We do not 
assign work at Apache.

+Vinod

On Apr 22, 2015, at 12:32 PM, Patrick Wendell 
pwend...@gmail.com wrote:

 One overarching issue is that it's pretty unclear what "Assigned to
 X" in JIRA means from a process perspective. Personally I actually
 feel it's better for this to be more historical - i.e. who ended up
 submitting a patch for this feature that was merged - rather than
 creating an exclusive reservation for a particular user to work on
 something.

 If an issue is assigned to person X, but some other person Y submits
 a great patch for it, I think we have some obligation to Spark users
 and to the community to merge the better patch. So the idea of
 reserving the right to add a feature just seems overall off to me.
 IMO, it's fine if multiple people want to submit competing patches for
 something, provided everyone comments on JIRA saying they are
 intending to submit a patch, and everyone understands there is
 duplicate effort. So commenting with an intention to submit a patch,
 IMO seems like the healthiest workflow since it is non exclusive.

 To me the main benefit of assigning something ahead of time is if
 you have a committer that really wants to see someone specific work on
 a patch, it just acts as a strong signal that there is someone
 endorsed to work on that patch. That doesn't mean no one else can
 submit a patch, but it is IMO more of a warning that there may be
 existing work which is likely to be high quality, to avoid duplicated
 effort.

 When it was really easy for people to assign features to themselves, I
 saw a lot of anti-patterns in the community that seemed unhealthy,
 specifically:

 - It was really unclear what it means semantically if someone is
 assigned to a JIRA.
 - People assign JIRAs to themselves that aren't a good fit, given the
 author's level of experience.
 - People expect if they assign JIRAs to themselves that others won't
 submit patches, and become upset if they do.
 - People are discouraged from working on a patch because someone else
 was officially assigned.

 - Patrick

 On Wed, Apr 22, 2015 at 11:13 AM, Sean Owen 
 so...@cloudera.com wrote:
 Anecdotally, there are a number of people asking to set the Assignee
 field. This is currently restricted to Committers in JIRA. I know the
 logic was to prevent people from Assigning a JIRA

Re: [VOTE] Designating maintainers for some Spark components

2014-11-06 Thread Vinod Kumar Vavilapalli
 With the maintainer model, the process is as follows:
 
 - Any committer could review the patch and merge it, but they would need to 
 forward it to me (or another core API maintainer) to make sure we also approve
 - At any point during this process, I could come in and -1 it, or give 
 feedback
 - In addition, any other committer beyond me is still allowed to -1 this patch
 
 The only change in this model is that committers are responsible to forward 
 patches in these areas to certain other committers. If every committer had 
 perfect oversight of the project, they could have also seen every patch to 
 their component on their own, but this list ensures that they see it even if 
 they somehow overlooked it.


Having done the job of playing an informal 'maintainer' of a project myself, 
this is what I think you really need:

The so-called 'maintainers' do one of the following:
 - Actively poll the lists and watch over contributions. And follow what is 
repeated often around here: Trust but verify.
 - Set up automated mechanisms to send all bug-tracker updates of a specific 
component to a list that people can subscribe to

And/or
 - Individual contributors send review requests to unofficial 'maintainers' 
over dev-lists or through tools. Like many projects do with review boards and 
other tools.

Note that none of the above is a required step. It must not be, that's the 
point. But once set as a convention, they will all help you address your 
concerns with project scalability.

Anything else that you add is bestowing privileges on a select few and forming 
dictatorships. And contrary to what the proposal claims, this is neither 
scalable nor conforming to Apache governance rules.

+Vinod

