Re: Improving metadata in Spark JIRA

2015-02-06 Thread Nicholas Chammas
Do we need some new components to be added to the JIRA project?

Like:

   - scheduler
   - YARN
   - spark-submit
   - …?

Nick

On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas 
nicholas.cham...@gmail.com wrote:

 +9000 on cleaning up JIRA.

 Thank you Sean for laying out some specific things to tackle. I will
 assist with this.

 Regarding email, I think Sandy is right. I only get JIRA email for issues
 I'm watching.

 Nick

 On Fri Feb 06 2015 at 9:52:58 AM Sandy Ryza sandy.r...@cloudera.com
 wrote:

 JIRA updates don't go to this list, they go to iss...@spark.apache.org. I
 don't think many are signed up for that list, and those that are probably
 have a flood of emails anyway.

 So I'd definitely be in favor of any JIRA cleanup that you're up for.

 -Sandy

 On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen so...@cloudera.com wrote:

  I've wasted no time in wielding the commit bit to complete a number of
  small, uncontroversial changes. I wouldn't commit anything that didn't
  already appear to have review, consensus and little risk, but please
  let me know if anything looked a little too bold, so I can calibrate.
 
 
  Anyway, I'd like to continue some small house-cleaning by improving
  the state of JIRA's metadata, in order to let it give us a little
  clearer view on what's happening in the project:
 
  a. Add Component to every (open) issue that's missing one
  b. Review all Critical / Blocker issues to de-escalate ones that seem
  obviously neither
  c. Correct open issues that list a Fix version that has already been
  released
  d. Close all issues Resolved for a release that has already been
 released
 
  The problem with doing so is that it will create a tremendous amount
  of email to the list, like, several hundred. It's possible to make
  bulk changes and suppress e-mail though, which could be done for all
  but b.
 
  Better to suppress the emails when making such changes? or just not
  bother on some of these?
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
  For additional commands, e-mail: dev-h...@spark.apache.org
 
 




Re: Improving metadata in Spark JIRA

2015-02-06 Thread Patrick Wendell
Per Nick's suggestion I added two components:

1. Spark Submit
2. Spark Scheduler

I figured I would just add these since if we decide later we don't
want them, we can simply merge them into Spark Core.

On Fri, Feb 6, 2015 at 11:53 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
 Do we need some new components to be added to the JIRA project?

 Like:

  - scheduler
  - YARN
  - spark-submit
  - ...?

 Nick






Re: Improving metadata in Spark JIRA

2015-02-06 Thread Hari Shreedharan
+1. Jira cleanup would be good. Please let me know if I can help in some way!




Thanks, Hari

On Fri, Feb 6, 2015 at 11:56 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:

 Do we need some new components to be added to the JIRA project?
 Like:
  - scheduler
  - YARN
  - spark-submit
  - …?

 Nick



Improving metadata in Spark JIRA

2015-02-06 Thread Sean Owen
I've wasted no time in wielding the commit bit to complete a number of
small, uncontroversial changes. I wouldn't commit anything that didn't
already appear to have review, consensus and little risk, but please
let me know if anything looked a little too bold, so I can calibrate.


Anyway, I'd like to continue some small house-cleaning by improving
the state of JIRA's metadata, in order to let it give us a little
clearer view on what's happening in the project:

a. Add Component to every (open) issue that's missing one
b. Review all Critical / Blocker issues to de-escalate ones that seem
obviously neither
c. Correct open issues that list a Fix version that has already been released
d. Close all issues Resolved for a release that has already been released

The problem with doing so is that it will create a tremendous amount
of email to the list, like, several hundred. It's possible to make
bulk changes and suppress e-mail though, which could be done for all
but b.

Better to suppress the emails when making such changes? or just not
bother on some of these?
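
For reference, steps a through d can each be expressed as a JIRA search, which makes the affected issue sets easy to review before any bulk change. The following is only a sketch: it assumes the project key is SPARK and the default JIRA field and status names, and the helper name cleanup_queries is hypothetical.

```python
# Hypothetical helper: builds one JQL search string per cleanup step (a-d).
# Assumes the project key "SPARK" and default JIRA field/status names.
def cleanup_queries(project="SPARK"):
    open_issues = f"project = {project} AND resolution = Unresolved"
    return {
        # a. open issues with no Component set
        "a": f"{open_issues} AND component is EMPTY",
        # b. open Critical / Blocker issues to review by hand
        "b": f"{open_issues} AND priority in (Blocker, Critical)",
        # c. open issues whose Fix version has already been released
        "c": f"{open_issues} AND fixVersion in releasedVersions()",
        # d. Resolved (not yet Closed) issues for an already released version
        "d": (
            f"project = {project} AND status = Resolved "
            "AND fixVersion in releasedVersions()"
        ),
    }


if __name__ == "__main__":
    for step, jql in cleanup_queries().items():
        print(f"{step}. {jql}")
```

Each string can be pasted into the JIRA issue search to see how many issues a step would touch.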




Data source API | sizeInBytes should be moved to *Scan

2015-02-06 Thread Aniket Bhatnagar
Hi Spark SQL committers

I have started experimenting with the data sources API, and I was wondering
if it makes sense to move the method sizeInBytes from BaseRelation to the Scan
interfaces. A relation may be able to leverage filter push-down to estimate
its size, potentially making a very large relation broadcastable. Thoughts?

Aniket
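
To make the idea concrete, here is a toy model in Python rather than the actual Scala API. The class names, the selectivity parameter, and the 10 MB threshold are all assumptions for illustration; none of them are taken from the data sources API itself.

```python
from dataclasses import dataclass

# Assumed cutoff for illustration: sizes at or below this count as broadcastable.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class Relation:
    """Toy stand-in for a relation that only knows its full, unfiltered size."""
    total_bytes: int

    def size_in_bytes(self) -> int:
        return self.total_bytes


@dataclass
class FilteredScan:
    """Toy stand-in for a scan that can see the pushed-down filters."""
    relation: Relation
    selectivity: float  # hypothetical fraction of the data surviving the filters

    def size_in_bytes(self) -> int:
        # Scale the full size by the share of data the filters let through.
        return int(self.relation.total_bytes * self.selectivity)


def can_broadcast(size_in_bytes: int) -> bool:
    return size_in_bytes <= BROADCAST_THRESHOLD_BYTES


if __name__ == "__main__":
    relation = Relation(total_bytes=1024 ** 3)  # 1 GiB before filtering
    scan = FilteredScan(relation, selectivity=0.001)
    print(can_broadcast(relation.size_in_bytes()))
    print(can_broadcast(scan.size_in_bytes()))
```

With the estimate on the relation, the planner only ever sees the full size; with it on the scan, the same relation can fall under the threshold once the filters are taken into account.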


Re: Improving metadata in Spark JIRA

2015-02-06 Thread Sandy Ryza
JIRA updates don't go to this list, they go to iss...@spark.apache.org.  I
don't think many are signed up for that list, and those that are probably
have a flood of emails anyway.

So I'd definitely be in favor of any JIRA cleanup that you're up for.

-Sandy

On Fri, Feb 6, 2015 at 6:45 AM, Sean Owen so...@cloudera.com wrote:

 I've wasted no time in wielding the commit bit to complete a number of
 small, uncontroversial changes. I wouldn't commit anything that didn't
 already appear to have review, consensus and little risk, but please
 let me know if anything looked a little too bold, so I can calibrate.


 Anyway, I'd like to continue some small house-cleaning by improving
 the state of JIRA's metadata, in order to let it give us a little
 clearer view on what's happening in the project:

 a. Add Component to every (open) issue that's missing one
 b. Review all Critical / Blocker issues to de-escalate ones that seem
 obviously neither
 c. Correct open issues that list a Fix version that has already been
 released
 d. Close all issues Resolved for a release that has already been released

 The problem with doing so is that it will create a tremendous amount
 of email to the list, like, several hundred. It's possible to make
 bulk changes and suppress e-mail though, which could be done for all
 but b.

 Better to suppress the emails when making such changes? or just not
 bother on some of these?
