Re: Bluesky calls for a new mentor!

2011-06-30 Thread SamuelKevin
Hi, Ralph:
 I am not avoiding the truth that we have sucked during the last three years,
though we were once on the verge of a release.  It's *just* we, the *Bluesky Team
@ XJTU, Xi'an, China*, who failed to make it good; please remember that. I
believe projects from schools can also rock at Apache. You can
look down on us, but you can't deny others.
 I propose that the community give us one last chance for 1-2 months, and that
a certain member become our mentor to lead us through releasing the
newest version. During this time slot, what you would see includes:

   1. Gradually increasing discussion on the bluesky-dev mailing list.
   Meaningless discussion would not count.
   2. Committing of source code after it has been cleaned up.
   Inactive committers would be revoked and new committers would apply to join.
   3. Preparing what the release needs and making the release successful. Thus
   the new developers and committers could fully experience the release
   process and learn better how things are done in the Apache community.

 If the community accepts my suggestion, I personally want the BlueSky
project to be under strict surveillance by community members. If we can't fulfill
what we just promised, then just kick us out of here and I will have nothing
to say.
 Well, suppose we live through that: besides working in the Apache way, we
would continue to evolve BlueSky to make it much easier
to use in the e-learning area and usable in a larger scope (BlueSky has now
been deployed in China and is about to be applied in India), so that more
students in underdeveloped districts can share the same high-quality education
as those in developed areas.
  Sincerely, I would invite you, Ralph, to be our mentor for these 1-2 months
if you are not too busy and are willing to guide us. Don't feel sorry if you
want to refuse me. TOT
regards,
Kevin

2011/6/30 Ralph Goers ralph.go...@dslextreme.com

 Sorry, but the explanation below makes things sound even worse. Apache
 projects are not here to give students a place to do school work. What you
 have described is not a community.  If the project cannot build a community
 of people who are interested in the project for more than a school term then
 it doesn't belong here.

 Ralph

 On Jun 29, 2011, at 8:12 PM, SamuelKevin wrote:

  Hi, Noel:
 
  2011/6/30 Noel J. Bergman n...@devtech.com
 
  Joe Schaefer wrote:
  Chen Liu wrote:
  We propose to move future development of BlueSky to the Apache Software
  Foundation in order to build a broader user and developer community.
 
  You are supposed to be doing your development work in the ASF subversion
  repository, using ASF mailing lists, as peers.
 
  Chen, as Joe points out, these are what BlueSky should have been doing for
  the past three (3) years, and yet we still hear a proposal for the future.
 
  Looking at the (limited) commit history, there is a total imbalance between
  the number of people associated with the development work (20+) and the
  number of people with Apache accounts here (2).
 
  I guess I can explain that. Most of the developers of the BlueSky project are
  students. As you all know, students come when they enroll in school and go
  after they graduate. So there are around 10 active developers. For example,
  we used to have 5 committers, but now we have only 2 active committers.
 
  Again, as Joe points out, ALL of BlueSky development should have been done via
  the ASF infrastructure, not periodically synchronized.  We are a development
  community, not a remote archive.
 
  What we really need you to discuss are *plans*, how you will implement them,
  who will implement them, and how you will collaborate in the codebase as
  peers.
 
  Joe, again, has this on the money.  The BlueSky project must immediately
  make significant strides to rectify these issues.  Now, not later.
 
  We should see:
 
  1) All current code in the ASF repository.
  2) All development via ASF accounts (get the rest of the people signed up).
  3) Development discussion on the mailing list.
  4) All licensing issues cleaned up.
 
  According to what you've listed, I will forward your suggestions to the bluesky
  dev list, and I hope we can respond quickly after
  discussion. I appreciate your help.
  regards,
  Kevin
 
--- Noel
 
 
 
  -
  To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
  For additional commands, e-mail: general-h...@incubator.apache.org
 
 
 
 
  --
  Bowen Ma a.k.a Samuel Kevin @ Bluesky Dev TeamXJTU
  Shaanxi Province Key Lab. of Satellite and Terrestrial Network Tech
  http://incubator.apache.org/bluesky/






-- 
Bowen Ma a.k.a Samuel Kevin @ Bluesky Dev TeamXJTU
Shaanxi Province Key Lab. 

Re: [VOTE] Oozie to join the Incubator

2011-06-30 Thread Mayank Bansal
+1 (non-binding)

On Wed, Jun 29, 2011 at 12:10 PM, Mohammad Islam misla...@yahoo.com wrote:

 Hi All,

 The discussion about the Oozie proposal is settling down. Therefore I would
 like to initiate a vote to accept Oozie as an Apache Incubator project.

 The latest proposal is pasted at the end and it can also be found in the wiki:

 http://wiki.apache.org/incubator/OozieProposal


 The related discussion thread is at:
 http://www.mail-archive.com/general@incubator.apache.org/msg29633.html


 Please cast your votes:

 [  ] +1 Accept Oozie for incubation
 [  ] +0 Indifferent to Oozie incubation
 [  ] -1 Reject Oozie for incubation

 This vote will close 72 hours  from now.

 Regards,
 Mohammad


 Abstract
 Oozie is a server-based workflow scheduling and coordination system to manage
 data processing jobs for Apache Hadoop™.

 Proposal
 Oozie is an extensible, scalable and reliable system to define, manage,
 schedule, and execute complex Hadoop workloads via web services. More
 specifically, this includes:

 * XML-based declarative framework to specify a job or a complex workflow of
   dependent jobs.
 * Support for different types of jobs such as Hadoop Map-Reduce, Pipes,
   Streaming, Pig, Hive and custom Java applications.
 * Workflow scheduling based on frequency and/or data availability.
 * Monitoring capability, automatic retry and failure handling of jobs.
 * Extensible and pluggable architecture to allow arbitrary grid programming
   paradigms.
 * Authentication, authorization, and capacity-aware load throttling to allow
   multi-tenant software as a service.

 Background
 Most data processing applications require multiple jobs to achieve their goals,
 with inherent dependencies among the jobs. A dependency could be sequential,
 where one job can only start after another job has finished. Or it could be
 conditional, where the execution of a job depends on the return value or status
 of another job. In other cases, parallel execution of multiple jobs may be
 permitted – or desired – to exploit the massive pool of compute nodes provided
 by Hadoop.

 These job dependencies are often expressed as a Directed Acyclic Graph, also
 called a workflow. A node in the workflow is typically a job (a computation on
 the grid) or another type of action such as an email notification. Computations
 can be expressed in map/reduce, Pig, Hive or any other programming paradigm
 available on the grid. Edges of the graph represent transitions from one node
 to the next, as the execution of a workflow proceeds.
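
 As an illustration only (this is not Oozie code, and not part of the
 proposal), the workflow-as-DAG idea can be sketched in Python with the
 standard library's graphlib module. The node names and the execution_waves
 helper are hypothetical; each wave groups nodes whose dependencies are all
 finished, so they could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a node (a job or an action) and each
# value is the set of nodes that must finish before it may start.
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "report": {"clean"},
    "notify_by_email": {"aggregate", "report"},
}


def execution_waves(dag):
    """Group nodes into waves; nodes in one wave have no pending dependencies."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished, unblocking successors
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(workflow), start=1):
        print(number, wave)
```

 In this sketch "aggregate" and "report" land in the same wave, which
 corresponds to the parallel execution mentioned above, while the email
 notification waits for both.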

 Describing a workflow in a declarative way has the advantage of decoupling job
 dependencies and execution control from application logic. Furthermore, the
 workflow is modularized into jobs that can be reused within the same workflow
 or across different workflows. Execution of the workflow is then driven by a
 runtime system without understanding the application logic of the jobs. This
 runtime system specializes in reliable and predictable execution: It can retry
 actions that have failed or invoke a cleanup action after termination of the
 workflow; it can monitor progress, success, or failure of a workflow, and send
 appropriate alerts to an administrator. The application developer is relieved
 from implementing these generic procedures.

 Furthermore, some applications or workflows need to run in periodic intervals
 or when dependent data is available. For example, a workflow could be executed
 every day as soon as output data from the previous 24 instances of another,
 hourly workflow is available. The workflow coordinator provides such scheduling
 features, along with prioritization, load balancing and throttling to optimize
 utilization of resources in the cluster. This makes it easier to maintain,
 control, and coordinate complex data applications.
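
 The data-availability example above (a daily run that waits for 24 hourly
 outputs) can be sketched as follows. This is a minimal Python illustration
 under assumptions, not Oozie's coordinator API: the function names are
 hypothetical, and each hourly output is identified simply by its timestamp.

```python
from datetime import datetime, timedelta


def required_hourly_outputs(day_end, count=24):
    """Timestamps of the `count` hourly instances immediately before `day_end`."""
    return [day_end - timedelta(hours=h) for h in range(count, 0, -1)]


def daily_run_is_ready(day_end, available):
    """True once every required hourly output is present in `available`."""
    return all(ts in available for ts in required_hourly_outputs(day_end))


if __name__ == "__main__":
    day_end = datetime(2011, 6, 30)
    available = set(required_hourly_outputs(day_end))
    print(daily_run_is_ready(day_end, available))  # all 24 outputs present

    # Simulate one hourly output that has not arrived yet.
    available.discard(datetime(2011, 6, 29, 13))
    print(daily_run_is_ready(day_end, available))
```

 A real coordinator would also need a polling interval and a timeout for
 data that never arrives; the sketch covers only the readiness check.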

 Nearly three years ago, a team of Yahoo! developers addressed these critical
 requirements for Hadoop-based data processing systems by developing a new
 workflow management and scheduling system called Oozie. While it was initially
 developed as a Yahoo!-internal project, it was designed and implemented with
 the intention of open-sourcing. Oozie was released as a GitHub project in early
 2010. Oozie is used in production within Yahoo!, and since it was
 open-sourced it has been gaining adoption with external developers.

 Rationale
 Commonly, applications that run on Hadoop require multiple Hadoop jobs in order
 to obtain the desired results. Furthermore, these Hadoop jobs are commonly a
 combination of Java map-reduce jobs, Streaming map-reduce jobs, Pipes
 map-reduce jobs, Pig jobs, Hive jobs, HDFS operations, Java programs and shell
 scripts.

 Because of this, developers find themselves writing ad-hoc glue programs to
 combine these Hadoop jobs. These ad-hoc programs are difficult to schedule,
 manage, monitor and recover.

 

Re: [PROPOSAL] Deft for incubation

2011-06-30 Thread Niklas Gustavsson
On Wed, Jun 29, 2011 at 4:05 PM, Mohammad Nour El-Din
nour.moham...@gmail.com wrote:
   You can sign me in.

You've been added to the wiki page.

One or two more mentors would be outstanding.

/niklas




Re: KEYS and releases

2011-06-30 Thread Robert Burrell Donkin
On Tue, Jun 28, 2011 at 10:20 AM, Christian Grobmeier
grobme...@gmail.com wrote:
 we copy a KEYS file into that directory upon successful VOTE of the release
 artifacts (which also include the KEYS file).

 Perhaps, but the point we're getting at was explicitly stated by Benson,
 The goal here is to allow and encourage consumers to independently verify
 signatures.  That calls for KEYS somewhere else than inside the package.

 I am sorry to ask it again, but why can't the incubator have a policy
 to make people use:
 https://id.apache.org/
 to store their signing key.

 Then we have them listed for each project there:
 https://people.apache.org/keys/

 Was it not meant that way?

AIUI this infrastructure is relatively new and intended to add defense-in-depth.

IMHO the IPMC should only document (any volunteers?) a strong
recommendation, but leave policy in this area to the experts over in
infrastructure.

Robert




Re: [VOTE] Oozie to join the Incubator

2011-06-30 Thread Chris Douglas
+1 (binding) -C


Re: [PROPOSAL] Deft for incubation

2011-06-30 Thread Mohammad Nour El-Din
Thanks Niklas

On Thu, Jun 30, 2011 at 9:24 AM, Niklas Gustavsson nik...@protocol7.com wrote:
 On Wed, Jun 29, 2011 at 4:05 PM, Mohammad Nour El-Din
 nour.moham...@gmail.com wrote:
   You can sign me in.

 You've been added to the wiki page.

 One or two more mentors would be outstanding.

 /niklas






-- 
Thanks
- Mohammad Nour
  Author of (WebSphere Application Server Community Edition 2.0 User Guide)
  http://www.redbooks.ibm.com/abstracts/sg247585.html
- LinkedIn: http://www.linkedin.com/in/mnour
- Blog: http://tadabborat.blogspot.com

Life is like riding a bicycle. To keep your balance you must keep moving
- Albert Einstein

Writing clean code is what you must do in order to call yourself a
professional. There is no reasonable excuse for doing anything less
than your best.
- Clean Code: A Handbook of Agile Software Craftsmanship

Stay hungry, stay foolish.
- Steve Jobs




Re: [VOTE] Oozie to join the Incubator

2011-06-30 Thread Mohammad Nour El-Din
+1 (Binding)


Re: Bluesky calls for a new mentor!

2011-06-30 Thread Christian Grobmeier
 I believe projects from  school could also rock in Apache as well. You can
 look down up on us but you can't deny others.

This is not the point.

The point is, if you have contributors who have no Apache ID, they

a) need to sign an ICLA
b) need to create a Jira issue and attach an svn diff there, ticking
the "allowed to use for the ASF" box
c) need to ask development questions on the mailing list, not by ICQ,
MSN or whatever

You can actually work with students, no problem. But it should happen
visibly. Even when you are in the same room, it should be visible to
all other parties around the world. Otherwise you will never get a
development community.


     I had a propose that community give us the last chance for 1-2month and
 certain member could become our mentor to lead us finish  releasing the
 newest version.   During this time slot. What you would see includes:

I am not sure if the term "mentor" is used well here. A mentor is
not here to help you with development questions. A mentor's role is to
oversee how the project progresses and guide people to work in the
Apache way. After 3 years you should already know about the Apache way,
and a mentor should be obsolete (in a teaching role, at least). A mentor is
for sure NO project lead. He can point you to the relevant docs on
how to release code, for example.

   1. gradually increasing discussion in bluesky-dev mailing list.
   Meaningless discussion would not count.

ALL discussion must happen on list, from now on. If it didn't happen
on list, it didn't happen, as a wise man once said.

   2. committing of source code after they were cleaned up.
   Inactive committers would be revoked and new committers would apply to join
   in.

Now or never. It is commit-then-review.
Potential new committers must show their interest on the mailing list;
otherwise your mentors cannot decide whether they should support an
invitation or not. As you know, new committers must be voted in. The
discussion should also happen beforehand, on list.

   3. preparing for what release needs and make the release successful. Thus
   the new developers and committers could completely experienced the release
   process and know about How things are done in Apache community better.

You should start working in the Apache way even before the release. If
it didn't work well before, it will not work well while releasing.

     If community accept my suggestion, individually, i want the BlueSky
 project under strict surveillance by community members. If we can't fulfill
 what we just promised, then just kick us out of here and i would have noting
 to say.

I (personally) have no problem with waiting another 2 or 3
months. I cannot imagine anyone would like to step up as a mentor at
the moment. My suggestion: try to work out the Apache way now. Use
Jira and the mailing list. Students contribute patches through Jira.
Committers apply them. And so on. If all that happens, your Jira will be
full of contributions and your mailing list full of discussions. If
that is the case, come back to this list and ask for a mentor again;
probably somebody will be willing to step up again.
If you have more questions on how Apache works, I am pretty sure
you'll get an answer on this list.

Cheers,
Christian



Re: Bluesky calls for a new mentor!

2011-06-30 Thread Upayavira
Personally, I see a *HEAP* of stuff Bluesky would need to handle before
doing an ASF release. 

I would get that right out of your head from the start. Firstly, you
would have to have demonstrated that all the code is covered by software
grants or ICLAs that are held by the Apache Software Foundation.
Secondly, you would have to go through the entire codebase, and remove
all code that cannot be included in a work covered by the Apache
License. This would mean excluding any LGPL/GPL code, and possibly more. 

Secondly, you should be committing code *before* you clean it up. The
clean-up should happen in public, on ASF lists. Otherwise it smacks of
'over the wall' development, meaning other developers not in your
immediate team would have no capacity to engage in the development, as
all they can see at Apache is a sequence of code drops, of code that was
actually developed elsewhere.

Here are the steps I would see the project needing to complete, and
probably within a month, to survive:

(a) Get all code onto Apache SVN, immediately (it is okay to include
    LGPL code in SVN, it just can't be released)
(b) Every change to the code needs to be a real change, not a code drop
(c) Patches made by students who are not committers should be uploaded
    to JIRA, with correct provenance (ICLA signed) before they are
    committed
(d) All development happens on the ASF list
(e) Any idea of doing a release at Apache within six months must be
    dropped

Upayavira


Re: Bluesky calls for a new mentor!

2011-06-30 Thread Bernd Fondermann
On Thu, Jun 30, 2011 at 10:37, Christian Grobmeier grobme...@gmail.com wrote:
 I believe projects from  school could also rock in Apache as well. You can
 look down up on us but you can't deny others.

 This is not the point.

 The point is, if you have contributors who have no apache id, they

 a) need to sign an ICLA
 b) need to create a Jira issue and attach an svn diff there, ticking
 the allowed to use for the ASF box
 c) need to ask development questions on the mailing list, not by ICQ,
 MSN or whatever

 You can actually work with students, no problem. But it should happen
 visibly. Even when you are in the same room, it should be visible to
 all other parties around the world. Otherwise you will never get a
 development community.


     I had a propose that community give us the last chance for 1-2month and
 certain member could become our mentor to lead us finish  releasing the
 newest version.   During this time slot. What you would see includes:

 I am not sure if the term mentor is used well here. A mentor is
 not here to help you with development questions. A mentor's role is to
 oversee how the project progresses and guide people to work in the
 Apache way. After 3 years you should already know about the Apache way
 and a mentor should be obsolete (in a teaching role, at least). A mentor
 is for sure NO project lead. He can point you to the relevant docs on
 how to release code, for example.

   1. gradually increasing discussion in bluesky-dev mailing list.
   Meaningless discussion would not count.

 ALL discussion must happen on list, from now on. If it didn't happen
 on list, it didn't happen, as a wise man once said.

   2. committing of source code after they were cleaned up.
   Inactive committers would be revoked and new committers would apply to join
   in.

 Now or never. It is commit then review.
 Potential new committers must show their interest on the mailing list
 - otherwise your mentors cannot decide if they should support an
 invitation or not. As you know, new committers must be voted in. The
 discussion should also happen beforehand, on list.

   3. preparing for what release needs and make the release successful. Thus
   the new developers and committers could completely experienced the release
   process and know about How things are done in Apache community better.

 You should start working on the apache way even before the release. If
 it didn't work well before, it will not work well while releasing.

     If community accept my suggestion, individually, i want the BlueSky
 project under strict surveillance by community members. If we can't fulfill
 what we just promised, then just kick us out of here and i would have noting
 to say.

 I (personally) have no problem with waiting another 2 or 3
 months. I cannot imagine anyone would like to step up as a mentor at
 the moment. My suggestion: try to work out the Apache way now. Use
 Jira and the mailing list. Students contribute patches through Jira.
 Committers apply them. And so on. If that all happens, your Jira is
 full of contributions and your mailing list full of discussions. If
 that is the case, come back to this list and ask for a mentor again -
 probably somebody is willing to step up again.

We had this discussion multiple times over the last year.
I firmly think, without immediate new mentors this project should not continue.

  Bernd

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [VOTE] Oozie to join the Incubator

2011-06-30 Thread Bertrand Delacretaz
On Wed, Jun 29, 2011 at 9:10 PM, Mohammad Islam misla...@yahoo.com wrote:
...
 [X  ] +1 Accept Oozie for incubation
...

-Bertrand




Re: KEYS and releases

2011-06-30 Thread Daniel Shahaf
Robert Burrell Donkin wrote on Thu, Jun 30, 2011 at 08:31:38 +0100:
 On Tue, Jun 28, 2011 at 10:20 AM, Christian Grobmeier
 grobme...@gmail.com wrote:
  we copy a KEYS file into that directory upon successful VOTE of the release
  artifacts (which also include the KEYS file).
 
  Perhaps, but the point we're getting at was explicitly stated by Benson,
  The goal here is to allow and encourage consumers to independently verify
  signatures.  That calls for KEYS somewhere else than inside the package.
 
  I am sorry to ask it again, but why can't the incubator have a policy
  to make people use:
  https://id.apache.org/
  to store their signing key.
 
  Then we have them listed for each projects there:
  https://people.apache.org/keys/
 
  Was it not meant that way?
 
 AIUI this infrastructure is relatively new and intended to add defense-in-depth
 

Yes, it's new, and yes, it isn't meant to replace PGP trust paths.

What it does behind the scenes is 'gpg --recv-key keyid  committer.asc'
and publish the result over https, where the key id (or fingerprint) is
provided by the committer (authenticating with their svn password).
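For readers who want to see what independent verification by a consumer could look like, here is a minimal sketch in Python. It only builds the two gpg invocations (import a separately obtained KEYS file, then verify a detached signature) and prints them; it does not run them. The file names are hypothetical, and the convention that the signature is the artifact name plus ".asc" is an assumption, not something stated in this thread.

```python
def verification_commands(keys_file, artifact):
    """Return the gpg argument lists to import KEYS and verify a detached signature."""
    # Assumption: the detached signature sits next to the artifact as <artifact>.asc
    signature = artifact + ".asc"
    return [
        ["gpg", "--import", keys_file],
        ["gpg", "--verify", signature, artifact],
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for command in verification_commands("KEYS", "example-release.tar.gz"):
        print(" ".join(command))
```

The point of fetching KEYS from somewhere other than the package is that a tampered package cannot then vouch for itself.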

 IMHO the IPMC should only document (any volunteers?) a strong
 recommendation but leave policy in this area to the experts over in
 infrastructure
 
 Robert
 
 




RE: Bluesky calls for a new mentor!

2011-06-30 Thread Noel J. Bergman
Samuel Kevin wrote:

 Most of the developers of BlueSky project are students. As you all know,
 students come  when they join in school and go after they graduate.  So
 the active developers are around 10. Like we used to have 5 committers,
 but now we only have 2 committers in active.

As others have pointed out, and I believe you acknowledge (q.v., I am not
avoiding the truth that we suck during the last three years), there are
better and necessary ways to address this issue.  And we've worked with
Google every year during the Summer of Code, so we're not exactly
inexperienced working with students.

 According to what you've listed, i would forward your suggestion to the
 bluesky dev list and wish we could make a quick response after discussion.

Incorporate all of the feedback you're getting from folks.  It is urgent
that you take the advice, get all of the current code into source control
ASAP, get students onto the mailing list now, start doing discussion and
coding in public, and submit changes on a regular basis via SVN and/or JIRA.
These are the same things you've also read from Christian and Upayavira.
You don't need a new Mentor to do those things.  Demonstrate change and
we'll try to help you succeed.

--- Noel






Re: [VOTE] Oozie to join the Incubator

2011-06-30 Thread Alan D. Cabrera
+1 binding


Regards,
Alan

On Jun 29, 2011, at 12:10 PM, Mohammad Islam wrote:

 Hi All,
 
 The discussion about the Oozie proposal is settling down. Therefore I would
 like to initiate a vote to accept Oozie as an Apache Incubator project.
 
 The latest proposal is pasted at the end and can also be found in the wiki:
 
 http://wiki.apache.org/incubator/OozieProposal
 
 
 The related discussion thread is at:
 http://www.mail-archive.com/general@incubator.apache.org/msg29633.html
 
 
 Please cast your votes:
 
 [  ] +1 Accept Oozie for incubation
 [  ] +0 Indifferent to Oozie incubation
 [  ] -1 Reject Oozie for incubation
 
 This vote will close 72 hours  from now.
 
 Regards,
 Mohammad
 
 
 Abstract
 Oozie is a server-based workflow scheduling and coordination system to manage
 data processing jobs for Apache Hadoop(TM).
 
 Proposal
 Oozie is an extensible, scalable and reliable system to define, manage,
 schedule, and execute complex Hadoop workloads via web services. More
 specifically, this includes:
 
   * XML-based declarative framework to specify a job or a complex workflow
     of dependent jobs.
   * Support for different types of jobs such as Hadoop Map-Reduce, Pipe,
     Streaming, Pig, Hive and custom Java applications.
   * Workflow scheduling based on frequency and/or data availability.
   * Monitoring capability, automatic retry and failure handling of jobs.
   * Extensible and pluggable architecture to allow arbitrary grid
     programming paradigms.
   * Authentication, authorization, and capacity-aware load throttling to
     allow multi-tenant software as a service.
 
 Background
 Most data processing applications require multiple jobs to achieve their
 goals, with inherent dependencies among the jobs. A dependency could be
 sequential, where one job can only start after another job has finished. Or
 it could be conditional, where the execution of a job depends on the return
 value or status of another job. In other cases, parallel execution of
 multiple jobs may be permitted – or desired – to exploit the massive pool of
 compute nodes provided by Hadoop.
 
 These job dependencies are often expressed as a Directed Acyclic Graph, also
 called a workflow. A node in the workflow is typically a job (a computation
 on the grid) or another type of action such as an eMail notification.
 Computations can be expressed in map/reduce, Pig, Hive or any other
 programming paradigm available on the grid. Edges of the graph represent
 transitions from one node to the next, as the execution of a workflow
 proceeds.
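To make the graph idea above concrete, here is a small sketch in Python. It is not Oozie code and does not use Oozie's XML format; the action names are hypothetical. It uses the standard library's graphlib module (Python 3.9+) to group the actions of a workflow into "waves", where every action in a wave has all of its dependencies satisfied and could therefore run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is an action, each value is the set of
# actions that must finish before it may start.
WORKFLOW = {
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "index": {"clean"},
    "notify": {"aggregate", "index"},
}


def execution_waves(workflow):
    """Group actions into waves; actions within one wave do not depend on each other."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every action whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so successors become ready
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(WORKFLOW), start=1):
        print(number, wave)
```

Here "aggregate" and "index" land in the same wave because both depend only on "clean", which is the parallel case the background section describes.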
 
 Describing a workflow in a declarative way has the advantage of decoupling
 job dependencies and execution control from application logic. Furthermore,
 the workflow is modularized into jobs that can be reused within the same
 workflow or across different workflows. Execution of the workflow is then
 driven by a runtime system without understanding the application logic of
 the jobs. This runtime system specializes in reliable and predictable
 execution: It can retry actions that have failed or invoke a cleanup action
 after termination of the workflow; it can monitor progress, success, or
 failure of a workflow, and send appropriate alerts to an administrator. The
 application developer is relieved from implementing these generic
 procedures.
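The retry-and-cleanup behaviour described above can be sketched generically in Python. This is an illustration of the idea only, not how Oozie implements it: the function names, the default of three attempts, and the "flaky" test action are all assumptions made for the example.

```python
def run_with_retry(action, max_attempts=3, cleanup=None):
    """Call action() until it succeeds or max_attempts is used up.

    Returns (succeeded, attempts). If cleanup is given, it runs exactly once
    after the last attempt, whether or not the action succeeded.
    """
    attempts = 0
    succeeded = False
    try:
        while attempts < max_attempts and not succeeded:
            attempts += 1
            try:
                action()
                succeeded = True
            except Exception as error:
                print(f"attempt {attempts} failed: {error}")
    finally:
        if cleanup is not None:
            cleanup()
    return succeeded, attempts


def make_flaky(failures):
    """Build a hypothetical action that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def action():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("transient failure")

    return action


if __name__ == "__main__":
    print(run_with_retry(make_flaky(2), cleanup=lambda: print("cleanup ran")))
```

Keeping this loop in the runtime rather than in each job is what relieves the application developer of writing it.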
 
 Furthermore, some applications or workflows need to run in periodic
 intervals or when dependent data is available. For example, a workflow could
 be executed every day as soon as output data from the previous 24 instances
 of another, hourly workflow is available. The workflow coordinator provides
 such scheduling features, along with prioritization, load balancing and
 throttling to optimize utilization of resources in the cluster. This makes
 it easier to maintain, control, and coordinate complex data applications.
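The "daily run once the previous 24 hourly outputs exist" example can be sketched as a simple readiness check in Python. This is only an illustration of the data-availability condition, not the coordinator's actual interface; the function names and the use of a set of timestamps to stand for available outputs are assumptions.

```python
from datetime import datetime, timedelta


def required_hourly_instances(day_start, count=24):
    """Return the `count` hourly instance times that precede day_start, oldest first."""
    return [day_start - timedelta(hours=offset) for offset in range(count, 0, -1)]


def daily_run_is_ready(day_start, available):
    """True once every required hourly output is present in the `available` set."""
    return all(hour in available for hour in required_hourly_instances(day_start))


if __name__ == "__main__":
    day = datetime(2011, 6, 30)
    hours = required_hourly_instances(day)
    print(daily_run_is_ready(day, set(hours)))       # all 24 outputs present
    print(daily_run_is_ready(day, set(hours[:-1])))  # the last hour is still missing
```

A scheduler built this way would re-evaluate the check as new hourly outputs arrive and launch the daily workflow the first time it passes.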
 
 Nearly three years ago, a team of Yahoo! developers addressed these critical
 requirements for Hadoop-based data processing systems by developing a new
 workflow management and scheduling system called Oozie. While it was
 initially developed as a Yahoo!-internal project, it was designed and
 implemented with the intention of open-sourcing. Oozie was released as a
 GitHub project in early 2010. Oozie is used in production within Yahoo and
 since it has been open-sourced it has been gaining adoption with external
 developers.
 
 Rationale
 Commonly, applications that run on Hadoop require multiple Hadoop jobs in
 order to obtain the desired results. Furthermore, these Hadoop jobs are
 commonly a combination of Java map-reduce jobs, Streaming map-reduce jobs,
 Pipes map-reduce jobs, Pig jobs, Hive jobs, HDFS operations, Java programs
 and shell scripts.
 
 Because of this, developers find themselves writing ad-hoc glue programs to
 combine these Hadoop jobs. 

Re: [VOTE] Retire ALOIS podling

2011-06-30 Thread Christian Grobmeier
I would like to close this vote with +1 from:

Bertrand Delacretaz
Alan Cabrera
Henri Yandell
Mohammad Nour El-Din
Noel Bergman
Christian Grobmeier

I will try to do the necessary retirement steps as soon as I can.

Thanks for your time!
Christian

On Tue, Jun 21, 2011 at 5:52 PM, Christian Grobmeier
grobme...@gmail.com wrote:
 Hello,

 as already mentioned last week, the ALOIS project is dead and it seems
 there is no way to recover in the near future (or even later). The
 developers told me in a private message in March that they cannot
 continue due to personal reasons. It seems this has come true.

 I have set up a vote on the dev mailing list:
  * http://s.apache.org/eBx
 (Note: one of the voters responded on the private list - I counted the vote)

 So far, no releases have been made.

 This vote passed a few hours ago after being open for 5 days.

 Please vote for retirement of the ALOIS podling. If this vote passes,
 I will proceed to the discussions on retirement and finally retire it.

 Thanks,
 Christian

 [] +1 - please retire
 [] +/-0
 [] -1 - please don't retire, because...




-- 
http://www.grobmeier.de




Re: [VOTE] Oozie to join the Incubator

2011-06-30 Thread Ahmed Radwan
+1 (non-binding)
Good luck

On Wed, Jun 29, 2011 at 12:10 PM, Mohammad Islam misla...@yahoo.com wrote:

 Hi All,

 The discussion about the Oozie proposal is settling down. Therefore I would
 like to initiate a vote to accept Oozie as an Apache Incubator project.

 The latest proposal is pasted at the end and can also be found in the wiki:

 http://wiki.apache.org/incubator/OozieProposal


 The related discussion thread is at:
 http://www.mail-archive.com/general@incubator.apache.org/msg29633.html


 Please cast your votes:

 [  ] +1 Accept Oozie for incubation
 [  ] +0 Indifferent to Oozie incubation
 [  ] -1 Reject Oozie for incubation

 This vote will close 72 hours  from now.

 Regards,
 Mohammad


 Abstract
 Oozie is a server-based workflow scheduling and coordination system to manage
 data processing jobs for Apache Hadoop(TM).

 Proposal
 Oozie is an extensible, scalable and reliable system to define, manage,
 schedule, and execute complex Hadoop workloads via web services. More
 specifically, this includes:

   * XML-based declarative framework to specify a job or a complex workflow
     of dependent jobs.
   * Support for different types of jobs such as Hadoop Map-Reduce, Pipe,
     Streaming, Pig, Hive and custom Java applications.
   * Workflow scheduling based on frequency and/or data availability.
   * Monitoring capability, automatic retry and failure handling of jobs.
   * Extensible and pluggable architecture to allow arbitrary grid
     programming paradigms.
   * Authentication, authorization, and capacity-aware load throttling to
     allow multi-tenant software as a service.

 Background
 Most data processing applications require multiple jobs to achieve their
 goals, with inherent dependencies among the jobs. A dependency could be
 sequential, where one job can only start after another job has finished. Or
 it could be conditional, where the execution of a job depends on the return
 value or status of another job. In other cases, parallel execution of
 multiple jobs may be permitted – or desired – to exploit the massive pool of
 compute nodes provided by Hadoop.

 These job dependencies are often expressed as a Directed Acyclic Graph, also
 called a workflow. A node in the workflow is typically a job (a computation
 on the grid) or another type of action such as an eMail notification.
 Computations can be expressed in map/reduce, Pig, Hive or any other
 programming paradigm available on the grid. Edges of the graph represent
 transitions from one node to the next, as the execution of a workflow
 proceeds.

 Describing a workflow in a declarative way has the advantage of decoupling
 job dependencies and execution control from application logic. Furthermore,
 the workflow is modularized into jobs that can be reused within the same
 workflow or across different workflows. Execution of the workflow is then
 driven by a runtime system without understanding the application logic of
 the jobs. This runtime system specializes in reliable and predictable
 execution: It can retry actions that have failed or invoke a cleanup action
 after termination of the workflow; it can monitor progress, success, or
 failure of a workflow, and send appropriate alerts to an administrator. The
 application developer is relieved from implementing these generic
 procedures.

 Furthermore, some applications or workflows need to run in periodic
 intervals or when dependent data is available. For example, a workflow could
 be executed every day as soon as output data from the previous 24 instances
 of another, hourly workflow is available. The workflow coordinator provides
 such scheduling features, along with prioritization, load balancing and
 throttling to optimize utilization of resources in the cluster. This makes
 it easier to maintain, control, and coordinate complex data applications.

 Nearly three years ago, a team of Yahoo! developers addressed these critical
 requirements for Hadoop-based data processing systems by developing a new
 workflow management and scheduling system called Oozie. While it was
 initially developed as a Yahoo!-internal project, it was designed and
 implemented with the intention of open-sourcing. Oozie was released as a
 GitHub project in early 2010. Oozie is used in production within Yahoo and
 since it has been open-sourced it has been gaining adoption with external
 developers.

 Rationale
 Commonly, applications that run on Hadoop require multiple Hadoop jobs in
 order to obtain the desired results. Furthermore, these Hadoop jobs are
 commonly a combination of Java map-reduce jobs, Streaming map-reduce jobs,
 Pipes map-reduce jobs, Pig jobs, Hive jobs, HDFS operations, Java programs
 and shell scripts.

 Because of this, developers find themselves writing ad-hoc glue programs to
 combine these Hadoop jobs. These ad-hoc programs are difficult to schedule,
 manage, monitor and recover.

Re: [VOTE] Retire ALOIS podling

2011-06-30 Thread Henri Yandell
Hi Christian,

To confirm, will you be doing the following steps:

* Investigate whether the source was covered by CLAs.
* If so then update the web page before moving to retired.
* Otherwise delete the source from svn.

?

Hen

On Thu, Jun 30, 2011 at 10:17 AM, Christian Grobmeier
grobme...@gmail.com wrote:
 I would like to close this vote with +1 from:

 Bertrand Delacretaz
 Alan Cabrera
 Henri Yandell
 Mohammad Nour El-Din
 Noel Bergman
 Christian Grobmeier

 I will try to do the necessary retirement steps as soon as I can.

 Thanks for your time!
 Christian

 On Tue, Jun 21, 2011 at 5:52 PM, Christian Grobmeier
 grobme...@gmail.com wrote:
 Hello,

 as already mentioned last week, the ALOIS project is dead and it seems
 there is no way to recover in the near future (or even later). The
 developers told me in a private message in March that they cannot
 continue due to personal reasons. It seems this has come true.

 I have set up a vote on the dev mailing list:
  * http://s.apache.org/eBx
 (Note: one of the voters responded on the private list - I counted the vote)

 So far, no releases have been made.

 This vote passed a few hours ago after being open for 5 days.

 Please vote for retirement of the ALOIS podling. If this vote passes,
 I will proceed to the discussions on retirement and finally retire it.

 Thanks,
 Christian

 [] +1 - please retire
 [] +/-0
 [] -1 - please don't retire, because...




 --
 http://www.grobmeier.de







Re: [PROPOSAL] Oozie for the Apache Incubator

2011-06-30 Thread brookwc

Strong +1 (non-binding). 

Thanks, 

Chao

