Re: [openstack-dev] [sahara] Nominate Telles Mota Vidal Nóbrega for core team

2016-08-11 Thread Trevor McKay
+2

On Fri, 2016-08-12 at 00:56 +0800, lu jander wrote:
> +2 from me thx Telles 
> 
> 2016-08-12 0:20 GMT+08:00 Sergey Reshetnyak:
> +2 from me
> 
> 2016-08-11 19:15 GMT+03:00 Sergey Lukjanov:
> +2
> 
> On Thu, Aug 11, 2016 at 8:48 AM, Elise Gafford wrote:
> Hearty +2. Telles has been working on Sahara
> for years, and has been a consistent and
> incisive reviewer and code contributor.
> 
> 
> Congratulations Telles; very well deserved!
> 
> 
> - Elise
> 
> 
> On Thu, Aug 11, 2016 at 11:37 AM, Vitaly
> Gridnev wrote:
> 
> Hello core team,
> 
> 
> I'd like to nominate Telles Mota Vidal
> Nóbrega for core reviewer team. Let's
> vote with +2/-2 for his candidacy.
> Review/Commits stats can be found at
> [0].
> 
> 
> [0] 
> http://stackalytics.com/?module=sahara-group&user_id=tellesmvn
> 
> -- 
> Best Regards,
> 
> Vitaly Gridnev,
> Project Technical Lead of OpenStack
> DataProcessing Program (Sahara)
> Mirantis, Inc
> 
> 
> 
> __
> OpenStack Development Mailing List
> (not for usage questions)
> Unsubscribe:
> 
> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> 
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> Sincerely yours,
> Sergey Lukjanov
> Sr. Development Manager
> Mirantis Inc.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 





Re: [openstack-dev] Sahara Job Binaries Storage

2016-05-27 Thread Trevor McKay
Hi Jerico,

  we talked about it at Summit in one of the design sessions, but afaik
there is no blueprint or spec yet. I don't see why it can't happen in
Newton, however.

Best,

Trevor

On Thu, 2016-05-26 at 16:14 +1000, Jerico Revote wrote:
> Hi Trevor,
> 
> Just revisiting this,
> has there been any progress on deprecating the sahara jobs -> internal db mechanism,
> and/or on the config option to disable internal db storage?
>  
> Regards,
> 
> Jerico
> 
> 
> 
> > On 18 Mar 2016, at 12:55 AM, Trevor McKay <tmc...@redhat.com> wrote:
> > 
> > Hi Jerico,
> > 
> >  Internal db storage for job binaries was added at
> > the start of EDP as an alternative for sites that do
> > not have swift running. Since then, we've also added
> > integration with manila so that job binaries can be
> > stored in manila shares.
> > 
> >  You are correct, storing lots of binaries in the
> > sahara db could make the database grow very large.
> > Swift or manila should be used for production, internal
> > storage is a good option for development/test.
> > 
> >  There is currently no way to disable internal storage.
> > We can take a look at adding such an option -- in fact
> > we have talked informally about the possibility of
> > deprecating internal db storage since swift and manila
> > are both mature at this point. We should discuss that
> > at the upcoming summit.
> > 
> > Best,
> > 
> > Trevor
> > 
> > On Thu, 2016-03-17 at 10:27 +1100, Jerico Revote wrote:
> >> Hello,
> >> 
> >> 
> >> When deploying Sahara, the Sahara docs suggest
> >> increasing max_allowed_packet to 256MB,
> >> for internal database storing of job binaries.
> >> There could be hundreds of job binaries to be uploaded/created into
> >> Sahara,
> >> which would then cause the database to grow as well.
> >> Has anyone using Sahara encountered database sizing issues using
> >> internal db storage?
> >> 
> >> 
> >> It looks like swift is the more logical place for storing job
> >> binaries 
> >> (in our case we have a global swift cluster), and this is also
> >> available to the user.
> >> Is there a way to only enable the swift way for storing job binaries?
> >> 
> >> Thanks,
> >> 
> >> 
> >> Jerico
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 
> > 
> > 
> > 
> 
> 





Re: [openstack-dev] [sahara] Nominating new members to Sahara Core

2016-05-16 Thread Trevor McKay
+2 for all

On Fri, 2016-05-13 at 18:33 +0300, Vitaly Gridnev wrote:
> Hello Sahara core folks!
> 
> 
> I'd like to bring the following folks to Sahara Core:
> 
> 
> 1. Lu Huichun
> 2. Nikita Konovalov
> 3. Chad Roberts
> 
> 
> Let's vote with +2/-2 for additions above.
> 
> 
> [0] http://stackalytics.com/?module=sahara-group
> [1] http://stackalytics.com/?module=sahara-group&release=mitaka
> 
> 
> -- 
> Best Regards,
> 
> Vitaly Gridnev,
> Project Technical Lead of OpenStack DataProcessing Program (Sahara)
> Mirantis, Inc





Re: [openstack-dev] [Sahara][QA] Notes about the move of the Sahara Tempest API test to sahara-tests

2016-03-22 Thread Trevor McKay
Thanks Luigi,

  sounds good to me. I'll be happy to help with reviews/approvals as
needed.

Trevor

On Mon, 2016-03-21 at 12:05 +0100, Luigi Toscano wrote:
> On Monday 21 of March 2016 10:50:30 Evgeny Sikachev wrote:
> > Hi, Luigi!
> > 
> > Thanks for this short spec :)
> > The changes look good to me, and I think the Sahara team can help with pushing
> > these changes to the sahara-tests repo.
> > 
> > But I have a question: why do we need to use a detached branch for it? 
> A branch with no parent, as its history is totally detached from the 
> existing 
> code. The merge makes the two branches converge. You can see it in the graph 
> of the history on my work repository:
> https://github.com/ltoscano-rh/sahara-tests/commits/master
> > And then
> > we have the next problem "Temporarily exclude a specific branch from the
> > CI". Maybe a force push would be easier and faster?
> 
> In my understanding, force push was totally ruled out by Infra.
> 
> Ciao
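For anyone unfamiliar with the workflow Luigi describes, a parentless ("detached"/orphan) branch plus a converging merge can be sketched with plain git. Repo and branch names below are made up, and the commands assume a reasonably recent git (`-b` on init needs 2.28+, `--allow-unrelated-histories` needs 2.9+):

```shell
set -e
rm -rf demo && git init -q -b main demo
cfg="-c user.name=dev -c user.email=dev@example.com"
git -C demo $cfg commit -q --allow-empty -m "root of existing history"
# A branch with no parent: its history is totally detached from main.
git -C demo checkout -q --orphan imported
git -C demo $cfg commit -q --allow-empty -m "root of detached history"
# Merging makes the two histories converge into one graph.
git -C demo checkout -q main
git -C demo $cfg merge --allow-unrelated-histories -m "converge histories" imported
git -C demo log --oneline --graph
```

The final log shows two roots joined by a merge commit, which is exactly the graph shape visible in the linked sahara-tests history.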





Re: [openstack-dev] Sahara Job Binaries Storage

2016-03-19 Thread Trevor McKay
Hi Jerico,

  Internal db storage for job binaries was added at
the start of EDP as an alternative for sites that do
not have swift running. Since then, we've also added
integration with manila so that job binaries can be
stored in manila shares.

  You are correct, storing lots of binaries in the
sahara db could make the database grow very large.
Swift or manila should be used for production, internal
storage is a good option for development/test.
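The three storage options correspond to distinct job-binary URL schemes. The sketch below is illustrative only: the helper function and backend descriptions are hypothetical, not Sahara code, though the swift://, manila://, and internal-db:// schemes are the ones under discussion:

```python
from urllib.parse import urlparse

# Hypothetical mapping of job-binary URL schemes to the storage backends
# discussed above; descriptions reflect the production vs dev/test advice.
BACKENDS = {
    "swift": "object storage (production)",
    "manila": "shared filesystem (production)",
    "internal-db": "sahara database (dev/test only)",
}

def storage_backend(url):
    """Return a human-readable backend name for a job-binary URL."""
    return BACKENDS.get(urlparse(url).scheme, "unknown")

print(storage_backend("swift://edp-container/wordcount.jar"))
print(storage_backend("internal-db://2c7e1a4c"))
```

A config option to disable internal storage would then amount to rejecting the internal-db scheme at validation time.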

  There is currently no way to disable internal storage.
We can take a look at adding such an option -- in fact
we have talked informally about the possibility of
deprecating internal db storage since swift and manila
are both mature at this point. We should discuss that
at the upcoming summit.

Best,

Trevor

On Thu, 2016-03-17 at 10:27 +1100, Jerico Revote wrote:
> Hello,
> 
> 
> When deploying Sahara, the Sahara docs suggest
> increasing max_allowed_packet to 256MB,
> for internal database storing of job binaries.
> There could be hundreds of job binaries to be uploaded/created into
> Sahara,
> which would then cause the database to grow as well.
> Has anyone using Sahara encountered database sizing issues using
> internal db storage?
> 
> 
> It looks like swift is the more logical place for storing job
> binaries 
> (in our case we have a global swift cluster), and this is also
> available to the user.
> Is there a way to only enable the swift way for storing job binaries?
> 
> Thanks,
> 
> 
> Jerico
> 
> 
> 
> 
> 
> 
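For reference, the max_allowed_packet setting quoted above is a MySQL/MariaDB server option, typically raised in my.cnf; the 256M value comes from the docs cited in the thread, so treat this as a sketch rather than a tuning recommendation:

```ini
[mysqld]
# Allow job binaries up to 256 MB to be written in a single packet.
max_allowed_packet = 256M
```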






Re: [openstack-dev] [sahara]FFE Request for resume EDP job

2016-03-07 Thread Trevor McKay
My 2 cents, I agree that it is low risk -- the impl for resume is
analogous/parallel to the impl for suspend. And, it makes little 
sense to me to include suspend without resume.

In my mind, these two operations are halves of the same feature,
and since it is already partially implemented and approved, I think
the FFE should be granted.
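The "two halves of one feature" point can be pictured as a pair of inverse state transitions; the snippet below is only an illustration (state and operation names are made up, not Sahara's actual EDP job statuses):

```python
# Suspend and resume as inverse transitions between the same two states.
# Names here are illustrative, not Sahara's real EDP status values.
TRANSITIONS = {
    "suspend": ("RUNNING", "SUSPENDED"),
    "resume": ("SUSPENDED", "RUNNING"),
}

def transition(state, op):
    src, dst = TRANSITIONS[op]
    if state != src:
        raise ValueError(f"cannot {op} a job in state {state}")
    return dst

# Shipping suspend without resume would strand jobs in SUSPENDED:
state = transition("RUNNING", "suspend")
state = transition(state, "resume")
assert state == "RUNNING"
```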

Best,

Trev

On Mon, 2016-03-07 at 09:07 -0500, Trevor McKay wrote:
> For some reason the link below is wrong for me, it goes to a different
> review. Here is a good one (I hope!):
> 
> https://review.openstack.org/#/c/285839/
> 
> Trev
> 
> On Mon, 2016-03-07 at 14:28 +0800, lu jander wrote:
> > Hi folks,
> > 
> > I would like to request a FFE for the feature “Resume EDP job”: 
> > 
> >  
> > 
> > BP:
> > https://blueprints.launchpad.net/sahara/+spec/add-suspend-resume-ability-for-edp-jobs
> > 
> > 
> > Spec has been merged. https://review.openstack.org/#/c/198264/  
> > 
> > 
> > Suspend EDP patch has been merged.
> >  https://review.openstack.org/#/c/201448/ 
> > 
> > 
> > Code Review: https://review.openstack.org/#/c/285839/
> > 
> >  
> > 
> > code is ready for review. 
> > 
> >  
> > 
> > The Benefits for this change: after suspend job, we can resume this
> > job.
> > 
> >  
> > 
> > The Risk: The risk would be low for this patch, since the code of the
> > suspend patch has been under review for a long time.
> > 
> >  
> > 
> > Thanks,
> > 
> > luhuichun
> > 
> > 
> > 
> 
> 
> 





Re: [openstack-dev] [sahara]FFE Request for resume EDP job

2016-03-07 Thread Trevor McKay
For some reason the link below is wrong for me, it goes to a different
review. Here is a good one (I hope!):

https://review.openstack.org/#/c/285839/

Trev

On Mon, 2016-03-07 at 14:28 +0800, lu jander wrote:
> Hi folks,
> 
> I would like to request a FFE for the feature “Resume EDP job”: 
> 
>  
> 
> BP:
> https://blueprints.launchpad.net/sahara/+spec/add-suspend-resume-ability-for-edp-jobs
> 
> 
> Spec has been merged. https://review.openstack.org/#/c/198264/  
> 
> 
> Suspend EDP patch has been merged.
>  https://review.openstack.org/#/c/201448/ 
> 
> 
> Code Review: https://review.openstack.org/#/c/285839/
> 
>  
> 
> code is ready for review. 
> 
>  
> 
> The Benefits for this change: after suspend job, we can resume this
> job.
> 
>  
> 
> The Risk: The risk would be low for this patch, since the code of the
> suspend patch has been under review for a long time.
> 
>  
> 
> Thanks,
> 
> luhuichun
> 
> 
> 





Re: [openstack-dev] [sahara] shelve/unshelve cluster

2016-02-02 Thread Trevor McKay
And of course, don't forget the sahara specs repo:

https://github.com/openstack/sahara-specs

This is always a good way to propose a new idea.
Create a high-level blueprint, then submit an associated spec
through gerrit in the openstack/sahara-specs project.

Trev


On Mon, 2016-02-01 at 18:59 +0300, Vitaly Gridnev wrote:
> Hi,
> 
> 
> Can you explain more precisely what it means to
> "shelve/unshelve" a cluster?
> 
> On Mon, Feb 1, 2016 at 6:39 PM, Yacine SAÏBI wrote:
> Hello,
> 
> I need to add "shelve/unshelve" features for a cluster (sahara
> shelve ).
> 
> Has anyone already worked on this issue?
> 
> Any suggestions will be welcome.
> 
> Best regards,
> 
> Yacine Saïbi
> 
> 
> 
> 
> 
> 
> -- 
> Best Regards,
> 
> Vitaly Gridnev
> Mirantis, Inc





Re: [openstack-dev] [sahara] Proposing Vitaly Gridnev to core reviewer team

2015-10-16 Thread Trevor McKay
Vitaly,

 yes, welcome! I would have voted +1 but I was on PTO :)

Trev

On Thu, 2015-10-15 at 10:51 -0400, michael mccune wrote:
> congrats Vitaly!
> 
> On 10/15/2015 10:38 AM, Sergey Lukjanov wrote:
> > I think we have a quorum.
> >
> > Vitaly, congrats!
> >
> > On Tue, Oct 13, 2015 at 6:39 PM, Matthew Farrellee wrote:
> >
> > +1!
> >
> > On 10/12/2015 07:19 AM, Sergey Lukjanov wrote:
> >
> > Hi folks,
> >
> > I'd like to propose Vitaly Gridnev as a member of the Sahara core
> > reviewer team.
> >
> > Vitaly has been contributing to Sahara for a long time and doing a great
> > job on
> > reviewing and improving Sahara. Here are the statistics for reviews
> > [0][1][2] and commits [3].
> >
> > Existing Sahara core reviewers, please vote +1/-1 for the
> > addition of
> > Vitaly to the core reviewer team.
> >
> > Thanks.
> >
> > [0]
> > 
> > https://review.openstack.org/#/q/reviewer:%22Vitaly+Gridnev+%253Cvgridnev%2540mirantis.com%253E%22,n,z
> > [1] http://stackalytics.com/report/contribution/sahara-group/180
> > [2] http://stackalytics.com/?metric=marks&user_id=vgridnev
> > [3]
> > 
> > https://review.openstack.org/#/q/status:merged+owner:%22Vitaly+Gridnev+%253Cvgridnev%2540mirantis.com%253E%22,n,z
> >
> > --
> > Sincerely yours,
> > Sergey Lukjanov
> > Sahara Technical Lead
> > (OpenStack Data Processing)
> > Principal Software Engineer
> > Mirantis Inc.
> >
> >
> > 
> >
> >
> >
> > 
> >
> >
> >
> >
> > --
> > Sincerely yours,
> > Sergey Lukjanov
> > Sahara Technical Lead
> > (OpenStack Data Processing)
> > Principal Software Engineer
> > Mirantis Inc.
> >
> >
> >
> 
> 





Re: [openstack-dev] [Openstack] [Horizon] [Sahara] FFE request for Sahara unified job interface map UI

2015-09-08 Thread Trevor McKay
+1 from me as well.  It would be a shame to see this go to the next
cycle.

On Fri, 2015-09-04 at 10:40 -0400, Ethan Gafford wrote:
> Hello all,
> 
> I request a FFE for the change at: https://review.openstack.org/#/c/209683/
> 
> This change enables a significant improvement to UX in Sahara's elastic data 
> processing flow which is already in the server and client layers of Sahara. 
> Because it specifically aims at improving ease of use and comprehensibility, 
> Horizon integration is critical to the success of the feature. The change 
> itself is reasonably modular and thus low-risk; it will have no impact 
> outside Sahara's job template creation and launch flow, and (failing 
> unforseen issues) no impact to users of the existing flow who choose not to 
> use this feature.
> 
> Thank you,
> Ethan
> 





Re: [openstack-dev] [sahara] Proposing Ethan Gafford for the core reviewer team

2015-08-14 Thread Trevor McKay
Hi Telles, 

 you technically don't get a vote, but thanks anyway :)

Trev

On Fri, 2015-08-14 at 12:14 +, Telles Nobrega wrote:
 +1
 
 On Fri, Aug 14, 2015 at 7:11 AM Alexander Ignatov
 aigna...@mirantis.com wrote:
 
 +1
 
 Regards,
 Alexander Ignatov
 
 
 
 
 
  On 13 Aug 2015, at 18:29, Sergey Reshetnyak
  sreshetn...@mirantis.com wrote:
  
  +2
  
  2015-08-13 18:07 GMT+03:00 Matthew Farrellee
  m...@redhat.com:
  On 08/13/2015 10:56 AM, Sergey Lukjanov wrote:
  Hi folks,
  
  I'd like to propose Ethan Gafford as a
  member of the Sahara core
  reviewer team.
  
  Ethan contributing to Sahara for a long time
  and doing a great job on
  reviewing and improving Sahara. Here are the
  statistics for reviews
  [0][1][2] and commits [3]. BTW Ethan is
  already stable maint team core
  for Sahara.
  
  Existing Sahara core reviewers, please vote
  +1/-1 for the addition of
  Ethan to the core reviewer team.
  
  Thanks.
  
  [0]
  https://review.openstack.org/#/q/reviewer:%
  22Ethan+Gafford+%253Cegafford%2540redhat.com
  %253E%22,n,z
  [1]
  
 http://stackalytics.com/report/contribution/sahara-group/90
  [2]
  
  http://stackalytics.com/?user_id=egafford&metric=marks
  [3]
  https://review.openstack.org/#/q/owner:%
  22Ethan+Gafford+%253Cegafford%2540redhat.com
  %253E%22+status:merged,n,z
  
  --
  Sincerely yours,
  Sergey Lukjanov
  Sahara Technical Lead
  (OpenStack Data Processing)
  Principal Software Engineer
  Mirantis Inc.
  
  
  +1 ethan has really taken to sahara, providing
  valuable input to both development and deployments
  as well has taking on the manila integration
  
  
  
  
  
  
  
  
  
 
 
 
 -- 
 
 Telles Nobrega





Re: [openstack-dev] [sahara] Proposing Ethan Gafford for the core reviewer team

2015-08-14 Thread Trevor McKay
Flavio,

  thanks, bad joke on my part. I work with Telles on Sahara, just poking
him in jest.  Apologies, didn't mean to create an issue on the list.

Trev

On Fri, 2015-08-14 at 17:30 +0200, Flavio Percoco wrote:
 On 14/08/15 09:29 -0400, Trevor McKay wrote:
 Hi Telles,
 
  you technically don't get a vote, but thanks anyway :)
 
 Hi Trevor,
 
 Technically, everyone gets to vote and speak up. Regardless of whether
 you're a core-reviewer or not. Most of the time, non-core contributors
 provide amazing feedback on what their experience has been while
 receiving reviews from the nominated person.
 
 Regardless of the comment, we as a community always welcome
 contributor's opinions and encourage folks to speak up.
 
 I knew your intentions are good but I thought it'd be a good time to
 share the above so that it would work as a reminder for others as
 well.
 
 Thank you both and +1 for Ethan ;)
 Flavio
 
 
 Trev
 
 On Fri, 2015-08-14 at 12:14 +, Telles Nobrega wrote:
  +1
 
  On Fri, Aug 14, 2015 at 7:11 AM Alexander Ignatov
  aigna...@mirantis.com wrote:
 
  +1
 
  Regards,
  Alexander Ignatov
 
 
 
 
 
   On 13 Aug 2015, at 18:29, Sergey Reshetnyak
   sreshetn...@mirantis.com wrote:
  
   +2
  
   2015-08-13 18:07 GMT+03:00 Matthew Farrellee
   m...@redhat.com:
   On 08/13/2015 10:56 AM, Sergey Lukjanov wrote:
   Hi folks,
  
   I'd like to propose Ethan Gafford as a
   member of the Sahara core
   reviewer team.
  
   Ethan contributing to Sahara for a long time
   and doing a great job on
   reviewing and improving Sahara. Here are the
   statistics for reviews
   [0][1][2] and commits [3]. BTW Ethan is
   already stable maint team core
   for Sahara.
  
   Existing Sahara core reviewers, please vote
   +1/-1 for the addition of
   Ethan to the core reviewer team.
  
   Thanks.
  
   [0]
   https://review.openstack.org/#/q/reviewer:%
   22Ethan+Gafford+%253Cegafford%2540redhat.com
   %253E%22,n,z
   [1]
   
  http://stackalytics.com/report/contribution/sahara-group/90
   [2]
   
   http://stackalytics.com/?user_id=egafford&metric=marks
   [3]
   https://review.openstack.org/#/q/owner:%
   22Ethan+Gafford+%253Cegafford%2540redhat.com
   %253E%22+status:merged,n,z
  
   --
   Sincerely yours,
   Sergey Lukjanov
   Sahara Technical Lead
   (OpenStack Data Processing)
   Principal Software Engineer
   Mirantis Inc.
  
  
   +1 ethan has really taken to sahara, providing
   valuable input to both development and deployments
   as well has taking on the manila integration
  
  
  
   
  
  
  
   
  
 
 
  
  --
 
  Telles Nobrega

Re: [openstack-dev] [sahara] Proposing Ethan Gafford for the core reviewer team

2015-08-13 Thread Trevor McKay
+1, welcome addition

On Thu, 2015-08-13 at 17:56 +0300, Sergey Lukjanov wrote:
 Hi folks,
 
 
 I'd like to propose Ethan Gafford as a member of the Sahara core
 reviewer team.
 
 
 Ethan has been contributing to Sahara for a long time and doing a great job on
 reviewing and improving Sahara. Here are the statistics for reviews
 [0][1][2] and commits [3]. BTW Ethan is already stable maint team core
 for Sahara.
 
 
 Existing Sahara core reviewers, please vote +1/-1 for the addition of
 Ethan to the core reviewer team.
 
 
 Thanks.
 
 
 [0] https://review.openstack.org/#/q/reviewer:%22Ethan+Gafford+%253Cegafford%2540redhat.com%253E%22,n,z
 [1] http://stackalytics.com/report/contribution/sahara-group/90
 [2] http://stackalytics.com/?user_id=egafford&metric=marks
 [3] https://review.openstack.org/#/q/owner:%22Ethan+Gafford+%253Cegafford%2540redhat.com%253E%22+status:merged,n,z
 
 
 -- 
 Sincerely yours,
 Sergey Lukjanov
 Sahara Technical Lead
 (OpenStack Data Processing)
 Principal Software Engineer
 Mirantis Inc.





Re: [openstack-dev] [Sahara] [EDP] about get_job_status in oozie engine

2015-06-23 Thread Trevor McKay
Hi Lu,

  yes, you're right.  The return value is a dictionary, and for the other
EDP engines only status is returned (and we primarily care about
status).  For Oozie, there is more information.

  I'm fine with changing the name to get_job_info() throughout the
job_manager and EDP.

  It actually raises the question for me about whether or not in the
Oozie case we really even need the extra Oozie information in the Sahara
database.  I don't think we use it anywhere, not even sure the UI
displays it (but it might) or how much comes through the REST responses.

  Maybe we should have get_job_status() which returns only status, and
an optional get_job_info() that returns more? But that may be a bigger
discussion.
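A minimal sketch of the split being floated here; only the method names come from the discussion, while the class, arguments, and return values are hypothetical illustrations:

```python
# Hypothetical sketch: get_job_info() returns the engine's full response,
# get_job_status() only the status that all EDP engines share.
class OozieJobEngine:
    def get_job_info(self, job_execution):
        # Oozie returns extra metadata beyond the bare status (made-up values).
        return {"status": "RUNNING", "user": "hadoop", "appName": "wordcount"}

    def get_job_status(self, job_execution):
        # Common denominator across EDP engines: status only.
        return {"status": self.get_job_info(job_execution)["status"]}
```

Under this shape, engines without extra metadata could simply alias get_job_info to get_job_status.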

Best,

Trevor

On Tue, 2015-06-23 at 15:18 +0800, lu jander wrote:
 Hi Trevor
 
 
 in sahara oozie engine (sahara/service/edp/oozie/engine.py
 sahara/service/edp/oozie/oozie.py)
 
 
 the function get_job_status actually returns not only the status of the
 job but all the info about the job, so I think we should rename this
 function to get_job_info; that may be more convenient for us, because I
 want to add a function named get_job_info but found that it already
 exists here under a confusing name.





Re: [openstack-dev] [sahara][CDH] Is it possible to add CDH5.4 into Kilo release now?

2015-04-29 Thread Trevor McKay
Kilo is closed to everything but critical bugs at this point.

Going forward, this change probably counts as a new feature and is
forbidden as a backport to Kilo too. So Liberty is the earliest
opportunity.

https://wiki.openstack.org/wiki/StableBranch#Appropriate_Fixes

Trev

On Wed, 2015-04-29 at 04:49 +, Chen, Ken wrote:
 Hi all,
 
 Currently Cloudera has already released CDH 5.4.0. I have
 already registered a bp and submitted two patches for it
 (https://blueprints.launchpad.net/sahara/+spec/cdh-5-4-support).
 However, they are for the master branch, and Cloudera hopes this can be added
 to the latest release version of Sahara (Kilo) so that they
 can give better support to their customers. I am not sure whether it
 is possible to do this at this stage.
 
  
 
 -Ken
 
 





Re: [openstack-dev] About Sahara EDP New Ideas for Liberty

2015-04-22 Thread Trevor McKay
Hi Ken,

  responses inline

On Wed, 2015-04-22 at 12:36 +, Chen, Ken wrote:
 Hi Trevor, 
 I saw below items in Proposed Sprint Topics of sahara liberty.
 https://etherpad.openstack.org/p/sahara-liberty-proposed-sessions. I
 guess these are the EDP ideas we want to discuss on Vancouver design
 summit. We have some comments as below: 

Yes, feel free to add anything else to the pad.  We'll talk about as
much as we have time for.  I'm thinking that most of it will be covered
on Friday, or in between sessions during the week if folks are around.

 o   job scheduler (proposed by weiting) 
  we already have a spec on this, please help review it and give
 your comments and ideas. https://review.openstack.org/#/c/175719/  

Great! thanks

 o   more complex workflows (job dependencies, DAGs, etc. Do we
 rely on Oozie, or something else? 
  Huichun is now looking into this. I am not sure whether you already
 have detailed ideas about it; if needed, we can contribute some effort.
 If no details are ready, we can help draft a first version. 

No work on this so far, although we have talked about it off and on for
a few cycles. Oozie has a lot of capabilities for coordination, but we
are not Oozie-only, so what do we do?  This is the central question.

 o   job interface mapping
 https://blueprints.launchpad.net/sahara/+spec/unified-job-interface-map 
 proposed in Kilo but moved to Liberty 
    ++ high priority in my opinion.  Should be done early, awesome
 feature 
  This seems interesting. We agree the EDP UI should be improved. In
 fact, our team's thinking about EDP is still unsettled. Some of us do
 not like the current EDP design and feel it is more a re-design of the
 Oozie or Spark UI than a universal interface for users. However, we do
 not yet have a clear strategy on this part. 

Yes, Oozie had a heavy influence on EDP. This is partly historical --
EDP was written rapidly between Havana and Icehouse and based on Oozie
since it offered handling of jobs and multiple types out of the box. It
was a quick path to EDP functionality.

However, EDP should be more of a universal interface. We only support a
few conceptual operations -- run job, cancel job, and job status. With
those three operations, we should be able to run anything. For example,
recently Telles has been working on Storm support.

The job interface mapping will help generalize how arguments are passed
to jobs and allow us to remove some assumptions about jobs.  I am all
for other generalizations that will move EDP further in the direction of
a general interface.
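Conceptually, the three operations named above are enough to define an engine contract that any backend can implement. A hedged sketch (Sahara's real abstract engine may differ in names and signatures):

```python
import abc


class JobEngine(abc.ABC):
    """Hedged sketch of the three conceptual EDP operations; names and
    signatures are illustrative, not Sahara's actual base class."""

    @abc.abstractmethod
    def run_job(self, job_execution):
        """Launch the job and return a backend-specific job id."""

    @abc.abstractmethod
    def cancel_job(self, job_execution):
        """Best-effort cancellation of a running job."""

    @abc.abstractmethod
    def get_job_status(self, job_execution):
        """Return a status string such as 'RUNNING' or 'SUCCEEDED'."""


class StormEngine(JobEngine):
    """Any backend (Oozie, Spark, Storm) only has to fill in these three."""

    def run_job(self, job_execution):
        return "storm-topology-1"  # would actually submit a topology

    def cancel_job(self, job_execution):
        return "KILLED"

    def get_job_status(self, job_execution):
        return "RUNNING"
```

With this shape, EDP itself never needs to know whether the backend is Oozie, Spark, or Storm.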

 o   early error detection to help transient clusters -- how many
 things can we detect early that can go wrong with an EDP job so that
 we return an error before spinning up the cluster (only to find that
 the job fails once the cluster is launched?) Ex, bad swift paths 
  seems easier, but may include some trivial work. 

Some of this will be folded into the job interface mapping.  Ethan has
just updated the spec to include input_datasource and
output_datasource as argument types. If we know what will be done with
a datasource, we can potentially validate it before the job runs.

 •   Spark plugins -- we have an independent Spark plugin, but we
 also have Spark supported by mapr, and in the future it will be
 supported by Ambari.  Should we continue to carry a simple Spark
 standalone plugin?  Or should we work toward shifting our Spark
 support to one or more vendor plugins? 
 Not sure what this will impact. 

Thought about this more. Overlap in the plugins is fine, as long as
there is someone in the community willing and able to support it. 

 -Ken 



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] About Sahara EDP New Ideas for Liberty

2015-04-22 Thread Trevor McKay
On Wed, 2015-04-22 at 12:36 +, Chen, Ken wrote:
 o more complex workflows (job dependencies, DAGs, etc. Do we rely on 
 Oozie, or something else?
  Huichun is now looking into this. I am not sure whether you already have 
 detailed ideas about it; if needed, we can contribute some effort. If no 
 details are ready, we can help draft a first version.

I just made a note on the pad 

https://etherpad.openstack.org/p/sahara-liberty-proposed-sessions

Maybe the right approach here is to develop a mapping notation that can
be expressed as a JSON object (like the proposed job interface mapping).

If we can develop an abstract way to describe relationships between
jobs, then the individual EDP engines can implement it. For the Oozie
EDP engine, maybe it uses Oozie features in workflows.  For Spark, or
Storm, maybe it uses some existing opensource coordinator or one is
written.

The key idea would be to make job coordination part of the EDP engine,
with a well defined set of objects to describe the relationships.

What do you think? Just a rough idea.  Maybe there is a better way.
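To make the rough idea concrete, here is a hedged sketch of what such a JSON notation could look like (field names like "depends_on" are invented for illustration, not a proposed spec): jobs plus explicit dependency edges form a DAG, and each EDP engine would translate a dependency-respecting order into its own coordinator.

```python
import json

# Hedged, illustrative notation: each job lists the jobs it depends on.
workflow_json = json.dumps({
    "name": "etl-pipeline",
    "jobs": [
        {"id": "ingest", "depends_on": []},
        {"id": "transform", "depends_on": ["ingest"]},
        {"id": "report", "depends_on": ["ingest", "transform"]},
    ],
})


def topo_order(workflow):
    """Order jobs so each runs after its dependencies; an engine could map
    this order onto Oozie workflow nodes or an external coordinator."""
    pending = {j["id"]: set(j["depends_on"]) for j in workflow["jobs"]}
    done, order = set(), []
    while pending:
        ready = sorted(j for j, deps in pending.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle in workflow")
        for job_id in ready:
            done.add(job_id)
            order.append(job_id)
            del pending[job_id]
    return order


print(topo_order(json.loads(workflow_json)))  # → ['ingest', 'transform', 'report']
```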



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Global Cluster Template in Sahara

2015-04-16 Thread Trevor McKay
Thanks, Sergey. I agree -- my first thought was default templates plus
new ACLs!

Yanchao, note that with the current default template mechanism, the
templates can be added per tenant -- you can run the script multiple
times and change the tenant id. We thought this provided enough
functionality in the short term, until new ACLs can be added.

Trev

On Thu, 2015-04-16 at 13:22 +0300, Sergey Lukjanov wrote:
 Hi,
 
 
 first of all - yes, we've implemented mechanism for default templates
 addition in Kilo, please, take a look on this spec and related
 changes:
 http://specs.openstack.org/openstack/sahara-specs/specs/kilo/default-templates.html
  
 
 
 Regarding your case, it's in fact about the admin-only writable
 templates shared between all tenants. We have a blueprint for
 implementing ACL for all Sahara resources -
 https://blueprints.launchpad.net/sahara/+spec/resources-acl . It's
 about implementing extended and flexible way to configure ACLs for
 resources and to provide end-users an ability to have the following
 types of resources:
 
 
 * default - tenant specific, anyone in tenant could edit or delete
 * public - shared between tenants in read-only mode, writable for
 users in tenant where it was created
 * protected - if True, it cannot be removed until it is updated to
 False using the resource update operation 
 * admin or protected=Admin - to make only admin users able to
 write/delete resource
 
 
 during the Kilo cycle we've been discussing this idea and initially
 agreed on it, because it sounds like the most OpenStackish way to
 provide such functionality. I have a draft spec for it (not yet
 published), I will publish it today/tomorrow and send a link to it to
 this thread.
 
 
 Yanchao, does this ACL mechanism cover your use case? Any feedback
 appreciated.
 
 
 
 Thanks.
 
 On Thu, Apr 16, 2015 at 3:19 AM, lu jander juvenboy1...@gmail.com
 wrote:
 We have already implemented default templates for Sahara:
 
 https://blueprints.launchpad.net/sahara/+spec/default-templates
 
 
 2015-04-16 5:22 GMT+08:00 Liang, Yanchao yanli...@ebay.com:
 
 Dear Openstack Developers,
 
 
  My name is Yanchao Liang. I am a software engineer at
  eBay, working on Hadoop as a Service on top of the
  OpenStack cloud.
  
  
  Right now we are using Sahara, Juno version. We want
  to stay current and introduce global templates into
  Sahara.
 
 
  To simplify the cluster creation process for users, we
  would like to create cluster templates available to all
  users. A user can just go to the Horizon web UI, select
  one of the pre-populated templates, and create a Hadoop
  cluster in just a few clicks.  
 
 
  Here is how I would implement this feature: 
    * In the database, create a new column in the
  “cluster_templates” table called “is_global”,
  a boolean indicating whether the template is
  available to all users.
    * When a user fetches cluster templates, add
  another function similar to
  “cluster_template_get” that also queries the
  database for global templates.
    * When creating a cluster, put the user’s tenant
  id in the “merged_values” config variable
  instead of the tenant id from the cluster
  template.
    * Use an admin account to create and manage
  global cluster templates.
  Since I don’t know the code base as well as you do,
  what do you think about the global template idea? How
  would you implement this new feature? 
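A minimal sketch of the proposed query behavior, with sqlite3 standing in for Sahara's real MySQL/SQLAlchemy layer; the table and column names ("cluster_templates", "is_global") come from the proposal above, but everything else is illustrative.

```python
import sqlite3

# In-memory stand-in for Sahara's database (assumption: the real layer
# is SQLAlchemy over MySQL; this only illustrates the visibility rule).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE cluster_templates (
    id TEXT PRIMARY KEY,
    tenant_id TEXT,
    name TEXT,
    is_global INTEGER DEFAULT 0)""")
conn.executemany(
    "INSERT INTO cluster_templates VALUES (?, ?, ?, ?)",
    [("t1", "admin", "small-hadoop", 1),   # admin-owned, visible to all
     ("t2", "tenant-a", "custom", 0)])     # private to tenant-a


def cluster_template_get_all(tenant_id):
    # a tenant sees its own templates plus any global ones
    rows = conn.execute(
        "SELECT name FROM cluster_templates "
        "WHERE tenant_id = ? OR is_global = 1", (tenant_id,))
    return sorted(r[0] for r in rows)


print(cluster_template_get_all("tenant-b"))  # → ['small-hadoop']
```

The design question the thread raises still stands: this column-based approach is simpler but less flexible than the per-resource ACLs Sergey describes.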
 
 
 We would like to contribute this feature back to the
 Openstack community. Any feedback would be greatly
 appreciated. Thank you.
 
 
 Best,
 Yanchao
 
 
 
 
 
 __
 OpenStack Development Mailing List (not for usage
 questions)
 Unsubscribe:
 openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
 

Re: [openstack-dev] About Sahara EDP New Ideas for Liberty

2015-03-24 Thread Trevor McKay
Weiting, Andrew,

Agreed, great ideas!  As Andrew noted, we have discussed some of these
things before and it would be great to discuss them in Vancouver.

I think that a Sahara-side workflow manager is the right approach. Oozie
has a lot of capability for job coordination, but it won't work for all
of our cluster and job types.

Notes on Spark in particular -- when we implemented Spark EDP, we looked
at various implementations for a Spark job server.  One was to extend
Oozie, one was to use the Ooyala Spark job server, and one was to use
ssh around spark-submit.  We chose the last, notes are here:

https://etherpad.openstack.org/p/sahara_spark_edp

We could potentially revisit the Ooyala job server.  My impression at
the time was that for the functions we wanted, it was pretty heavy. But
if we are going to add job coordination as a general feature, it may be
appropriate. I believe in the Spark community it is the dominant
solution for job management, open source is here:

https://github.com/spark-jobserver/spark-jobserver

As part of the Spark investigation, I posted on this JIRA, too. This is
a JIRA for developing a REST api to the spark job server, which may be
enough for us to build our own coordination system:

https://issues.apache.org/jira/browse/SPARK-3644
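For reference, the spark-jobserver project already exposes a small REST surface for job submission ("POST /jobs?appName=...&classPath=..." per its README). A hedged sketch of composing such a request URL; the endpoint shape comes from that project, not from Sahara, and the host/port are assumptions:

```python
from urllib.parse import urlencode


def submit_url(host, app_name, class_path, port=8090):
    """Build a spark-jobserver job-submission URL (8090 is the project's
    default port; parameter names follow its README)."""
    query = urlencode({"appName": app_name, "classPath": class_path})
    return "http://%s:%d/jobs?%s" % (host, port, query)


url = submit_url("localhost", "wordcount",
                 "spark.jobserver.WordCountExample")
# an actual submission would POST the job's config body to this URL
```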

Best,

Trevor

On Tue, 2015-03-24 at 01:55 +, Chen, Weiting wrote:
 Hi Andrew.
 
  
 
 Thanks for response. My reply in line.
 
  
 
 From: Andrew Lazarev [mailto:alaza...@mirantis.com] 
 Sent: Saturday, March 21, 2015 12:10 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] About Sahara EDP New Ideas for Liberty
 
  
 
 Hi Weiting,
 
  
 
 
 1. Add a schedule feature to run the jobs on time:
 
 
  This request comes from customers: they usually run a job at a
  specific time every day, so it would be great if there were a
  scheduler to help arrange such regular jobs.
  
 
 
 Looks like a great feature. And should be quite easy to implement.
 Feel free to create spec for that.
 
 
 [Weiting] We are working on the spec and the bp has already been
 registered in
 https://blueprints.launchpad.net/sahara/+spec/enable-scheduled-edp-jobs.
 
  
 
 
 2. A more complex workflow design in Sahara EDP:
 
 
 Current EDP only provide one job that is running on one cluster.
 
 
 Yes. And ability to run several jobs in one oozie workflow is
 discussed on every summit (e.g. 'coordinated jobs' at
 https://etherpad.openstack.org/p/kilo-summit-sahara-edp). But for now
 it was not a priority
 
 
  
 
 
  But real cases are more complex: users usually chain multiple jobs to
  process the data and may use several different types of clusters.
 
 
 It means that workflow manager should be on Sahara side. Looks like a
 complicated feature. But we would be happy to help with designing and
 implementing it. Please file proposal for design session on ongoing
 summit. Are you going to Vancouver?
 
 
  [Weiting] I’m not sure I will be there; the plan is not ready yet. We
  are also looking for customers’ real use cases in the big data area to
  see how they use data processing in their current environments. We can
  share updates on any ideas later. 
 
  
 
 
  Another concern is Spark: Spark cannot use Oozie for this, so we need
  an abstraction layer to implement these scenarios.
 
 
 If workflow is on Sahara side it should work automatically for all
 engines.
 
 [Weiting] Yes, agree.
 
 
  
 
 
 Thanks,
 
 
 Andrew.
 
  
 
 
  
 
 
  
 
 On Sun, Mar 8, 2015 at 3:17 AM, Chen, Weiting weiting.c...@intel.com
 wrote:
 
 Hi all.
 
  
 
 We got several feedbacks about Sahara EDP’s future from some
 China customers.
 
 Here are some ideas we would like to share with you and need
 your input if we can implement them in Sahara(Liberty).
 
  
 
 1. Add a schedule feature to run the jobs on time:
 
  This request comes from customers: they usually run a job at a
  specific time every day, so it would be great if there were a
  scheduler to help arrange such regular jobs.
 
  
 
 2. A more complex workflow design in Sahara EDP:
 
 Current EDP only provide one job that is running on one
 cluster. 
 
  But real cases are more complex: users usually chain multiple
  jobs to process the data and may use several different types of
  clusters.
 
 For example: Raw Data - Job A(Cluster A) - Job B(Cluster B)
 - Job C(Cluster A) - Result
 
  In my opinion, this kind of workflow should be easy to implement
  using Oozie as a workflow engine, but the current EDP does not
  handle such complex cases.
 
 Another concern is about Spark, for 

Re: [openstack-dev] [Sahara][Horizon] Can't open Data Processing panel after update sahara horizon

2015-03-18 Thread Trevor McKay
Hi Li,

  I am using a fresh devstack with Horizon deployed as part of devstack.
I am running Sahara separately from the command line from the git
sources (master branch).

  I use a little script to register the Sahara endpoint so that Horizon
sees it.
The only change I had to make was to register the service type  as
data-processing instead
of data_processing (below). Other than that, I don't have any problems.

  If you are really stuck, you can always wipe out the database and
rebuild to get beyond the issue.
With mysql I use

$ mysqladmin drop sahara
$ mysqladmin create sahara
$ sahara-db-manage --config-file /etc/sahara/sahara.conf upgrade head

If this error is reliably reproducible, would you create a bug in
launchpad with detailed steps to reproduce?
It's not clear to me what the issue is.

Thanks,

Trevor

-- 

#!/bin/bash
keystone service-create --name sahara --type data-processing
keystone endpoint-create --region RegionOne --service sahara \
    --publicurl 'http://localhost:8386/v1.1/$(tenant_id)s'

On Wed, 2015-03-18 at 03:05 +, Li, Chen wrote:
 Hi all,
 
  
 
 I’m working under Ubuntu14.04 with devstack.
 
  
 
  After the fresh devstack installation, I ran an integration test to
  verify the environment.
  
  After the test, the cluster and the tested EDP jobs remained in my
  environment.
 
  
 
  Then I updated Sahara to the latest code.
  
  To make the newest code work, I also:
  
  1.  manually downloaded python-novaclient and installed it by running
  “python setup.py install”
  
  2.  ran “sahara-db-manage --config-file /etc/sahara/sahara.conf
  upgrade head”
 
  
 
 Then I restarted sahara.
 
  
 
  I tried to delete the things remaining from the last test via the
  dashboard, but:
  
  1.  The table for “job_executions” can’t be opened anymore.
  
  2.  When I try to delete a “job”, an error occurs:
 
  
 
 2015-03-18 10:34:33.031 ERROR oslo_db.sqlalchemy.exc_filters [-]
 DBAPIError exception wrapped from (IntegrityError) (1451, 'Cannot
 delete or update a parent row: a foreign key constraint fails
 (`sahara`.`job_executions`, CONSTRAINT `job_executions_ibfk_3` FOREIGN
 KEY (`job_id`) REFERENCES `jobs` (`id`))') 'DELETE FROM jobs WHERE
 jobs.id = %s' ('10c36a9b-a855-44b6-af60-0effee31efc9',)
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters Traceback
 (most recent call last):
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters   File
 /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py,
 line 951, in _execute_context
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters
 context)
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters   File
 /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py,
 line 436, in do_execute
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters
 cursor.execute(statement, parameters)
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters   File
 /usr/lib/python2.7/dist-packages/MySQLdb/cursors.py, line 174, in
 execute
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters
 self.errorhandler(self, exc, value)
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters   File
 /usr/lib/python2.7/dist-packages/MySQLdb/connections.py, line 36, in
 defaulterrorhandler
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters raise
 errorclass, errorvalue
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters
 IntegrityError: (1451, 'Cannot delete or update a parent row: a
 foreign key constraint fails (`sahara`.`job_executions`, CONSTRAINT
 `job_executions_ibfk_3` FOREIGN KEY (`job_id`) REFERENCES `jobs`
 (`id`))')
 
 2015-03-18 10:34:33.031 TRACE oslo_db.sqlalchemy.exc_filters
 
 2015-03-18 10:34:33.073 DEBUG sahara.openstack.common.periodic_task
 [-] Running periodic task
 SaharaPeriodicTasks.terminate_unneeded_transient_clusters from
 (pid=8084)
 run_periodic_tasks 
 /opt/stack/sahara/sahara/openstack/common/periodic_task.py:219
 
 2015-03-18 10:34:33.073 DEBUG sahara.service.periodic [-] Terminating
 unneeded transient clusters from (pid=8084)
 terminate_unneeded_transient_clusters 
 /opt/stack/sahara/sahara/service/periodic.py:131
 
 2015-03-18 10:34:33.108 ERROR sahara.utils.api [-] Validation Error
 occurred: error_code=400, error_message=Job deletion failed on foreign
 key constraint
 
 Error ID: e65b3fb1-b142-45a7-bc96-416efb14de84,
 error_name=DELETION_FAILED
 
  
 
  I assumed this might be caused by an old Horizon version, so I did:
 
 1.  update horizon code.
 
 2.  python manage.py compress
 
 3.  sudo python setup.py install
 
 4.  sudo service apache2 restart
 
  
 
  But these steps only made things worse.
  
  Now, when I click “Data Processing” on the dashboard, nothing happens
  anymore.
  
  Can anyone help me here?
  
  What did I do wrong?
  
  How can I fix this?
 
  
 
  I tested the Sahara CLI; commands like “sahara job-list” and “sahara
  job-delete” still work.
  
  So I guess Sahara itself is working fine.
 
  
 
 Thanks.
 
 -chen
 
 
 

[openstack-dev] [sahara] Feedback on default-templates implementation

2015-02-27 Thread Trevor McKay
Hi Sahara folks,

  please checkout

https://review.openstack.org/#/c/159872/

and respond there in comments, or here on the email thread. We have some
things to figure out for this new CLI, and I want to make sure that we
make sane choices and set a good precedent.

Once we have a consensus we can go back and extend the spec with more
detail and merge approved changes.  The original spec did not get into
this kind of detail (because nobody had sat down and tried to do it :) )

The only other CLI we have at this point that touches Sahara components
(other than the python-saharaclient) is sahara-db-manage, but that uses
alembic commands to drive the database directly.  It doesn't really
touch Sahara in any semantic way, it just drives migration.  It is
blissfully ignorant of any Sahara object relationships and semantics
outside of the table definitions.

The default-templates CLI will be a new kind of tool, I believe, that we
don't have yet. It will be tightly integrated with sahara-all and
horizon. So how it is designed matters a lot imho.

Best,

Trevor

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [stable] New to project stable maintenance, question on requirements changes

2015-02-24 Thread Trevor McKay
Hi folks,

I've just joined the stable maintenance team for Sahara.

We have this review here, from OpenStack proposal bot:

https://review.openstack.org/158775/

Since it came from the proposal bot, there's no justification in the
commit message and no cherry pick.

I didn't see this case covered as one of the strict set in

https://wiki.openstack.org/wiki/StableBranch

Do we trust the proposal bot? How do I know I should trust it? On
master, I assume if there
is a mistake it will soon be rectified, but stable ...  Do we have a doc
that talks about stable maintenance
and requirements changes?  Should we?

Am I being paranoid? :)

Thanks,

Trevor 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [stable] New to project stable maintenance, question on requirements changes

2015-02-24 Thread Trevor McKay
Sean, 

 thanks!  I feel better already. I'll check out the review.

Trevor

On Tue, 2015-02-24 at 14:39 -0500, Sean Dague wrote:
 On 02/24/2015 02:34 PM, Trevor McKay wrote:
  Hi folks,
  
  I've just joined the stable maintenance team for Sahara.
  
  We have this review here, from OpenStack proposal bot:
  
  https://review.openstack.org/158775/
  
  Since it came from the proposal bot, there's no justification in the
  commit message and no cherry pick.
  
  I didn't see this case covered as one of the strict set in
  
  https://wiki.openstack.org/wiki/StableBranch
  
  Do we trust the proposal bot? How do I know I should trust it? On
  master, I assume if there
  is a mistake it will soon be rectified, but stable ...  Do we have a doc
  that talks about stable maintenance
  and requirements changes?  Should we?
  
  Am I being paranoid? :)
 
 Slightly, but that's probably good.
 
 Requirements proposal bot changes had to first be Approved on the
 corresponding requirements stable branch, so that should be both safe,
 and mandatory to go in.
 
 I agree that it would be nicer to have more justification in there.
 There is the beginning of the patch up to do something a bit better here
 - https://review.openstack.org/#/c/145932/ - though it could use to be
 improved.
 
   -Sean
 



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-12 Thread Trevor McKay
Hi folks,

Here is another way to do this.  Lu had mentioned Oozie shell actions
previously.
Sahara doesn't support them, but I played with it from the Oozie command
line
to verify that it solves our hbase problem, too.

We can potentially create a blueprint to build a simple Shell action
around a
user-supplied script and supporting files.  The script and files would
all be stored
in Sahara as job binaries (Swift or internal db) and referenced the same
way. The exec
target must be on the path at runtime, or included in the working dir.

To do this, I simply put workflow.xml, doit.sh, and the test jar into
a directory in hdfs.  Then I ran it with the Oozie cli using the job.xml
config file
configured to point at the hdfs dir.  Nothing special here, just
standard Oozie
job execution.

I've attached everything here but the test jar.

$ oozie job -oozie http://localhost:11000/oozie -config job.xml -run

Best,

Trev

On Thu, 2015-02-12 at 08:39 -0500, Trevor McKay wrote:

 Hi Lu, folks,
 
 I've been investigating how to run Java actions in Sahara EDP that
 depend on 
 HBase libraries (see snippet from the original question by Lu below).
 
 In a nutshell, we need to use Oozie sharelibs for this. I am working
 on a spec now, thinking 
 about the best way to support this in Sahara, but here is a
 semi-manual intermediate solution
 that will work if you would like to run such a job from Sahara.
 
 1) Create your own Oozie sharelib that contains the HBase jars.
 
 This ultimately is just an HDFS dir holding the jars.  On any node in
 your cluster with 
 HBase installed, run the attached script or something like it (I like
 Python better than bash :) )
 It simply separates the classpath and uploads all the jars to the
 specified HDFS dir.
 
 $ parsePath.py /user/myhbaselib
 
 2) Run your Java action from EDP, but use the oozie.libpath
 configuration value when you
 launch the job.  For example, on the job configure tab set
 oozie.libpath like this:
 
 NameValue
 
 oozie.libpathhdfs://namenode:8020/user/myhbaselib
 
 (note, support for this was added in
 https://review.openstack.org/#/c/154214/)
 
 That's it! In general, you can add any jars that you want to a
 sharelib and then set the
 oozie.libpath for the job to access them.
 
 Here is a good blog entry about sharelibs and extra jars in Oozie
 jobs:
 
 http://blog.cloudera.com/blog/2014/05/how-to-use-the-sharelib-in-apache-oozie-cdh-5/
 
 Best,
 
 Trevor
 
 --- original question
 (1) EDP job in Java action
 
The background is that we want to write integration test cases for
 newly added services like HBase and ZooKeeper, just like the
 edp-examples do (sample code under sahara/etc/edp-examples/). So I
 thought I could write an example EDP job via a Java action to test the
 HBase service. I wrote HBaseTest.java, packaged it as a jar file, and
 ran the jar manually with the command java -cp `hbase classpath`
 HBaseTest.jar HBaseTest; it works well in the VM (provisioned by
 Sahara with the CDH plugin). 
 “/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp
 HBaseTest.jar:`hbase classpath` HBaseTest”
 So I wanted to run this job via Horizon on the Sahara job execution
 page, but found no place to pass the `hbase classpath` parameter (I
 have tried java_opts, configuration, and args; all failed). When I
 pass “-cp `hbase classpath`” to java_opts on the Horizon job
 execution page, Oozie raises the error below.
 
 
 
 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




workflow.xml
Description: XML document


doit.sh
Description: application/shellscript


job.xml
Description: XML document
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara] Shell Action, Re: Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-12 Thread Trevor McKay
Hmm, my attachments were removed :)

Well, the interesting parts were the doit.sh and workflow.xml:

$ more doit.sh 
#!/bin/bash
/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp HBaseTest.jar:`hbase
classpath` HBaseTest

$ more workflow.xml
<workflow-app xmlns='uri:oozie:workflow:0.3' name='shell-wf'>
    <start to='shell1' />
    <action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>mapred.job.queue.name</name>
                  <value>default</value>
                </property>
            </configuration>
            <exec>doit.sh</exec>
            <file>HBaseTest.jar</file>
            <file>doit.sh</file>
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error
message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>



On Thu, 2015-02-12 at 17:15 -0500, Trevor McKay wrote:

 Hi folks,
 
 Here is another way to do this.  Lu had mentioned Oozie shell actions
 previously.
 Sahara doesn't support them, but I played with it from the Oozie
 command line
 to verify that it solves our hbase problem, too.
 
 We can potentially create a blueprint to build a simple Shell action
 around a
 user-supplied script and supporting files.  The script and files would
 all be stored
 in Sahara as job binaries (Swift or internal db) and referenced the
 same way. The exec
 target must be on the path at runtime, or included in the working dir.
 
 To do this, I simply put workflow.xml, doit.sh, and the test jar into
 a directory in hdfs.  Then I ran it with the Oozie cli using the
 job.xml config file
 configured to point at the hdfs dir.  Nothing special here, just
 standard Oozie
 job execution.
 
 I've attached everything here but the test jar.
 
 $ oozie job -oozie http://localhost:11000/oozie -config job.xml -run
 
 Best,
 
 Trev
 
 On Thu, 2015-02-12 at 08:39 -0500, Trevor McKay wrote:
 
  Hi Lu, folks,
  
  I've been investigating how to run Java actions in Sahara EDP that
  depend on 
  HBase libraries (see snippet from the original question by Lu
  below).
  
  In a nutshell, we need to use Oozie sharelibs for this. I am working
  on a spec now, thinking 
  about the best way to support this in Sahara, but here is a
  semi-manual intermediate solution
  that will work if you would like to run such a job from Sahara.
  
  1) Create your own Oozie sharelib that contains the HBase jars.
  
  This ultimately is just an HDFS dir holding the jars.  On any node
  in your cluster with 
  HBase installed, run the attached script or something like it (I
  like Python better than bash :) )
  It simply separates the classpath and uploads all the jars to the
  specified HDFS dir.
  
  $ parsePath.py /user/myhbaselib
  
  2) Run your Java action from EDP, but use the oozie.libpath
  configuration value when you
  launch the job.  For example, on the job configure tab set
  oozie.libpath like this:
  
  NameValue
  
  oozie.libpathhdfs://namenode:8020/user/myhbaselib
  
  (note, support for this was added in
  https://review.openstack.org/#/c/154214/)
  
  That's it! In general, you can add any jars that you want to a
  sharelib and then set the
  oozie.libpath for the job to access them.
  
  Here is a good blog entry about sharelibs and extra jars in Oozie
  jobs:
  
  http://blog.cloudera.com/blog/2014/05/how-to-use-the-sharelib-in-apache-oozie-cdh-5/
  
  Best,
  
  Trevor
  
  --- original question
  (1) EDP job in Java action
  
   The background is that we want to write integration test cases for
   newly added services like HBase and ZooKeeper, just like the
   edp-examples do (sample code under sahara/etc/edp-examples/). So I
   thought I could write an example EDP job via a Java action to test
   the HBase service. I wrote HBaseTest.java, packaged it as a jar
   file, and ran the jar manually with the command java -cp `hbase
   classpath` HBaseTest.jar HBaseTest; it works well in the VM
   (provisioned by Sahara with the CDH plugin). 
   “/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp
   HBaseTest.jar:`hbase classpath` HBaseTest”
   So I wanted to run this job via Horizon on the Sahara job execution
   page, but found no place to pass the `hbase classpath` parameter (I
   tried java_opts, configuration, and args; all failed). When I pass
   “-cp `hbase classpath`” to java_opts on the Horizon job execution
   page, Oozie raises the error below.
  
  
  
  __
  OpenStack Development Mailing List (not for usage questions)
  Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
 __
 OpenStack Development

[openstack-dev] [sahara] Running HBase Jobs (was: About Sahara Oozie plan)

2015-02-12 Thread Trevor McKay
Hi Lu, folks,

I've been investigating how to run Java actions in Sahara EDP that
depend on 
HBase libraries (see snippet from the original question by Lu below).

In a nutshell, we need to use Oozie sharelibs for this. I am working on
a spec now, thinking 
about the best way to support this in Sahara, but here is a semi-manual
intermediate solution
that will work if you would like to run such a job from Sahara.

1) Create your own Oozie sharelib that contains the HBase jars.

This ultimately is just an HDFS dir holding the jars.  On any node in
your cluster with 
HBase installed, run the attached script or something like it (I like
Python better than bash :) )
It simply separates the classpath and uploads all the jars to the
specified HDFS dir.

$ parsePath.py /user/myhbaselib

2) Run your Java action from EDP, but use the oozie.libpath
configuration value when you
launch the job.  For example, on the job configure tab set oozie.libpath
like this:

Name:  oozie.libpath
Value: hdfs://namenode:8020/user/myhbaselib

(note, support for this was added in
https://review.openstack.org/#/c/154214/)

That's it! In general, you can add any jars that you want to a sharelib
and then set the
oozie.libpath for the job to access them.
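For reference, the same setting expressed as an EDP job-launch request body might look like the sketch below. This is an illustration only: the "job_configs"/"configs" field names follow Sahara's usual EDP convention but should be checked against the API docs, and the cluster id and namenode host are placeholders.

```python
import json

# Hypothetical namenode host and sharelib path; substitute your own.
job_configs = {
    "configs": {
        # Point Oozie at the sharelib directory created in step 1.
        "oozie.libpath": "hdfs://namenode:8020/user/myhbaselib",
    },
    "args": [],
}

# Body of a hypothetical job-execute request.
body = json.dumps({"cluster_id": "<cluster-uuid>",
                   "job_configs": job_configs})
```

The important part is simply that oozie.libpath travels with the job's configs, exactly as it would when set on the configure tab.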

Here is a good blog entry about sharelibs and extra jars in Oozie jobs:

http://blog.cloudera.com/blog/2014/05/how-to-use-the-sharelib-in-apache-oozie-cdh-5/

Best,

Trevor

--- original question
(1) EDP job in Java action

   The background is that we want to write integration test cases for
newly added services like HBase and ZooKeeper, just like the
edp-examples do (sample code under sahara/etc/edp-examples/). So I
thought I could write an example EDP job using a Java action to test the
HBase service. I wrote HBaseTest.java, packaged it as a jar file, and
ran the jar manually with the command below; it works well in the VM
(provisioned by Sahara with the CDH plugin).
“/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp HBaseTest.jar:`hbase
classpath` HBaseTest”
So I wanted to run this job via Horizon on the Sahara job execution
page, but found no place to pass the `hbase classpath` parameter (I have
tried java_opts, configuration, and args; all failed). When I pass “-cp
`hbase classpath`” to java_opts on the Horizon job execution page, Oozie
raises the error below.


#!/usr/bin/python
import sys
import os
import subprocess

def main():
    subprocess.Popen("hadoop fs -mkdir %s" % sys.argv[1],
                     shell=True).wait()
    cp, stderr = subprocess.Popen("hbase classpath", shell=True,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE).communicate()
    paths = cp.split(':')
    for p in paths:
        if p.endswith(".jar"):
            print(p)
            subprocess.Popen("hadoop fs -put %s %s"
                             % (os.path.realpath(p), sys.argv[1]),
                             shell=True).wait()

if __name__ == "__main__":
    main()
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [sahara] Spark CDH followup and questions related to DIB

2015-02-02 Thread Trevor McKay
Hello all,

  I tried a Spark image with the cdh5 element Daniele describes below,
but it did not fix the jackson version issue. The spark assembly still
depends on inconsistent versions.

  Looking into the spark git a little bit more, I discovered that in the
cdh5-1.2.0_5.3.0 branch the jackson version is settable. I built spark
on this branch with jackson 1.9.13 and was able to run Spark EDP without
any classpath manipulations. But, it doesn't appear to be released yet.

  A couple questions come out of this:

1) When do we move to cdh5.3 for spark images? Do we try to do this in
Kilo?

The work is already started, as noted below.  Daniele has done initial
work using cdh5 for the spark plugin and the Intel folks are working on 
cdh5 and cdh5.3 for the CDH plugin.

2) Do we carry a Spark assembly for Sahara ourselves, or wait for a
release tarball from CDH that uses this branch and sets a consistent
jackson version?  

I asked about plans to release a tarball from this branch on the
Apache Spark users list, and am waiting for a response.

One alternative is for us to host our own spark build that we can use in
sahara-image-elements. The other idea is for us to wait for a release
tarball at http://archive.apache.org/dist/spark/ and continue to use the
classpath workaround in spark EDP for the time being.

3) Do we fix up sahara-image-elements to support multiple spark
versions? 

Historically sahara-image-elements only supports a single version for
spark images.  This is different from the other plugins.  Since we have
agreed to carry support for a release cycle of older versions after
introducing a new one, should we support both cdh4 and cdh5.x? This will
require changes in diskimage_create.sh.

4) Like #3, do we fix up the spark plugin in Sahara to handle multiple
versions? This is similar to the work the Intel folks are doing now to
separate cdh5 and cdh5.3 code in the cdh plugin.

I am wondering if the above four issues are too much work to add to
kilo-3. Do we make an incremental improvement over Juno, having
spark-swift integration in EDP on cdh4 but without other changes, and
address the above issues in L? Or do we push on and try to resolve it
all for Kilo?

Best regards,

Trevor

On Wed, 2015-01-28 at 11:57 -0500, Trevor McKay wrote:
 Daniele,
 
   Excellent! I'll have to keep a closer eye on bigfoot activity :) I'll
 pursue this.
 
 Best,
 
 Trevor
 
 On Wed, 2015-01-28 at 17:40 +0100, Daniele Venzano wrote:
  Hello everyone,
  
  there is already some code in our repository:
  https://github.com/bigfootproject/savanna-image-elements
  
  I did the necessary changes to have the Spark element use the cdh5
  element. I updated also to Spark 1.2. The old cloudera HDFS-only
  element is still needed for generating cdh4 images (but probably cdh4
  support can be thrown away).
  
  Unfortunately I do not have the time to do the necessary
  testing/validation and submit for review. I also changed the CDH
  element so that it can install only HDFS, if so required.
  The changes I made are simple and all contained in the last commit on
  the master branch of that repo.
  
  The image generated with this code runs in Sahara without any further
  changes. Feel free to take the code, clean it up and submit for review.
  
  Dan
  

Re: [openstack-dev] About Sahara Oozie plan

2015-02-02 Thread Trevor McKay
Hi,

  Thanks for your patience.  I have been consumed with spark-swift, but
I can start to address these questions now :)

On (1) (a) below, I will try to reproduce and look at how we can better
support classpath in EDP. I'll let you know what I find.
We may need to add some configuration options for EDP or change how it
works.

On (1) (b) below, in the edp-move-examples.rst spec for Juno we
described a directory structure that could be used
for separating hadoop1 vs hadoop2 specific directories.  Maybe we can do
something similar based on plugins.

For instance, if we have some hbase examples, we can make subdirectories
for each plugin.  Common parts can be
shared, plugin-specific files can be stored in the subdirectories.

(and perhaps the hadoop2 example already there should just be a
subdirectory under edp-java)

Best,

Trevor

--

Hi McKay,
Thanks for your support.
I will talk through the details of these items below:

(1) EDP job in Java action

   The background is that we want to write integration test cases for
newly added services like HBase and ZooKeeper, just like the
edp-examples do (sample code under sahara/etc/edp-examples/). So I
thought I could write an example EDP job using a Java action to test the
HBase service. I wrote HBaseTest.java, packaged it as a jar file, and
ran the jar manually with the command below; it works well in the VM
(provisioned by Sahara with the CDH plugin).
“/usr/lib/jvm/java-7-oracle-cloudera/bin/java -cp HBaseTest.jar:`hbase
classpath` HBaseTest”
So I wanted to run this job via Horizon on the Sahara job execution
page, but found no place to pass the `hbase classpath` parameter (I have
tried java_opts, configuration, and args; all failed). When I pass “-cp
`hbase classpath`” to java_opts on the Horizon job execution page, Oozie
raises the error below.

“2015-01-15 16:43:26,074 WARN
org.apache.oozie.action.hadoop.JavaActionExecutor:
SERVER[hbase-master-copy-copy-001.novalocal] USER[hdfs] GROUP[-] TOKEN[]
APP[job-wf] JOB[045-150105050354389-oozie-oozi-W]
ACTION[045-150105050354389-oozie-oozi-W@job-node] LauncherMapper
died, check Hadoop LOG for job
[hbase-master-copy-copy-001.novalocal:8032:job_1420434100219_0054]
2015-01-15 16:43:26,172 INFO
org.apache.oozie.command.wf.ActionEndXCommand:
SERVER[hbase-master-copy-copy-001.novalocal] USER[hdfs] GROUP[-] TOKEN[]
APP[job-wf] JOB[045-150105050354389-oozie-oozi-W]
ACTION[045-150105050354389-oozie-oozi-W@job-node] ERROR is
considered as FAILED for SLA”

So I am stuck on this issue; I can't write the integration test in
Sahara (I could not pass the classpath parameter). I have checked the
official Oozie site,
https://cwiki.apache.org/confluence/display/OOZIE/Java+Cookbook, and
found no helpful info.
 
   So about the EDP job in Java, I have two problems right now:
a)  How to pass a classpath to the Java action, as mentioned before.
This also suggests that we could allow the user to modify or upload his
own workflow.xml; then we could provide more options for the user.
b)  I am concerned that it is hard to have a common edp-example for
HBase for all plugins (CDH, HDP), because the example code depends on
third-party jars (for example hbase-client.jar…) and different platforms
(CDH, HDP) may have different versions of hbase-client.jar; for example,
CDH uses hbase-client-0.98.6-cdh5.2.1.jar.

attached is a zip file which contains HBaseTest.jar and the source code.


Re: [openstack-dev] About Sahara Oozie plan

2015-02-02 Thread Trevor McKay
Answers to other questions:

2) (first part) Yes, I think Oozie shell actions are a great idea. I can
help work on a spec for this.

In general, Sahara should be able to support any kind of Oozie action.
Each will require a new job type, changes to the Oozie engine, and a UI
form to handle submission. We talked about shell actions once upon a
time. I don't think a spec for that will be too difficult.

Typically when adding new Oozie actions, I start by running things with
the Oozie command line to figure out what's possible and what the
workflow.xml looks like in general.
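To make that concrete, here is a rough sketch, not Sahara's actual generator, of building a minimal shell-action workflow.xml in Python. The element names follow the Oozie shell action extension; the workflow and action names are made up for illustration.

```python
import xml.etree.ElementTree as ET

def shell_workflow(name, command, args):
    """Build a minimal Oozie shell-action workflow.xml as a string."""
    wf = ET.Element("workflow-app", {"name": name,
                                     "xmlns": "uri:oozie:workflow:0.4"})
    ET.SubElement(wf, "start", {"to": "shell-node"})
    action = ET.SubElement(wf, "action", {"name": "shell-node"})
    shell = ET.SubElement(action, "shell",
                          {"xmlns": "uri:oozie:shell-action:0.1"})
    ET.SubElement(shell, "job-tracker").text = "${jobTracker}"
    ET.SubElement(shell, "name-node").text = "${nameNode}"
    ET.SubElement(shell, "exec").text = command
    for a in args:
        ET.SubElement(shell, "argument").text = a
    ET.SubElement(action, "ok", {"to": "end"})
    ET.SubElement(action, "error", {"to": "fail"})
    fail = ET.SubElement(wf, "kill", {"name": "fail"})
    ET.SubElement(fail, "message").text = "Shell action failed"
    ET.SubElement(wf, "end", {"name": "end"})
    return ET.tostring(wf, encoding="unicode")
```

Something like this would slot into the same place where Sahara's Oozie engine generates workflows for the existing job types.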


We also talked about allowing a user to upload raw workflows -- the
difficulty there is figuring out what Sahara generates vs what the user
writes, so this may be a more complicated topic. I think it will have to
wait for another cycle.

2) (error information)

Yes, the lack of good error information is a big problem in my opinion,
but we have no plan for it at this time.

The OpenStack approach seems to be to look through lots of log files to
identify errors.  For EDP, we may need to support a similar approach by
allowing job logs to be easily retrieved from clusters and written
somewhere a user can parse through them for error information.  Any
ideas on how to do this are welcome.

Trevor

-- 

(2) Sahara oozie plan

So when I searched for a solution for the HBase test case, I found
http://archive.cloudera.com/cdh5/cdh/5/oozie/DG_ShellActionExtension.html ,
which talks about the Oozie shell action job type. I believe my first
issue with the EDP Java action can be solved by a shell action, because
I can set <exec>java</exec> and <argument>`hbase classpath`</argument>
in workflow.xml, just like the way I run this jar in the VM console. So
I raised a blueprint for adding the Oozie shell action:
https://blueprints.launchpad.net/sahara/+spec/add-edp-shell-action  I
will do further research on the bp/specs and update the spec. In today's
meeting you mentioned allowing a user to upload his own workflow.xml; I
am interested in this, and we can contribute to this part, so can you
provide some bp/specs or other docs for me? Then we can discuss further.

Also, is there any plan to provide EDP job error info to the user? I
think this is also important; currently we just have a killed label,
with no more information.




[openstack-dev] [sahara] About Sahara Oozie plan and Spark CDH Issues

2015-01-28 Thread Trevor McKay
Intel folks,

Belated welcome to Sahara!  Thank you for your recent commits.

Moving this thread to openstack-dev so others may contribute, cc'ing
Daniele and Pietro who pioneered the Spark plugin.

I'll respond with another email about Oozie work, but I want to
address the Spark/Swift issue in CDH since I have been working
on it and there is a task which still needs to be done -- that
is to upgrade the CDH version in the spark image and see if
the situation improves (see below)

Relevant reviews are here:

https://review.openstack.org/146659
https://review.openstack.org/147955
https://review.openstack.org/147985
https://review.openstack.org/146659

In the first review, you can see that we set an extra driver
classpath to pull in /usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar.

This is because the spark-assembly JAR in CDH4 contains classes from
jackson-mapper-asl-1.8.8 and jackson-core-asl-1.9.x. When the
hadoop-swift.jar dereferences a Swift path, it calls into code
from jackson-mapper-asl-1.8.8 which uses JsonClass.  But JsonClass
was removed in jackson-core-asl-1.9.x, so there is an exception.

Therefore, we need to use the classpath to either upgrade the version of
jackson-mapper-asl to 1.9.x or downgrade the version of jackson-core-asl
to 1.8.8 (both work in my testing).  However, the first of these options
requires us to bundle an extra jar.  Since /usr/lib/hadoop already
contains jackson-core-asl-1.8.8, it is easier to just add that to the
classpath and downgrade the jackson version.
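This kind of mismatch can also be spotted mechanically. Here is a small sketch (my own, not part of Sahara; it assumes jars follow the usual name-version.jar convention) that flags artifacts appearing with more than one version on a classpath string:

```python
import os
import re

def jar_versions(classpath):
    """Map artifact name -> set of versions found on a classpath string."""
    found = {}
    for entry in classpath.split(":"):
        name = os.path.basename(entry)
        # e.g. "jackson-core-asl-1.8.8.jar" -> ("jackson-core-asl", "1.8.8")
        m = re.match(r"(.+?)-(\d[\w.]*)\.jar$", name)
        if m:
            found.setdefault(m.group(1), set()).add(m.group(2))
    return found

def conflicts(classpath):
    """Return only the artifacts that appear with multiple versions."""
    return {a: v for a, v in jar_versions(classpath).items() if len(v) > 1}
```

Running conflicts() over the driver classpath would have surfaced the jackson-core-asl 1.8.8 vs 1.9.x clash directly.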

Note, there are some references to this problem on the spark mailing list,
we are not the only ones to encounter it.

However, I am not completely comfortable with mixing versions and
patching the classpath this way.  It looks to me like the Spark assembly
used in CDH5 has consistent versions, and I would like to try updating
the CDH version in sahara-image-elements to CDH5 for Spark. If this fixes
the problem and removes the need for the extra classpath, that would be
great.

Would someone like to take on this change? (modifying sahara-image-elements
to use CDH5 for Spark images) I can make a blueprint for
it.

More to come about Oozie topics.

Best regards,

Trevor

On Thu, 2015-01-15 at 15:34 +, Chen, Weiting wrote:
 Hi McKay,
 
 We are the Intel team, contributing to the OpenStack Sahara project.
 
 We are new to Sahara and would like to make more contributions to this
 project.
 
 So far, we are focusing on the Sahara CDH plugin,
 so if there are any issues related to it, please feel free to discuss
 them with us.
 
 During the IRC meeting, there were two issues you mentioned that we
 would like to discuss with you.
 
 1.  Oozie Workflow Support: 
 
 Do you have any plans you could share with us about your ideas?
 
 In our case, we are trying to run a Java action job with
 HBase library support and are also facing some problems with Oozie
 support, so it would be good to share our experiences with each other.
 
 2.  Spark CDH Issues: 
 
 Could you provide more information about this issue? In the CDH plugin,
 we have used CDH 5 to pass the Swift test, so it should be fine to
 upgrade CDH 4 to 5.
 
  
 
 





Re: [openstack-dev] [sahara] About Sahara Oozie plan and Spark CDH Issues

2015-01-28 Thread Trevor McKay
Daniele,

  Excellent! I'll have to keep a closer eye on bigfoot activity :) I'll
pursue this.

Best,

Trevor

On Wed, 2015-01-28 at 17:40 +0100, Daniele Venzano wrote:
 Hello everyone,
 
 there is already some code in our repository:
 https://github.com/bigfootproject/savanna-image-elements
 
 I did the necessary changes to have the Spark element use the cdh5
 element. I updated also to Spark 1.2. The old cloudera HDFS-only
 element is still needed for generating cdh4 images (but probably cdh4
 support can be thrown away).
 
 Unfortunately I do not have the time to do the necessary
 testing/validation and submit for review. I also changed the CDH
 element so that it can install only HDFS, if so required.
 The changes I made are simple and all contained in the last commit on
 the master branch of that repo.
 
 The image generated with this code runs in Sahara without any further
 changes. Feel free to take the code, clean it up and submit for review.
 
 Dan
 





Re: [openstack-dev] [sahara] Asia friendly IRC meeting time

2014-11-25 Thread Trevor McKay
+1 for me, too.  But I agree with Mike -- we need input from Saratov and
similar time zones.

I think though that the folks in Saratov generally try to shift hours
anyway, to better align with the rest of us.  So I think something that
works for the US will likely work for them.
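For what it's worth, the proposed slots are easy to sanity-check with Python's zoneinfo module (available since Python 3.9; this assumes the system tz database is installed, and uses a winter 2015 date, after Moscow's 2014 switch to permanent UTC+3):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Representative zones for the three regions in the proposal.
zones = {"Moscow": "Europe/Moscow",
         "China": "Asia/Shanghai",
         "US West": "America/Los_Angeles"}

def local_times(utc_hour):
    """Convert a UTC meeting hour into local wall-clock hours per region."""
    t = datetime(2015, 1, 15, utc_hour, 0, tzinfo=timezone.utc)
    return {city: t.astimezone(ZoneInfo(tz)).hour
            for city, tz in zones.items()}
```

With these zones, local_times(18) and local_times(0) reproduce the 9pm/2am/10am and 3am/8am/4pm figures in Zhidong's proposal.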

Trev

On Tue, 2014-11-25 at 09:54 -0500, michael mccune wrote:
 On 11/25/2014 02:37 AM, Zhidong Yu wrote:
  I know it's difficult to find a time good for both US, Europe and Asia.
  I suggest we could have two different meeting series with different
  time, i.e. US/Europe meeting this week, US/Asia meeting next week, and
  so on.
 
 i'd be ok with alternating week schedules
 
  My proposal:
  18:00 UTC: Moscow (9pm)   China (2am)   US West (10am)
  00:00 UTC: Moscow (3am)   China (8am)   US West (4pm)
 
 
 this works for me, but i realize it might be difficult for the folks in 
 Russia. not sure if there is a better option though.
 
 thanks for putting this forward Zhidong
 
 regards,
 mike
 


[openstack-dev] [sahara] Discussion on cm_api for global requirement

2014-11-24 Thread Trevor McKay
Hello all,

  at our last Sahara IRC meeting we started discussing whether or not to add
a global requirement for cm_api.py https://review.openstack.org/#/c/130153/

  One issue (but not the only issue) is that cm_api is not packaged for
Fedora, CentOS, or Ubuntu currently. The global requirements README
points out that adding requirements for a new dependency more or less
forces the distros to package the dependency for the next OS release.

  Given that cm_api is needed for a plugin, but not for core Sahara
functionality, should we request that the global requirement be added,
or should we seek to add a global requirement only if/when cm_api is
packaged?

  Alternatively, can we support the plugin with additional documentation
(i.e., how to install cm_api on the Sahara node)?
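One way to keep such a plugin-only dependency out of global requirements is a lazy, guarded import that fails with a pointer to the install docs only when the plugin is actually used. A generic sketch follows; the helper name and error message are illustrative, not Sahara's actual code:

```python
import importlib

def load_optional(module_name, install_hint):
    """Import a plugin-only dependency lazily, with a helpful error.

    Core code never imports the module; only the plugin calls this,
    so users who don't use the plugin never need the package.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise RuntimeError(
            "%s is required by this plugin but is not installed; %s"
            % (module_name, install_hint))
```

A plugin would then call, say, load_optional("cm_api", "see the Sahara docs for installation steps") at validation time rather than at import time.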

  Those present at the meeting agreed that it was probably better to defer a 
global
requirement until/unless cm_api is packaged to avoid a burden on the distros.

  Thoughts?

Best,

Trevor

Minutes: 
http://eavesdrop.openstack.org/meetings/sahara/2014/sahara.2014-11-20-18.01.html
Logs: 
http://eavesdrop.openstack.org/meetings/sahara/2014/sahara.2014-11-20-18.01.log.html
https://github.com/openstack/requirements






Re: [openstack-dev] [sahara] Nominate Michael McCune to sahara-core

2014-11-11 Thread Trevor McKay

+ 2

On 11/11/2014 12:37 PM, Sergey Lukjanov wrote:

Hi folks,

I'd like to propose Michael McCune to sahara-core. He has a good 
knowledge of codebase and implemented important features such as Swift 
auth using trusts. Mike has been consistently giving us very well 
thought out and constructive reviews for Sahara project.


Sahara core team members, please, vote +/- 2.

Thanks.


--
Sincerely yours,
Sergey Lukjanov
Sahara Technical Lead
(OpenStack Data Processing)
Principal Software Engineer
Mirantis Inc.




Re: [openstack-dev] [sahara] Nominate Sergey Reshetniak to sahara-core

2014-11-11 Thread Trevor McKay

+2

On 11/11/2014 12:35 PM, Sergey Lukjanov wrote:

Hi folks,

I'd like to propose Sergey to sahara-core. He's made a lot of work on 
different parts of Sahara and he has a very good knowledge of 
codebase, especially in plugins area.  Sergey has been consistently 
giving us very well thought out and constructive reviews for Sahara 
project.


Sahara core team members, please, vote +/- 2.

Thanks.


--
Sincerely yours,
Sergey Lukjanov
Sahara Technical Lead
(OpenStack Data Processing)
Principal Software Engineer
Mirantis Inc.




Re: [openstack-dev] [sahara] changing host name and /etc/hosts in container

2014-10-30 Thread Trevor McKay
Zhidong,

 Thanks for your question.  I personally don't have an answer, but I
think we definitely should bring up the possibility of dockerization for
Sahara at the design summit next week.  It may be something we want to
formalize for Kilo.  Will you be at the summit?

Just to be clear, are you running Sahara itself in a container, or
launching node instances in containers?

I'll take a look and see if I can find anything useful about the ip
assignment/hostname sequence for node instances during launch.

Best,

Trevor

On Thu, 2014-10-30 at 16:46 +0800, Zhidong Yu wrote:
 Hello hackers,
 
 We are experimenting with Sahara in Docker containers (nova-docker) and
 ran into an issue: Sahara needs to change the host name
 and /etc/hosts, which is not allowed in a container. I am wondering if
 there is any easy way to work around this by hacking into Sahara?
 
 
 thanks, Zhidong
 


Re: [openstack-dev] [Sahara][FFE] Requesting exception for Swift trust authentication blueprint

2014-09-05 Thread Trevor McKay
Not sure how this is done, but I'm a core member for Sahara, and I
hereby sponsor it.

On Fri, 2014-09-05 at 09:57 -0400, Michael McCune wrote:
 hey folks,
 
 I am requesting an exception for the Swift trust authentication blueprint[1]. 
 This blueprint addresses a security bug in Sahara and represents a 
 significant move towards increased security for Sahara clusters. There are 
 several reviews underway[2] with 1 or 2 more starting today or monday.
 
 This feature is initially implemented as optional and as such will have 
 minimal impact on current user deployments. By default it is disabled and 
 requires no additional configuration or management from the end user.
 
 My feeling is that there has been vigorous debate and discussion surrounding 
 the implementation of this blueprint and there is consensus among the team 
 that these changes are needed. The code reviews for the bulk of the work have 
 been positive thus far and I have confidence these patches will be accepted 
 within the next week.
 
 thanks for considering this exception,
 mike
 
 
 [1]: 
 https://blueprints.launchpad.net/sahara/+spec/edp-swift-trust-authentication
 [2]: 
 https://review.openstack.org/#/q/status:open+topic:bp/edp-swift-trust-authentication,n,z
 


Re: [openstack-dev] [sahara] integration tests in python-saharaclient

2014-09-04 Thread Trevor McKay
Yes, I wrote them.  I use them all the time -- no typo that I know of.
They are great for spinning up a cluster and running EDP jobs.

They may need some polish, but the point is to test the whole chain of
operations from the CLI.  This is contrary to what most OpenStack
projects traditionally do -- most CLI testing is only transformation
testing; that is, it tests the output of CLI commands in Tempest but
does not test any kind of integration from the CLI.

Different communities however will have different requirements.  At Red
Hat, for instance, many of our customers rely heavily on the command
line, and our testing includes integration tests from the CLI as the
entry point.  We want this kind of testing.

In fact, in the Icehouse release I found a bug by running the CLI
integration tests.  There was a mismatch between the CLI and Sahara.

These tests are not run in CI currently, however, when/if we end up with
more horsepower in CI they should be.  They should not be deleted.

Best,

Trevor

On Wed, 2014-09-03 at 14:58 -0700, Andrew Lazarev wrote:
 Hi team,
 
 
 Today I've realized that we have some tests called 'integration'
 in python-saharaclient. Also I've found out that Jenkins doesn't use
 them and they can't be run starting from April because of typo in
 tox.ini.
 
 
 Does anyone know what these tests are? Does anyone mind if I delete
 them since we don't use them anyway?
 
 
 Thanks,
 Andrew.


Re: [openstack-dev] [sahara] integration tests in python-saharaclient

2014-09-04 Thread Trevor McKay
by the way, what typo?

Trev

On Wed, 2014-09-03 at 14:58 -0700, Andrew Lazarev wrote:
 Hi team,
 
 
 Today I've realized that we have some tests called 'integration'
 in python-saharaclient. Also I've found out that Jenkins doesn't use
 them and they can't be run starting from April because of a typo in
 tox.ini.
 
 
 Does anyone know what these tests are? Does anyone mind if I delete
 them since we don't use them anyway?
 
 
 Thanks,
 Andrew.
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [sahara] Notes on developing Sahara Spark EDP to work with swift:// paths

2014-08-28 Thread Trevor McKay
Hi folks,

  I've updated this etherpad with notes from an investigation of
Spark/Swift and the hadoop-openstack plugin carried in the sahara-extra
repo.
  
  Following the notes there, I was able to access swift:// paths from
Spark jobs on a Spark standalone cluster launched from Sahara and then
fixed up by hand.

  Comments welcome.  This is a POC at this point, imho; we have work to
do to fully integrate this into Sahara.

https://etherpad.openstack.org/p/sahara_spark_edp
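For reference, a sketch of the core-site properties the hadoop-openstack filesystem driver keys on when resolving swift:// paths. The service name "sahara" and every value below are placeholders for illustration, not what Sahara actually configures:

```python
# Illustrative sketch only: the hadoop-openstack driver resolves
# swift://container.SERVICE/path URLs against properties of the form
# fs.swift.service.SERVICE.*.  The service name "sahara" and all values
# here are assumptions for the example, not Sahara's real defaults.
swift_conf = {
    "fs.swift.impl":
        "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
    "fs.swift.service.sahara.auth.url": "http://keystone:5000/v2.0/tokens",
    "fs.swift.service.sahara.tenant": "demo",
    "fs.swift.service.sahara.username": "hadoop",
    "fs.swift.service.sahara.password": "secret",
    "fs.swift.service.sahara.public": "true",
}

# A path like swift://logs.sahara/input is served by the service named
# after the dot in the container component:
service = "swift://logs.sahara/input".split("//")[1].split("/")[0].split(".")[1]
print(service)  # sahara
```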

Best,

Trevor


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara] Notes on developing Sahara Spark EDP to work with swift:// paths

2014-08-28 Thread Trevor McKay
Gil,

  thanks! I'll take a look.

Trevor

On Thu, 2014-08-28 at 19:31 +0300, Gil Vernik wrote:
 Hi, 
 
 In case this is helpful for you, this is the patch i submitted to
 Spark about Swift and Spark integration ( about to be merged ) 
 https://github.com/apache/spark/pull/1010 
 
 I sent information about this patch to this mailing list about two
 months ago. 
 
 All the best, 
 Gil. 
 
 
 
 
 
 From: Trevor McKay tmc...@redhat.com
 To: OpenStack Development Mailing List openstack-dev@lists.openstack.org
 Date: 28/08/2014 06:22 PM
 Subject: [openstack-dev] [sahara] Notes on developing Sahara Spark EDP
 to work with swift:// paths
 
 __
 
 
 
 Hi folks,
 
  I've updated this etherpad with notes from an investigation of
 Spark/Swift and the hadoop-openstack plugin carried in the
 sahara-extra
 repo.
  
  Following the notes there, I was able to access swift:// paths from
 Spark jobs on a Spark standalone cluster launched from Sahara and then
 fixed up by hand.
 
  Comments welcome.  This is a POC at this point imho, we have work to
 do to fully integrate this into Sahara.
 
 https://etherpad.openstack.org/p/sahara_spark_edp
 
 Best,
 
 Trevor
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Sahara] Swift authentication and backwards compatibility

2014-08-15 Thread Trevor McKay
Thoughts, rapidfire :)

In short, I think we should plan on backward compat unless some stubborn
technical problem gets in our way.

I think backward compatibility is a good idea.  We can make the
user/pass inputs for data objects optional (they are required
currently), maybe even gray them out in the UI with a checkbox to turn
them on, or something like that.

Sahara can detect whether or not the proxy domain is there, and whether
or not it can be created.  If Sahara ends up in a situation where it
thinks user/pass are required, but the data objects don't have them,
we can return a meaningful error.

The job manager can key off of the values supplied for the data source
objects (no user/pass? must be proxy) and/or cluster configs (for
instance, a new cluster config could be added -- if it's absent we
assume old cluster and therefore old hadoop swift plugin).  Workflow
can be generated accordingly.

The hadoop swift plugin can look at the config values provided, as you
noted yesterday, and get auth tokens in either manner.
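A rough sketch of that decision logic in the job manager. The field names and structures here are made up for illustration; Sahara's real data-source and cluster objects differ:

```python
def swift_auth_strategy(data_source, cluster_configs):
    # Hypothetical helper: decide how a workflow should authenticate to
    # Swift.  Field names are illustrative, not Sahara's actual schema.
    creds = data_source.get("credentials", {})
    if creds.get("user") and creds.get("password"):
        return "userpass"   # legacy hadoop swift plugin path
    if cluster_configs.get("proxy_domain"):
        return "proxy"      # new proxy-domain path
    raise ValueError("data source has no credentials and "
                     "no proxy domain is available")

print(swift_auth_strategy({"credentials": {"user": "u", "password": "p"}}, {}))
# userpass
print(swift_auth_strategy({"credentials": {}}, {"proxy_domain": "sahara_proxy"}))
# proxy
```

The last branch is where the meaningful error mentioned above would surface.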

Best,

Trev


On Thu, 2014-08-14 at 22:20 -0400, Michael McCune wrote:
 hello Sahara folks,
 
 I am working to get the revamped spec[1] finalized and I'd like to know the 
 group's thoughts on the idea of backward compatibility. It is possible to 
 implement the new authentication method and remain backward compatible, but 
 we will need to keep the username and password inputs in the Swift data forms.
 
 Having the backward compatibility would also give Sahara a way to react in 
 situations where the proxy domain is not available or the administrator 
 doesn't wish to use it. I'm not sure this is the behavior we want, but I 
 don't know if it is proper for Sahara to exit if no proxy domain can be found.
 
 If we choose not to remain backward compatible then we are requiring Sahara 
 operators to create the new proxy domain needed, and they must update all 
 virtual machine images.
 
 Thoughts?
 
 regards,
 mike
 
 [1]: https://review.openstack.org/#/c/113591/
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Sahara] Error config?

2014-07-09 Thread Trevor McKay
Hi,

  I believe if you look in the sahara log output you may see more
messages about this.  There should be some indicator of what the
validation error was.
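When digging by hand, pulling the last traceback out of the log is usually enough to see which check failed. A minimal sketch (the log path and format are assumptions, not Sahara specifics):

```python
def last_traceback(lines):
    # Minimal sketch: extract the final Python traceback from an iterable
    # of log lines (e.g. open("/var/log/sahara/api.log") -- path assumed).
    block, current = [], None
    for line in lines:
        if line.startswith("Traceback"):
            current = [line]
        elif current is not None:
            current.append(line)
            if not line[:1].isspace():  # the exception line ends the block
                block, current = current, None
    return block or (current or [])

log = [
    "INFO validation started",
    "Traceback (most recent call last):",
    '  File "validate.py", line 10, in check_flavor',
    "    raise ValueError('flavor not found')",
    "ValueError: flavor not found",
    "INFO request finished",
]
print(last_traceback(log)[-1])  # ValueError: flavor not found
```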

Trevor

On Wed, 2014-07-09 at 23:44 +0700, Dat Tran wrote:
 Thank matt,
 
 
 It worked. Can I ask you a question, please?
 
 
 When I create the files ng_master_template_create.json
 and ng_worker_template_create.json and then run:
 http $SAHARA_URL/node-group-templates X-Auth-Token:$AUTH_TOKEN < ng_master_template_create.json
 
 
 Message:
 
 HTTP/1.1 500 INTERNAL SERVER ERROR
 Content-Length: 111
 Content-Type: application/json
 Date: Wed, 09 Jul 2014 16:14:09 GMT
 {
 "error_code": 500,
 "error_message": "Error occurred during validation",
 "error_name": "INTERNAL_SERVER_ERROR"
 }
 
 
 Thank you very very much!
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Sahara] Spark plugin: EDP and Spark jobs

2014-06-06 Thread Trevor McKay
Thanks Daniele,

  This is a good summary (also pasted more or less on the etherpad).

 There is also a third possibility of bypassing the job-server problem
 and call directly Spark commands on the master node of the cluster.

  I am starting to play around with this idea as a simple
proof-of-concept. I think it can be done along with some refactoring in
the Sahara job manager.

  I think the refactoring is viable and can give us something where the
details of run, status, kill can be hidden behind another common
interface.  If this proves to be viable, we can pursue more capable
spark job models next.

We shall see!  Learn by doing.  I should have a CR in a few days.
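The direct-invocation idea can be sketched roughly as building a spark-submit command line for the job manager to run on the master node (e.g. over ssh). The flags and the standalone-master port below are my assumptions for illustration, not a settled design:

```python
def build_spark_submit(master_ip, app_jar, main_class, app_args=()):
    # Hypothetical sketch: the command the job manager would execute on
    # the master node.  spark-submit is the Spark 1.0 launcher script;
    # 7077 is the conventional standalone-master port.
    return ["spark-submit",
            "--class", main_class,
            "--master", "spark://%s:7077" % master_ip,
            app_jar] + list(app_args)

print(" ".join(build_spark_submit("10.0.0.5", "wordcount.jar",
                                  "org.example.WordCount", ["in", "out"])))
# spark-submit --class org.example.WordCount --master spark://10.0.0.5:7077 wordcount.jar in out
```

Status and kill could then be hidden behind the same common interface by tracking the remote process id.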

Best,

Trevor

On Fri, 2014-06-06 at 10:54 +0200, Daniele Venzano wrote:
 Dear all,
 
 A short while ago the Spark plugin for Sahara was merged, opening up the
 possibility of deploying Spark clusters with one click from OpenStack.
 Since Spark is quite different from Hadoop, we need to take a number of
 decisions on how to proceed implementing important features, like, in
 particular, EDP. Spark does not have a built-in job-server and EDP needs
 a way to have a very generic and high level interface to submit, check
 the basic status and kill a job.
 
 In summary, this is our understanding of the current situation:
 1. a quick hack is to use Oozie for application submission (this mimics
 what Cloudera did by the end of last year, when preparing to announce
 the integration of Spark in CDH)
 2. an alternative is to use a spark job-server, which should replace Oozie
 (there is a repo on github from ooyala that implements an instance of a
 job-server)
 
 Here's our view on the points above:
 1. clearly, the first approach is an ugly hack, that creates
 dependencies with Oozie. Oozie requires mapreduce, and uses tiny
 map-only jobs to submit part of a larger workflow. Besides dependencies,
 this is a bazooka to kill a fly, as we're not addressing spark
 application workflows right now
 2. the spark job-server idea is more clean, but the current project from
 Ooyala supports an old version of spark. Spark 1.0.0 (which we have
 already tested in Sahara and that we will commit soon) offers some new
 methods to submit and package applications, that can drastically
 simplify the job-server
 
 As a consequence, the doubt is: do we contribute to that project, create
 a new one, or contribute directly to spark?
 
 A few more points:
 - assuming we have a working prototype of 2), we need to modify the
 Sahara setup such that it deploys, in addition to the usual suspects
 (master and slaves) one more service, the spark job-server
 
 There is also a third possibility of bypassing the job-server problem
 and call directly Spark commands on the master node of the cluster.



 One last observation: currently, spark in standalone mode (that we use
 in the plugin) does not support other schedulers than FIFO, when
 multiple spark applications/jobs are submitted to the cluster. Hence,
 the spark job-server could be a good place to integrate a better job
 scheduler.
 
 Trevor McKay opened a pad here:
 https://etherpad.openstack.org/p/sahara_spark_edp
 
 to gather ideas and feedback. This email is based on the very
 preliminary discussion that happened yesterday via IRC, email and the
 above-mentioned etherpad and has the objective of starting a public
 discussion on how to proceed.
 



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [sahara] Re: Spark plugin: EDP and Spark jobs

2014-06-06 Thread Trevor McKay
(resend with proper subject markers)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [sahara] Etherpad for discussion of spark EDP implementation

2014-06-05 Thread Trevor McKay
Hi folks,

  We've just started an investigation of spark EDP for Sahara. Please
visit the etherpad and share your insights.  Spark experts especially
welcome!

https://etherpad.openstack.org/p/sahara_spark_edp


  There are some design/roadmap decisions we need to make -- how are we
going to do this, in what timeframe, with what steps?  Do we employ a
short term solution and replace long term with something more
supportable, etc.

Best,

Trevor


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara] summit wrap-up: backward compat

2014-05-29 Thread Trevor McKay
Catching up...

On Thu, 2014-05-29 at 15:59 +0400, Alexander Ignatov wrote:
 On 28 May 2014, at 17:14, Sergey Lukjanov slukja...@mirantis.com wrote:
  1. How should we handle addition of new functionality to the API,
  should we bump minor version and just add new endpoints?
 
 Agree with most of folks. No new versions on adding new endpoints. 
 Semantic changes require new major version of rest api.

+1 this and previous comments.  I don't think we'll generate too many
semantic changes (but I could be wrong :) )

I agree with Mike that we should have simple version numbers: v1, v2, v3.

  2. For which period of time should we keep deprecated API and client for it?
 
 One release cycle for deprecation period.

+1.  If we give folks N cycles, they will always wait until the Nth
cycle to move away.  Might as well be 1.

 
  3. How to publish all images and/or keep stability of building images
  for plugins?
  
 
 We should keep all images for all plugins (non-deprecated, as Matt mentioned)
 for each release. In addition we could keep at least one image which could be
 downloaded and used with the master branch of Sahara. Plugin vendors could
 keep their own set of images and we can reflect that in the docs.

I agree with keeping all images grouped with a release for all supported
plugins in that release.

Are we suggesting here that there are 2 places to find images, one in
the Sahara releases and a second in a vendor repo listed in the docs?

 Regards,
 Alexander Ignatov
 
 
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara] summit wrap-up: subprojects

2014-05-29 Thread Trevor McKay
below, sahara-extra

On Wed, 2014-05-28 at 20:02 +0400, Sergey Lukjanov wrote:
 Hey folks,
 
 it's a small wrap-up for the topic Sahara subprojects releasing and
 versioning that was discussed partially on summit and requires some
 more discussions. You can find details in [0].
 
  common
 
 We'll include only one tarball for sahara to the release launchpad
 pages. All other links will be provided in docs.
 
  sahara-dashboard
 
 The merging to Horizon process is now in progress. We've decided that
 j1 is the deadline for merging main code parts and during the j2 all
 the code should be merged into Horizon, so, if in time of j2 we'll
 have some work on merging sahara-dashboard to Horizon not done we'll
 need to fallback to the separated sahara-dashboard repo release for
 Juno cycle and continue merging the code into the Horizon to be able
 to completely kill sahara-dashboard repo in K release.
 
 Where we should keep our UI integration tests?
 
  sahara-image-elements
 
 We're agreed that some common parts should be merged into the
 diskimage-builder repo (like java support, ssh, etc.). The main issue
 of keeping -image-elements separated is how to release them and
 provide mapping sahara version - elements version. You can find
 different options in etherpad [0], I'll write here about the option
 that I think will work best for us.
 
 So, the idea is that sahara-image-elements is a bunch of scripts and
 tools for building images for Sahara. It's highly coupled with plugin
 code in Sahara, so we need to align them well. The current default
 decision is to keep aligned versioning like 2014.1, etc. It'll be
 discussed on the weekly irc team meeting May 29.
 
  sahara-extra
 
 Keep it as is, no need to stop releasing, because we're not publishing
 anything to pypi. No real need for tags.

Even if we keep the repo for now, I think we could simplify a little
bit.  The edp-examples could be moved to the Sahara repo.  Some of those
examples we use in the integration tests anyway -- why have them
duplicated?

 
 
  open questions
 
 If you have any objections for this model, please, share your thoughts
 before June 3 due to the Juno-1 (June 12) to have enough time to apply
 selected approach.
 
 [0] https://etherpad.openstack.org/p/juno-summit-sahara-relmngmt-backward
 
 Thanks.
 



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara] Nominate Trevor McKay for sahara-core

2014-05-21 Thread Trevor McKay
Thank you all,

  I have been away from a computer for a few days post Summit :)

I appreciate your vote of confidence, and I look forward to continued
work on Sahara.  Here's to more Big Data processing on OpenStack!

Best regards,

Trevor

On Mon, 2014-05-19 at 10:13 -0400, Sergey Lukjanov wrote:
 Trevor, congrats!
 
 welcome to the sahara-core.
 
 On Thu, May 15, 2014 at 11:41 AM, Matthew Farrellee m...@redhat.com wrote:
  On 05/12/2014 05:31 PM, Sergey Lukjanov wrote:
 
  Hey folks,
 
  I'd like to nominate Trevor McKay (tmckay) for sahara-core.
 
  He is among the top reviewers of Sahara subprojects. Trevor is working
  on Sahara full time since summer 2013 and is very familiar with
  current codebase. His code contributions and reviews have demonstrated
  a good knowledge of Sahara internals. Trevor has a valuable knowledge
  of EDP part and Hadoop itself. He's working on both bugs and new
  features implementation.
 
  Some links:
 
  http://stackalytics.com/report/contribution/sahara-group/30
  http://stackalytics.com/report/contribution/sahara-group/90
  http://stackalytics.com/report/contribution/sahara-group/180
 
  https://review.openstack.org/#/q/owner:tmckay+sahara+AND+-status:abandoned,n,z
  https://launchpad.net/~tmckay
 
  Sahara cores, please, reply with +1/0/-1 votes.
 
  Thanks.
 
 
  +1
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Tox issues on a clean environment

2014-03-06 Thread Trevor McKay
I am having a very similar issue with horizon, just today. I cloned the
repo and started from scratch on master.

tools/install_venv.py is trying to install cffi as a dependency, and
ultimately fails with

ImportError: cannot import name Feature

This is Fedora 19.  I know some folks on Fedora 20 who are not having
this issue.  I'm guessing it's a version thing...

Trevor

On Thu, 2014-03-06 at 08:14 -0800, Gary Kotton wrote:
 Hi,
 Anyone know how I cam solve the error below:
 
 
   Running setup.py install for jsonpatch
 /usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown
 distribution option: 'entry_poimts'
   warnings.warn(msg)
 changing mode of build/scripts-2.7/jsondiff from 664 to 775
 changing mode of build/scripts-2.7/jsonpatch from 664 to 775
 
 changing mode of /home/gk-dev/nova/.tox/py27/bin/jsonpatch to 775
 changing mode of /home/gk-dev/nova/.tox/py27/bin/jsondiff to 775
   Found existing installation: distribute 0.6.24dev-r0
 Not uninstalling distribute at /usr/lib/python2.7/dist-packages,
 outside environment /home/gk-dev/nova/.tox/py27
   Running setup.py install for setuptools
 
 Installing easy_install script to /home/gk-dev/nova/.tox/py27/bin
 Installing easy_install-2.7 script
 to /home/gk-dev/nova/.tox/py27/bin
   Running setup.py install for mccabe
 
   Running setup.py install for cffi
 Traceback (most recent call last):
   File string, line 1, in module
   File /home/gk-dev/nova/.tox/py27/build/cffi/setup.py, line 94,
 in module
 from setuptools import setup, Feature, Extension
 ImportError: cannot import name Feature
 Complete output from
 command /home/gk-dev/nova/.tox/py27/bin/python2.7 -c import
 setuptools;__file__='/home/gk-dev/nova/.tox/py27/build/cffi/setup.py';exec(compile(open(__file__).read().replace('\r\n',
  '\n'), __file__, 'exec')) install --record 
 /tmp/pip-2sWKRK-record/install-record.txt --single-version-externally-managed 
 --install-headers /home/gk-dev/nova/.tox/py27/include/site/python2.7:
 Traceback (most recent call last):
 
 
   File string, line 1, in module
 
 
   File /home/gk-dev/nova/.tox/py27/build/cffi/setup.py, line 94, in
 module
 
 
 from setuptools import setup, Feature, Extension
 
 
 ImportError: cannot import name Feature
 
 
 
 Cleaning up...
 Command /home/gk-dev/nova/.tox/py27/bin/python2.7 -c import
 setuptools;__file__='/home/gk-dev/nova/.tox/py27/build/cffi/setup.py';exec(compile(open(__file__).read().replace('\r\n',
  '\n'), __file__, 'exec')) install --record 
 /tmp/pip-2sWKRK-record/install-record.txt --single-version-externally-managed 
 --install-headers /home/gk-dev/nova/.tox/py27/include/site/python2.7 failed 
 with error code 1 in /home/gk-dev/nova/.tox/py27/build/cffi
 Traceback (most recent call last):
   File .tox/py27/bin/pip, line 9, in module
 load_entry_point('pip==1.5.4', 'console_scripts', 'pip')()
   File
 /home/gk-dev/nova/.tox/py27/local/lib/python2.7/site-packages/pip/__init__.py,
  line 148, in main
 parser.print_help()
   File
 /home/gk-dev/nova/.tox/py27/local/lib/python2.7/site-packages/pip/basecommand.py,
  line 169, in main
 log_file_fp.write(text)
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position
 72: ordinal not in range(128)
 
 
 ERROR: could not install deps [-r/home/gk-dev/nova/requirements.txt,
 -r/home/gk-dev/nova/test-requirements.txt]
 
 
 Thanks
 Gary
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [savanna] Specific job type for streaming mapreduce? (and someday pipes)

2014-02-05 Thread Trevor McKay
Okay,

  Thanks. I'll make a draft CR that sets up Savanna for dotted names,
and one that uses dotted names with streaming.

Best,

Trevor

On Wed, 2014-02-05 at 15:58 +0400, Sergey Lukjanov wrote:
 I like the dot-separated name. There are several reasons for it:
 
 
 * it'll not require changes in all Savanna subprojects;
 * eventually we'd like to use not only Oozie for EDP (for example, if
 we'll support Twitter Storm) and this new tools could require
 additional 'subtypes'.
 
 
 Thanks for catching this.
 
 
 On Tue, Feb 4, 2014 at 10:47 PM, Trevor McKay tmc...@redhat.com
 wrote:
 Thanks Andrew.
 
 My other thought, which is in between, is to allow dotted
 types.
 MapReduce.streaming for example.
 
 This gives you the subtype flavor but keeps all the APIs the
 same.
 We just need a wrapper function to separate them when we
 compare types.
 
 Best,
 
 Trevor
 
 On Mon, 2014-02-03 at 14:57 -0800, Andrew Lazarev wrote:
  I see two points:
  * having Savanna types mapped to Oozie action types is
 intuitive for
  hadoop users and this is something we would like to keep
  * it is hard to distinguish different kinds of one job type
 
 
  Adding 'subtype' field will solve both problems. Having it
 optional
  will not break backward compatibility. Adding database
 migration
  script is also pretty straightforward.
 
 
  Summarizing, my vote is on subtype field.
 
 
  Thanks,
  Andrew.
 
 
   On Mon, Feb 3, 2014 at 2:10 PM, Trevor McKay tmc...@redhat.com wrote:
 
    I was trying my best to avoid adding extra job types to support
    mapreduce variants like streaming or mapreduce with pipes, but it
    seems that adding the types is the simplest solution.
 
    On the API side, Savanna can live without a specific job type by
    examining the data in the job record.  Presence/absence of certain
    things, or null values, etc, can provide adequate indicators to what
    kind of mapreduce it is.  Maybe a little bit subtle.
 
    But for the UI, it seems that explicit knowledge of what the job is
    makes things easier and better for the user.  When a user creates a
    streaming mapreduce job and the UI is aware of the type later on at
    job launch, the user can be prompted to provide the right configs
    (i.e., the streaming mapper and reducer values).
 
    The explicit job type also supports validation without having to add
    extra flags (which impacts the savanna client, and the JSON, etc).
    For example, a streaming mapreduce job does not require any specified
    libraries so the fact that it is meant to be a streaming job needs to
    be known at job creation time.
 
    So, to that end, I propose that we add a MapReduceStreaming job type,
    and probably at some point we will have MapReducePiped too.  It's
    possible that we might have other job types in the future too as the
    feature set grows.
 
    There was an effort to make Savanna job types parallel Oozie action
    types, but in this case that's just not possible without introducing
    a subtype field in the job record, which leads to a database
    migration script and savanna client changes.
 
    What do you think?
 
    Best,
 
    Trevor
 
    ___
    OpenStack-dev mailing list
    OpenStack-dev@lists.openstack.org
    http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
   ___
   OpenStack-dev mailing list
   OpenStack-dev@lists.openstack.org
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] savanna-ci, Re: [savanna] Alembic migrations and absence of DROP column in sqlite

2014-02-05 Thread Trevor McKay
Hi Sergey,

  Is there a bug or a blueprint for this?  I did a quick search but
didn't see one.

Thanks,

Trevor

On Wed, 2014-02-05 at 16:06 +0400, Sergey Kolekonov wrote:
 I'm currently working on moving savanna-ci to MySQL.
 
 On Wed, Feb 5, 2014 at 3:53 PM, Sergey Lukjanov slukja...@mirantis.com wrote:
  Agreed, let's move on to MySQL for savanna-ci to run integration
  tests against a production-like DB.
 
  On Wed, Feb 5, 2014 at 1:54 AM, Andrew Lazarev alaza...@mirantis.com wrote:
   Since sqlite is not in the list of databases that would be used in
   production, CI should use another DB for testing.
 
   Andrew.
 
   On Tue, Feb 4, 2014 at 1:13 PM, Alexander Ignatov aigna...@mirantis.com wrote:
    Indeed. We should create a bug around that and move our savanna-ci
    to mysql.
 
    Regards,
    Alexander Ignatov
 
    On 05 Feb 2014, at 01:01, Trevor McKay tmc...@redhat.com wrote:
 
     This brings up an interesting problem:
 
     In https://review.openstack.org/#/c/70420/ I've added a migration
     that uses a drop column for an upgrade.
 
     But savanna-ci is apparently using a sqlite database to run.  So
     it can't possibly pass.
 
     What do we do here?  Shift savanna-ci tests to non sqlite?
 
     Trevor
 
     On Sat, 2014-02-01 at 18:17 +0200, Roman Podoliaka wrote:
      Hi all,
 
      My two cents.
 
      2) Extend alembic so that op.drop_column() does the right thing
      We could, but should we?
 
      The only reason alembic doesn't support these operations for
      SQLite yet is that SQLite lacks proper support of the ALTER
      statement. For sqlalchemy-migrate we've been providing a
      work-around in the form of recreating a table and copying all
      existing rows (which is a hack, really).
 
      But to be able to recreate a table, we first must have its
      definition. And we've been relying on SQLAlchemy schema
      reflection facilities for that. Unfortunately, this approach has
      a few drawbacks:
 
      1) SQLAlchemy versions prior to 0.8.4 don't support reflection
      of unique constraints, which means the recreated table won't
      have them;
 
      2) special care must be taken in 'edge' cases (e.g. when you
      want to drop a BOOLEAN column, you must also drop the
      corresponding CHECK (col in (0, 1)) constraint manually, or
      SQLite will raise an error when the table is recreated without
      the column being dropped)
 
      3) special care must be taken for 'custom' type columns (it's
      got better with SQLAlchemy 0.8.x, but e.g. in 0.7.x we had to
      override definitions of reflected BIGINT columns manually for
      each column.drop() call)
 
      4) schema reflection can't be performed when alembic migrations
      are run in 'offline' mode (without connecting to a DB)
      ...
      (probably something else I've forgotten)
 
      So it's totally doable, but, IMO
Re: [openstack-dev] [savanna] Specific job type for streaming mapreduce? (and someday pipes)

2014-02-04 Thread Trevor McKay
Thanks Andrew.

My other thought, which is in between, is to allow dotted types.
MapReduce.streaming for example.

This gives you the subtype flavor but keeps all the APIs the same.
We just need a wrapper function to separate them when we compare types.
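The wrapper could be as small as this sketch; the function names are illustrative, not the eventual Savanna API:

```python
def get_basic_job_type(job_type):
    # "MapReduce.streaming" and "MapReduce" both compare as "MapReduce".
    return job_type.split(".")[0]

def compare_job_type(job_type, *expected):
    # Hypothetical helper: match on the basic type, ignoring any subtype.
    return get_basic_job_type(job_type) in expected

print(compare_job_type("MapReduce.streaming", "MapReduce", "Pig"))  # True
print(compare_job_type("Hive", "MapReduce"))  # False
```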

Best,

Trevor

On Mon, 2014-02-03 at 14:57 -0800, Andrew Lazarev wrote:
 I see two points:
 * having Savanna types mapped to Oozie action types is intuitive for
 hadoop users and this is something we would like to keep
 * it is hard to distinguish different kinds of one job type
 
 
 Adding 'subtype' field will solve both problems. Having it optional
 will not break backward compatibility. Adding database migration
 script is also pretty straightforward.
 
 
 Summarizing, my vote is on subtype field.
 
 
 Thanks,
 Andrew.
 
 
 On Mon, Feb 3, 2014 at 2:10 PM, Trevor McKay tmc...@redhat.com
 wrote:
 
 I was trying my best to avoid adding extra job types to
 support
 mapreduce variants like streaming or mapreduce with pipes, but
 it seems
 that adding the types is the simplest solution.
 
 On the API side, Savanna can live without a specific job type
 by
 examining the data in the job record.  Presence/absence of
 certain
 things, or null values, etc, can provide adequate indicators
 to what
 kind of mapreduce it is.  Maybe a little bit subtle.
 
 But for the UI, it seems that explicit knowledge of what the
 job is
 makes things easier and better for the user.  When a user
 creates a
 streaming mapreduce job and the UI is aware of the type later
 on at job
 launch, the user can be prompted to provide the right configs
 (i.e., the
 streaming mapper and reducer values).
 
 The explicit job type also supports validation without having
 to add
 extra flags (which impacts the savanna client, and the JSON,
 etc). For
 example, a streaming mapreduce job does not require any
 specified
 libraries so the fact that it is meant to be a streaming job
 needs to be
 known at job creation time.
 
 So, to that end, I propose that we add a MapReduceStreaming
 job type,
 and probably at some point we will have MapReducePiped too.
 It's
 possible that we might have other job types in the future too
 as the
 feature set grows.
 
 There was an effort to make Savanna job types parallel Oozie
 action
 types, but in this case that's just not possible without
 introducing a
 subtype field in the job record, which leads to a database
 migration
 script and savanna client changes.
 
 What do you think?
 
 Best,
 
 Trevor
 
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] savanna-ci, Re: [savanna] Alembic migrations and absence of DROP column in sqlite

2014-02-04 Thread Trevor McKay
This brings up an interesting problem:

In https://review.openstack.org/#/c/70420/ I've added a migration that
uses a drop column for an upgrade.

But savanna-ci is apparently using a sqlite database to run.  So it can't
possibly pass.

What do we do here?  Shift savanna-ci tests to non-sqlite?

Trevor

On Sat, 2014-02-01 at 18:17 +0200, Roman Podoliaka wrote:
 Hi all,
 
 My two cents.
 
  2) Extend alembic so that op.drop_column() does the right thing
 We could, but should we?
 
 The only reason alembic doesn't support these operations for SQLite
 yet is that SQLite lacks proper support of ALTER statement. For
 sqlalchemy-migrate we've been providing a work-around in the form of
 recreating of a table and copying of all existing rows (which is a
 hack, really).
 
 But to be able to recreate a table, we first must have its definition.
 And we've been relying on SQLAlchemy schema reflection facilities for
 that. Unfortunately, this approach has a few drawbacks:
 
 1) SQLAlchemy versions prior to 0.8.4 don't support reflection of
 unique constraints, which means the recreated table won't have them;
 
 2) special care must be taken in 'edge' cases (e.g. when you want to
 drop a BOOLEAN column, you must also drop the corresponding CHECK (col
 in (0, 1)) constraint manually, or SQLite will raise an error when the
 table is recreated without the column being dropped)
 
 3) special care must be taken for 'custom' type columns (it's got
 better with SQLAlchemy 0.8.x, but e.g. in 0.7.x we had to override
 definitions of reflected BIGINT columns manually for each
 column.drop() call)
 
 4) schema reflection can't be performed when alembic migrations are
 run in 'offline' mode (without connecting to a DB)
 ...
 (probably something else I've forgotten)
 
 So it's totally doable, but, IMO, there is no real benefit in
 supporting running of schema migrations for SQLite.
 
  ...attempts to drop schema generation based on models in favor of migrations
 
 As long as we have a test that checks that the DB schema obtained by
 running of migration scripts is equal to the one obtained by calling
 metadata.create_all(), it's perfectly OK to use model definitions to
 generate the initial DB schema for running of unit-tests as well as
 for new installations of OpenStack (and this is actually faster than
 running of migration scripts). ... and if we have strong objections
 against doing metadata.create_all(), we can always use migration
 scripts for both new installations and upgrades for all DB backends,
 except SQLite.
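The "schema from migrations equals schema from metadata.create_all()" check described above can be sketched with nothing but the stdlib; the helper name and the single-table schemas are illustrative only:

```python
import sqlite3

def schema_signature(conn):
    """Map each table name to its ordered list of column names."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' "
        "AND name NOT LIKE 'sqlite_%'")]
    return {t: [c[1] for c in conn.execute("PRAGMA table_info(%s)" % t)]
            for t in tables}

# One schema built "by migration scripts", one "from the models";
# the check passes only if both yield identical tables and columns.
migrated = sqlite3.connect(":memory:")
migrated.execute("CREATE TABLE jobs (id INTEGER, name TEXT)")

from_models = sqlite3.connect(":memory:")
from_models.execute("CREATE TABLE jobs (id INTEGER, name TEXT)")

assert schema_signature(migrated) == schema_signature(from_models)
```

A real test would also compare types, constraints, and indexes, but the table/column comparison already catches the most common drift.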
 
 Thanks,
 Roman
 
 On Sat, Feb 1, 2014 at 12:09 PM, Eugene Nikanorov
 enikano...@mirantis.com wrote:
  Boris,
 
  Sorry for the offtopic.
  Is switching to model-based schema generation is something decided? I see
  the opposite: attempts to drop schema generation based on models in favor of
  migrations.
  Can you point to some discussion threads?
 
  Thanks,
  Eugene.
 
 
 
  On Sat, Feb 1, 2014 at 2:19 AM, Boris Pavlovic bpavlo...@mirantis.com
  wrote:
 
  Jay,
 
  Yep we shouldn't use migrations for sqlite at all.
 
  The major issue that we have now is that we are not able to ensure that the DB
  schema created by migrations & models are the same (actually they are not the same).
 
  So before dropping support of migrations for sqlite & switching to model-based
  schema creation we should add tests that will check that models & migrations
  are synced.
  (we are working on this)
 
 
 
  Best regards,
  Boris Pavlovic
 
 
  On Fri, Jan 31, 2014 at 7:31 PM, Andrew Lazarev alaza...@mirantis.com
  wrote:
 
  Trevor,
 
  Such check could be useful on alembic side too. Good opportunity for
  contribution.
 
  Andrew.
 
 
  On Fri, Jan 31, 2014 at 6:12 AM, Trevor McKay tmc...@redhat.com wrote:
 
  Okay,  I can accept that migrations shouldn't be supported on sqlite.
 
  However, if that's the case then we need to fix up savanna-db-manage so
  that it checks the db connection info and throws a polite error to the
  user for attempted migrations on unsupported platforms. For example:
 
  Database migrations are not supported for sqlite
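A guard like the one requested above is tiny; this is a hypothetical sketch (the function name and error type are illustrative, not savanna-db-manage's actual code):

```python
def check_migration_support(db_url):
    # Refuse to run migrations against SQLite up front, instead of letting
    # a confusing SQL syntax error surface mid-migration.
    if db_url.startswith("sqlite"):
        raise RuntimeError("Database migrations are not supported for sqlite")

check_migration_support("mysql://savanna@localhost/savanna")  # fine
try:
    check_migration_support("sqlite:////tmp/savanna.db")
except RuntimeError as e:
    print(e)  # Database migrations are not supported for sqlite
```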
 
  Because, as a developer, when I see a sql error trace as the result of
  an operation I assume it's broken :)
 
  Best,
 
  Trevor
 
  On Thu, 2014-01-30 at 15:04 -0500, Jay Pipes wrote:
   On Thu, 2014-01-30 at 14:51 -0500, Trevor McKay wrote:
I was playing with alembic migration and discovered that
op.drop_column() doesn't work with sqlite.  This is because sqlite
doesn't support dropping a column (broken imho, but that's another
discussion).  Sqlite throws a syntax error.
   
To make this work with sqlite, you have to copy the table to a
temporary
excluding the column(s) you don't want and delete the old one,
followed
by a rename of the new table.
   
The existing 002 migration uses op.drop_column(), so I'm assuming
it's
broken, too (I need to check what the migration test is doing).  I
was
working on an 003.
   
How do we want

[openstack-dev] [savanna] Specific job type for streaming mapreduce? (and someday pipes)

2014-02-03 Thread Trevor McKay

I was trying my best to avoid adding extra job types to support
mapreduce variants like streaming or mapreduce with pipes, but it seems
that adding the types is the simplest solution.

On the API side, Savanna can live without a specific job type by
examining the data in the job record.  Presence/absence of certain
things, or null values, etc, can provide adequate indicators to what
kind of mapreduce it is.  Maybe a little bit subtle.

But for the UI, it seems that explicit knowledge of what the job is
makes things easier and better for the user.  When a user creates a
streaming mapreduce job and the UI is aware of the type later on at job
launch, the user can be prompted to provide the right configs (i.e., the
streaming mapper and reducer values).

The explicit job type also supports validation without having to add
extra flags (which impacts the savanna client, and the JSON, etc). For
example, a streaming mapreduce job does not require any specified
libraries so the fact that it is meant to be a streaming job needs to be
known at job creation time.

So, to that end, I propose that we add a MapReduceStreaming job type,
and probably at some point we will have MapReducePiped too. It's
possible that we might have other job types in the future too as the
feature set grows.

There was an effort to make Savanna job types parallel Oozie action
types, but in this case that's just not possible without introducing a
subtype field in the job record, which leads to a database migration
script and savanna client changes.

What do you think?

Best,

Trevor



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [savanna] Choosing provisioning engine during cluster launch

2014-01-30 Thread Trevor McKay
My mistake, it's already there.  I missed the distinction between set on
startup and set per cluster.

Trev

On Thu, 2014-01-30 at 10:50 -0500, Trevor McKay wrote:
 +1
 
 How about an undocumented config?
 
 Trev
 
 On Thu, 2014-01-30 at 09:24 -0500, Matthew Farrellee wrote:
  i imagine this is something that can be useful in a development and 
  testing environment, especially during the transition period from direct 
  to heat. so having the ability is not unreasonable, but i wouldn't 
  expose it to users via the dashboard (maybe not even directly in the cli)
  
  generally i want to reduce the number of parameters / questions the user 
  is asked
  
  best,
  
  
  matt
  
  On 01/30/2014 04:42 AM, Dmitry Mescheryakov wrote:
   I agree with Andrew. I see no value in letting users select how their
   cluster is provisioned, it will only make interface a little bit more
   complex.
  
   Dmitry
  
  
   2014/1/30 Andrew Lazarev alaza...@mirantis.com
  
   Alexander,
  
   What is the purpose of exposing this to user side? Both engines must
   do exactly the same thing and they exist in the same time only for
   transition period until heat engine is stabilized. I don't see any
   value in proposed option.
  
   Andrew.
  
  
   On Wed, Jan 29, 2014 at 8:44 PM, Alexander Ignatov
aigna...@mirantis.com wrote:
  
   Today Savanna has two provisioning engines, heat and old one
   known as 'direct'.
   Users can choose which engine will be used by setting special
   parameter in 'savanna.conf'.
  
   I have an idea to give an ability for users to define
   provisioning engine
   not only when savanna is started but when new cluster is
   launched. The idea is simple.
   We will just add new field 'provisioning_engine' to 'cluster'
   and 'cluster_template'
   objects. And profit is obvious, users can easily switch from one
   engine to another without
   restarting savanna service. Of course, this parameter can be
   omitted and the default value
   from the 'savanna.conf' will be applied.
  
   Is this viable? What do you think?
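The proposed per-cluster selection with a savanna.conf fallback amounts to a one-line resolution rule; a hypothetical sketch (class and field names are illustrative):

```python
class Cluster(object):
    # Stand-in for the cluster object with the proposed optional field.
    def __init__(self, provisioning_engine=None):
        self.provisioning_engine = provisioning_engine

def get_engine(cluster, conf_default="direct"):
    # The per-cluster value wins; when omitted, fall back to the
    # engine configured in savanna.conf.
    return cluster.provisioning_engine or conf_default

assert get_engine(Cluster("heat")) == "heat"
assert get_engine(Cluster()) == "direct"
```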
  
   Regards,
   Alexander Ignatov
  
  
  
  
   ___
   OpenStack-dev mailing list
   OpenStack-dev@lists.openstack.org
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
  
  
  
   ___
   OpenStack-dev mailing list
   OpenStack-dev@lists.openstack.org
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
  
  
  
  
   ___
   OpenStack-dev mailing list
   OpenStack-dev@lists.openstack.org
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
  
  
  
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [savanna] Choosing provisioning engine during cluster launch

2014-01-30 Thread Trevor McKay
+1

How about an undocumented config?

Trev

On Thu, 2014-01-30 at 09:24 -0500, Matthew Farrellee wrote:
 i imagine this is something that can be useful in a development and 
 testing environment, especially during the transition period from direct 
 to heat. so having the ability is not unreasonable, but i wouldn't 
 expose it to users via the dashboard (maybe not even directly in the cli)
 
 generally i want to reduce the number of parameters / questions the user 
 is asked
 
 best,
 
 
 matt
 
 On 01/30/2014 04:42 AM, Dmitry Mescheryakov wrote:
  I agree with Andrew. I see no value in letting users select how their
  cluster is provisioned, it will only make interface a little bit more
  complex.
 
  Dmitry
 
 
  2014/1/30 Andrew Lazarev alaza...@mirantis.com
 
  Alexander,
 
  What is the purpose of exposing this to user side? Both engines must
  do exactly the same thing and they exist in the same time only for
  transition period until heat engine is stabilized. I don't see any
  value in proposed option.
 
  Andrew.
 
 
  On Wed, Jan 29, 2014 at 8:44 PM, Alexander Ignatov
aigna...@mirantis.com wrote:
 
  Today Savanna has two provisioning engines, heat and old one
  known as 'direct'.
  Users can choose which engine will be used by setting special
  parameter in 'savanna.conf'.
 
  I have an idea to give an ability for users to define
  provisioning engine
  not only when savanna is started but when new cluster is
  launched. The idea is simple.
  We will just add new field 'provisioning_engine' to 'cluster'
  and 'cluster_template'
  objects. And profit is obvious, users can easily switch from one
  engine to another without
  restarting savanna service. Of course, this parameter can be
  omitted and the default value
  from the 'savanna.conf' will be applied.
 
  Is this viable? What do you think?
 
  Regards,
  Alexander Ignatov
 
 
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [savanna] Alembic migrations and absence of DROP column in sqlite

2014-01-30 Thread Trevor McKay

I was playing with alembic migration and discovered that
op.drop_column() doesn't work with sqlite.  This is because sqlite
doesn't support dropping a column (broken imho, but that's another
discussion).  Sqlite throws a syntax error.

To make this work with sqlite, you have to copy the table to a temporary
excluding the column(s) you don't want and delete the old one, followed
by a rename of the new table.
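The copy/rename dance can be demonstrated end to end with the stdlib sqlite3 module; the table and column names here are purely illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT, java_opts TEXT);
    INSERT INTO jobs (name, java_opts) VALUES ('wordcount', '-Xmx512m');

    -- 1) create a replacement table without the unwanted column
    CREATE TABLE jobs_tmp (id INTEGER PRIMARY KEY, name TEXT);
    -- 2) copy the surviving columns across
    INSERT INTO jobs_tmp (id, name) SELECT id, name FROM jobs;
    -- 3) drop the old table and rename the replacement
    DROP TABLE jobs;
    ALTER TABLE jobs_tmp RENAME TO jobs;
""")
cols = [row[1] for row in conn.execute("PRAGMA table_info(jobs)")]
```

After the script runs, `cols` contains only the surviving columns, which is exactly what op.drop_column() would have produced on a backend with full ALTER support.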

The existing 002 migration uses op.drop_column(), so I'm assuming it's
broken, too (I need to check what the migration test is doing).  I was
working on an 003.

How do we want to handle this?  Three good options I can think of:

1) don't support migrations for sqlite (I think no, but maybe)

2) Extend alembic so that op.drop_column() does the right thing (more
open-source contributions for us, yay :) )

3) Add our own wrapper in savanna so that we have a drop_column() method
that wraps copy/rename.

Ideas, comments?

Best,

Trevor


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [savanna] How to handle diverging EDP job configuration settings

2014-01-29 Thread Trevor McKay
On Wed, 2014-01-29 at 14:35 +0400, Alexander Ignatov wrote:
 Thank you for bringing this up, Trevor.
 
 EDP gets more diverse and it's time to change its model.
 I totally agree with your proposal, but one minor comment.
 Instead of the savanna. prefix in job_configs, wouldn't it be better to make it
 edp.? I think savanna. is too broad a term for this.

+1, brilliant. EDP is perfect.  I was worried about the scope of
savanna. too.

 And one more bureaucratic thing... I see you already started implementing it [1],
 and it is named and tracked as a new EDP workflow [2]. I think a new blueprint
 should be created for this feature to track all code changes as well as docs updates.
 Docs I mean public Savanna docs about EDP, rest api docs and samples.

Absolutely, I can make it new blueprint.  Thanks.

 [1] https://review.openstack.org/#/c/69712
 [2] 
 https://blueprints.launchpad.net/openstack/?searchtext=edp-oozie-streaming-mapreduce
 
 Regards,
 Alexander Ignatov
 
 
 
 On 28 Jan 2014, at 20:47, Trevor McKay tmc...@redhat.com wrote:
 
  Hello all,
  
  In our first pass at EDP, the model for job settings was very consistent
  across all of our job types. The execution-time settings fit into this
  (superset) structure:
  
  job_configs = {'configs': {}, # config settings for oozie and hadoop
 'params': {},  # substitution values for Pig/Hive
 'args': []}# script args (Pig and Java actions)
  
  But we have some things that don't fit (and probably more in the
  future):
  
  1) Java jobs have 'main_class' and 'java_opts' settings
Currently these are handled as additional fields added to the
  structure above.  These were the first to diverge.
  
  2) Streaming MapReduce (anticipated) requires mapper and reducer
  settings (different than the mapred..class settings for
  non-streaming MapReduce)
  
  Problems caused by adding fields
  
  The job_configs structure above is stored in the database. Each time we
  add a field to the structure above at the level of configs, params, and
  args, we force a change to the database tables, a migration script and a
  change to the JSON validation for the REST api.
  
  We also cause a change for python-savannaclient and potentially other
  clients.
  
  This kind of change seems bad.
  
  Proposal: Borrow a page from Oozie and add savanna. configs
  -
  I would like to fit divergent job settings into the structure we already
  have.  One way to do this is to leverage the 'configs' dictionary.  This
  dictionary primarily contains settings for hadoop, but there are a
  number of oozie.xxx settings that are passed to oozie as configs or
  set by oozie for the benefit of running apps.
  
  What if we allow savanna. settings to be added to configs?  If we do
  that, any and all special configuration settings for specific job types
  or subtypes can be handled with no database changes and no api changes.
  
  Downside
  
  Currently, all 'configs' are rendered in the generated oozie workflow.
  The savanna. settings would be stripped out and processed by Savanna,
  thereby changing that behavior a bit (maybe not a big deal)
  
  We would also be mixing savanna. configs with config_hints for jobs,
  so users would potentially see savanna. settings mixed with oozie
  and hadoop settings.  Again, maybe not a big deal, but it might blur the
  lines a little bit.  Personally, I'm okay with this.
  
  Slightly different
  --
  We could also add a 'savanna-configs': {} element to job_configs to
  keep the configuration spaces separate.
  
  But, now we would have 'savanna-configs' (or another name), 'configs',
  'params', and 'args'.  Really? Just how many different types of values
  can we come up with? :)
  
  I lean away from this approach.
  
  Related: breaking up the superset
  -
  
  It is also the case that not every job type has every value type.
  
               Configs   Params   Args
   Hive           Y         Y       N
   Pig            Y         Y       Y
   MapReduce      Y         N       N
   Java           Y         N       Y
  
  So do we make that explicit in the docs and enforce it in the api with
  errors?
  
  Thoughts? I'm sure there are some :)
  
  Best,
  
  Trevor
  
  
  
  
  
  
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [savanna] How to handle diverging EDP job configuration settings

2014-01-29 Thread Trevor McKay
So, assuming we go forward with this, the followup question is whether
or not to move main_class and java_opts for Java actions into
edp.java.main_class and edp.java.java_opts configs.

I think yes.

Best,

Trevor

On Wed, 2014-01-29 at 09:15 -0500, Trevor McKay wrote:
 On Wed, 2014-01-29 at 14:35 +0400, Alexander Ignatov wrote:
  Thank you for bringing this up, Trevor.
  
  EDP gets more diverse and it's time to change its model.
  I totally agree with your proposal, but one minor comment.
  Instead of the savanna. prefix in job_configs, wouldn't it be better to make it
  edp.? I think savanna. is too broad a term for this.
 
 +1, brilliant. EDP is perfect.  I was worried about the scope of
 savanna. too.
 
  And one more bureaucratic thing... I see you already started implementing it [1],
  and it is named and tracked as a new EDP workflow [2]. I think a new blueprint
  should be created for this feature to track all code changes as well as docs updates.
  Docs I mean public Savanna docs about EDP, rest api docs and samples.
 
 Absolutely, I can make it new blueprint.  Thanks.
 
  [1] https://review.openstack.org/#/c/69712
  [2] 
  https://blueprints.launchpad.net/openstack/?searchtext=edp-oozie-streaming-mapreduce
  
  Regards,
  Alexander Ignatov
  
  
  
  On 28 Jan 2014, at 20:47, Trevor McKay tmc...@redhat.com wrote:
  
   Hello all,
   
   In our first pass at EDP, the model for job settings was very consistent
   across all of our job types. The execution-time settings fit into this
   (superset) structure:
   
   job_configs = {'configs': {}, # config settings for oozie and hadoop
'params': {},  # substitution values for Pig/Hive
'args': []}# script args (Pig and Java actions)
   
   But we have some things that don't fit (and probably more in the
   future):
   
   1) Java jobs have 'main_class' and 'java_opts' settings
 Currently these are handled as additional fields added to the
   structure above.  These were the first to diverge.
   
   2) Streaming MapReduce (anticipated) requires mapper and reducer
   settings (different than the mapred..class settings for
   non-streaming MapReduce)
   
   Problems caused by adding fields
   
   The job_configs structure above is stored in the database. Each time we
   add a field to the structure above at the level of configs, params, and
   args, we force a change to the database tables, a migration script and a
   change to the JSON validation for the REST api.
   
   We also cause a change for python-savannaclient and potentially other
   clients.
   
   This kind of change seems bad.
   
   Proposal: Borrow a page from Oozie and add savanna. configs
   -
   I would like to fit divergent job settings into the structure we already
   have.  One way to do this is to leverage the 'configs' dictionary.  This
   dictionary primarily contains settings for hadoop, but there are a
   number of oozie.xxx settings that are passed to oozie as configs or
   set by oozie for the benefit of running apps.
   
   What if we allow savanna. settings to be added to configs?  If we do
   that, any and all special configuration settings for specific job types
   or subtypes can be handled with no database changes and no api changes.
   
   Downside
   
   Currently, all 'configs' are rendered in the generated oozie workflow.
   The savanna. settings would be stripped out and processed by Savanna,
   thereby changing that behavior a bit (maybe not a big deal)
   
   We would also be mixing savanna. configs with config_hints for jobs,
   so users would potentially see savanna. settings mixed with oozie
   and hadoop settings.  Again, maybe not a big deal, but it might blur the
   lines a little bit.  Personally, I'm okay with this.
   
   Slightly different
   --
   We could also add a 'savanna-configs': {} element to job_configs to
   keep the configuration spaces separate.
   
   But, now we would have 'savanna-configs' (or another name), 'configs',
   'params', and 'args'.  Really? Just how many different types of values
   can we come up with? :)
   
   I lean away from this approach.
   
   Related: breaking up the superset
   -
   
   It is also the case that not every job type has every value type.
   
                Configs   Params   Args
    Hive           Y         Y       N
    Pig            Y         Y       Y
    MapReduce      Y         N       N
    Java           Y         N       Y
   
   So do we make that explicit in the docs and enforce it in the api with
   errors?
   
   Thoughts? I'm sure there are some :)
   
   Best,
   
   Trevor
   
   
   
   
   
   
   ___
   OpenStack-dev mailing list
   OpenStack-dev@lists.openstack.org
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

[openstack-dev] [savanna] Undoing a change in the alembic migrations

2014-01-29 Thread Trevor McKay
Hi Sergey,

  In https://review.openstack.org/#/c/69982/1 we are moving the
'main_class' and 'java_opts' fields for a job execution into the
job_configs['configs'] dictionary.  This means that 'main_class' and
'java_opts' don't need to be in the database anymore.

  These fields were just added in the initial version of the migration
scripts.  The README says that migrations work from icehouse. Since
this is the initial script, does that mean we can just remove references
to those fields from the db models and the script, or do we need a new
migration script (002) to erase them?

Thanks,

Trevor


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [savanna] How to handle diverging EDP job configuration settings

2014-01-28 Thread Trevor McKay
Hello all,

In our first pass at EDP, the model for job settings was very consistent
across all of our job types. The execution-time settings fit into this
(superset) structure:

job_configs = {'configs': {}, # config settings for oozie and hadoop
   'params': {},  # substitution values for Pig/Hive
   'args': []}# script args (Pig and Java actions)

But we have some things that don't fit (and probably more in the
future):

1) Java jobs have 'main_class' and 'java_opts' settings
   Currently these are handled as additional fields added to the
structure above.  These were the first to diverge.

2) Streaming MapReduce (anticipated) requires mapper and reducer
settings (different than the mapred..class settings for
non-streaming MapReduce)

Problems caused by adding fields

The job_configs structure above is stored in the database. Each time we
add a field to the structure above at the level of configs, params, and
args, we force a change to the database tables, a migration script and a
change to the JSON validation for the REST api.

We also cause a change for python-savannaclient and potentially other
clients.

This kind of change seems bad.

Proposal: Borrow a page from Oozie and add savanna. configs
-
I would like to fit divergent job settings into the structure we already
have.  One way to do this is to leverage the 'configs' dictionary.  This
dictionary primarily contains settings for hadoop, but there are a
number of oozie.xxx settings that are passed to oozie as configs or
set by oozie for the benefit of running apps.

What if we allow savanna. settings to be added to configs?  If we do
that, any and all special configuration settings for specific job types
or subtypes can be handled with no database changes and no api changes.
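Stripping the prefixed settings back out before workflow generation is a small dictionary split; a hypothetical sketch (the helper name and the streaming config key are illustrative):

```python
def split_special_configs(configs, prefix="savanna."):
    # Pull prefixed settings out of the configs dict so Savanna can
    # process them, leaving only hadoop/oozie settings for the workflow.
    special = {k: v for k, v in configs.items() if k.startswith(prefix)}
    hadoop = {k: v for k, v in configs.items() if not k.startswith(prefix)}
    return special, hadoop

job_configs = {'configs': {'savanna.edp.streaming.mapper': '/bin/cat',
                           'mapred.reduce.tasks': '2'},
               'params': {}, 'args': []}
special, hadoop = split_special_configs(job_configs['configs'])
```

Only `hadoop` would be rendered into the oozie workflow; `special` stays on the Savanna side.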

Downside

Currently, all 'configs' are rendered in the generated oozie workflow.
The savanna. settings would be stripped out and processed by Savanna,
thereby changing that behavior a bit (maybe not a big deal)

We would also be mixing savanna. configs with config_hints for jobs,
so users would potentially see savanna. settings mixed with oozie
and hadoop settings.  Again, maybe not a big deal, but it might blur the
lines a little bit.  Personally, I'm okay with this.

Slightly different
--
We could also add a 'savanna-configs': {} element to job_configs to
keep the configuration spaces separate.

But, now we would have 'savanna-configs' (or another name), 'configs',
'params', and 'args'.  Really? Just how many different types of values
can we come up with? :)

I lean away from this approach.

Related: breaking up the superset
-

It is also the case that not every job type has every value type.

             Configs   Params   Args
Hive            Y         Y       N
Pig             Y         Y       Y
MapReduce       Y         N       N
Java            Y         N       Y

So do we make that explicit in the docs and enforce it in the api with
errors?
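Enforcing the table above in the API could look like this (a sketch; the dict and function are hypothetical, mirroring the table rather than actual Savanna validation code):

```python
# Which job_configs sections each job type accepts, per the table above.
ALLOWED = {
    'Hive':      {'configs', 'params'},
    'Pig':       {'configs', 'params', 'args'},
    'MapReduce': {'configs'},
    'Java':      {'configs', 'args'},
}

def validate_job_configs(job_type, job_configs):
    for key, value in job_configs.items():
        # Empty sections are tolerated; only non-empty disallowed ones error.
        if value and key not in ALLOWED[job_type]:
            raise ValueError("%s jobs do not accept '%s'" % (job_type, key))

validate_job_configs('Java', {'configs': {'x': 1}, 'params': {}, 'args': ['a']})
```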

Thoughts? I'm sure there are some :)

Best,

Trevor



  


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [savanna] savannaclient v2 api

2014-01-27 Thread Trevor McKay
We should consider turning mains into a string instead of a list for
v2.

Hive and Pig Oozie actions use mains, and each may only specify a single
script element.  There is no utility afaik for having multiple mains
associated with a job.

Workflows with multiple actions might need multiple mains, but those
haven't been designed at this point.
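If v2 does switch mains to a string, a compatibility shim could keep accepting the v1 list form; a hypothetical sketch:

```python
def normalize_mains(mains):
    # Accept either the old list form or the proposed single string.
    # With no multi-action workflows designed yet, more than one main
    # is treated as an error rather than silently dropped.
    if isinstance(mains, list):
        if len(mains) > 1:
            raise ValueError("only one main script is supported")
        return mains[0] if mains else None
    return mains

assert normalize_mains(["wordcount.pig"]) == "wordcount.pig"
assert normalize_mains("wordcount.pig") == "wordcount.pig"
```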

Best,

Trevor

On Tue, 2014-01-14 at 12:24 -0500, Matthew Farrellee wrote:
 https://blueprints.launchpad.net/savanna/+spec/v2-api
 
 I've finished a review of the v1.0 and v1.1 APIs with an eye to making 
 them more consistent and RESTful.
 
 Please use this thread to comment on my suggestions for v1.0 & v1.1, or 
 to make further suggestions.
 
 Best,
 
 
 matt
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [savanna] why swift-internal:// ?

2014-01-24 Thread Trevor McKay
Matt et al,

  Yes, swift-internal was meant as a marker to distinguish it from
swift-external someday. I agree, this could be indicated by setting 
other fields.

Little bit of implementation detail for scope:

  In the current EDP implementation, SWIFT_INTERNAL_PREFIX shows up in
essentially two places.  One is validation (pretty easy to change).

  The other is in Savanna's binary_retrievers module where, as others
suggested, the auth url (proto, host, port, api) and admin tenant from
the savanna configuration are used with the user/passw to make a
connection through the swift client.

  Handling of different types of job binaries is done in
binary_retrievers/dispatch.py, where the URL determines the treatment.
This could easily be extended to look at other indicators.
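The prefix-based dispatch described above boils down to something like this (a sketch; the function names are illustrative stand-ins, not the actual binary_retrievers API):

```python
def _get_from_db(url):
    # Stub standing in for the savanna-db retriever.
    return "db-bytes"

def _get_from_swift(url):
    # Stub standing in for the swift client retriever, which would use
    # the configured auth url and admin tenant plus the user/passw.
    return "swift-bytes"

def get_raw_binary(url):
    # The URL scheme determines the treatment; other indicators (e.g. an
    # external-swift marker) could be checked here as well.
    if url.startswith("savanna-db://"):
        return _get_from_db(url)
    if url.startswith("swift-internal://"):
        return _get_from_swift(url)
    raise ValueError("Unknown job binary scheme: %s" % url)
```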

Best,

Trev

On Fri, 2014-01-24 at 07:50 -0500, Matthew Farrellee wrote:
 andrew,
 
 what about having swift:// which defaults to the configured tenant and 
 auth url for what we now call swift-internal, and we allow for user 
 input to change tenant and auth url for what would be swift-external?
 
 in fact, we may need to add the tenant selection in icehouse. it's a 
 pretty big limitation to only allow a single tenant.
 
 best,
 
 
 matt
 
 On 01/23/2014 11:15 PM, Andrew Lazarev wrote:
  Matt,
 
  For swift-internal we are using the same keystone (and identity protocol
  version) as for savanna. Also savanna admin tenant is used.
 
  Thanks,
  Andrew.
 
 
  On Thu, Jan 23, 2014 at 6:17 PM, Matthew Farrellee m...@redhat.com wrote:
 
  what makes it internal vs external?
 
  swift-internal needs user & pass
 
  swift-external needs user & pass & ?auth url?
 
  best,
 
 
  matt
 
  On 01/23/2014 08:43 PM, Andrew Lazarev wrote:
 
  Matt,
 
  I can easily imagine situation when job binaries are stored in
  external
  HDFS or external SWIFT (like data sources). Internal and
  external swifts
  are different since we need additional credentials.
 
  Thanks,
  Andrew.
 
 
   On Thu, Jan 23, 2014 at 5:30 PM, Matthew Farrellee m...@redhat.com wrote:
 
   trevor,
 
   job binaries are stored in swift or an internal savanna db,
   represented by swift-internal:// and savanna-db://
  respectively.
 
   why swift-internal:// and not just swift://?
 
   fyi, i see mention of a potential future version of savanna w/
   swift-external://
 
   best,
 
 
   matt
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 





[openstack-dev] Savanna EDP sequence diagrams added for discussion...

2013-07-18 Thread Trevor McKay
Hi all,

  Here is a page to hold sequence diagrams for Savanna EDP, 
based on current launchpad blueprints.  We thought it might be helpful to 
create some diagrams for discussion as the component specs are written and the
API is worked out:

  https://wiki.openstack.org/wiki/Savanna/EDP_Sequences

  (The main page for EDP is here https://wiki.openstack.org/wiki/Savanna/EDP )

  There is an initial sequence there, along with a link to the source 
for generating the PNG with PlantUML.  Feedback would be great, either 
through IRC, email, comments on the wiki, or by modifying 
the sequence and/or posting additional sequences.

  The sequences can be generated/modified easily with PlantUML, which 
installs as a single jar file:

  http://plantuml.sourceforge.net/download.html
 
  java -jar plantuml.jar

  Choose the directory that contains PlantUML text files and it will
monitor, generate, and update PNGs as you save/modify text files. I thought
it was broken the first time I ran it because there are no controls :)
Very simple.
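For reference, a minimal PlantUML sequence source looks like the following. The participants and messages here are purely illustrative placeholders, not the actual EDP sequences posted on the wiki:

```
@startuml
user -> savanna_api : submit job
savanna_api -> edp_manager : create job execution
edp_manager --> savanna_api : execution record
savanna_api --> user : accepted
@enduml
```

Save a file like this in the watched directory and PlantUML will render and keep updating the corresponding PNG as you edit.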

Best,

Trevor




Re: [openstack-dev] [Savanna-all] Savanna EDP sequence diagrams added for discussion...

2013-07-18 Thread Trevor McKay
fyi, updates to the diagram based on feedback


