Re: [openstack-dev] [Savanna] Spark plugin status

2014-01-19 Thread Matthew Farrellee

On 01/10/2014 04:05 AM, Daniele Venzano wrote:

On 01/09/14 19:12, Matthew Farrellee wrote:

This is definitely great news!

+2 to the things Sergey mentioned below.

Additionally, will you fill out the blueprint or wiki w/ details that
will help others write integration tests for your plugin?


We have already implemented at least part of the integration tests for
Spark, mimicking the ones provided with the Vanilla plugin. The
Spark plugin works almost exactly like the Vanilla one: it can install a
datanode, namenode, Spark master or Spark worker and resize the cluster.
What kind of documentation is needed?


That's great. Documentation of how and when to use the plugin would be helpful.



And, did you integrate (or have plans to integrate) Spark into the EDP
workflows in Horizon?


We would like to have that functionality. Currently we are limited by
the lack of a Swift service in our cluster. We will have one test
installation in a short while and then we will see. What is the status
of the HDFS datasource? We are very interested in that, but I lost track
of the development during the holidays.


It's coming along well. You could ping tmckay or croberts on #savanna to 
get specifics.


Best,


matt


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Savanna] Spark plugin status

2014-01-10 Thread Sergey Lukjanov
Answers inlined.


On Fri, Jan 10, 2014 at 1:05 PM, Daniele Venzano wrote:

> On 01/09/14 19:12, Matthew Farrellee wrote:
>
>> This is definitely great news!
>>
>> +2 to the things Sergey mentioned below.
>>
>> Additionally, will you fill out the blueprint or wiki w/ details that
>> will help others write integration tests for your plugin?
>>
>
> We have already implemented at least part of the integration tests for
> Spark, mimicking the ones provided with the Vanilla plugin. The
> Spark plugin works almost exactly like the Vanilla one: it can install a
> datanode, namenode, Spark master or Spark worker and resize the cluster.
> What kind of documentation is needed?


[SL] Are you installing HDFS too? I think some docs about how your
plugin works and about Spark's requirements would be great.


>
>
>
>  And, did you integrate (or have plans to integrate) Spark into the EDP
>> workflows in Horizon?
>>
>
> We would like to have that functionality. Currently we are limited by the
> lack of a Swift service in our cluster. We will have one test installation
> in a short while and then we will see. What is the status of the HDFS
> datasource? We are very interested in that, but I lost track of the
> development during the holidays.


Is it possible to run Spark workloads using Oozie? Here is the external
HDFS support change request - https://review.openstack.org/#/c/47828/.


>
>
>
>
>  On 01/09/2014 03:41 AM, Sergey Lukjanov wrote:
>>
>>> Hi,
>>>
>>> I'm really glad to hear that!
>>>
>>> Answers inlined.
>>>
>>> Thanks.
>>>
>>>
>>> On Thu, Jan 9, 2014 at 11:33 AM, Daniele Venzano
>>> <daniele.venz...@eurecom.fr> wrote:
>>>
>>> Hello,
>>>
>>> we are finishing up the development of the Spark plugin for Savanna.
>>> In the next few days we will deploy it on an OpenStack cluster with
>>> real users to iron out the last few things. Hopefully next week we
>>> will put the code on a public github repository in beta status.
>>>
>>> [SL] Awesome! Could you please share some info about this installation,
>>> if possible? For example: OpenStack cluster version and size, Savanna
>>> version, expected Spark cluster sizes and lifecycle, etc.
>>>
>>>
>>> You can find the blueprint here:
>>> https://blueprints.launchpad.net/savanna/+spec/spark-plugin
>>> 
>>>
>>> There are two things we need to release, the VM image and the code
>>> itself.
>>> For the image we created one ourselves and for the code we used the
>>> Vanilla plugin as a base.
>>>
>>> [SL] You can use diskimage-builder [0] to prepare such images; we're
>>> already using it to build images for the vanilla plugin [1].
>>>
>>>
>>> We feel that our work could be interesting for others and we would
>>> like to see it integrated in Savanna. What is the best way to
>>> proceed?
>>>
>>> [SL] Absolutely, it's a very interesting tool for data processing. IMO
>>> the best way is to create a change request to savanna for code review
>>> and discussion in gerrit; that will really be the most effective way to
>>> collaborate. As for the best way to integrate with Savanna, we're
>>> expecting to see it in the openstack/savanna repo, like the vanilla, HDP
>>> and IDH (which will land soon) plugins.
>>>
>>>
>>> We did not follow the Gerrit workflow until now because development
>>> happened internally.
>>> I will prepare the repo on github with git-review and reference the
>>> blueprint in the commit. After that, would you prefer that I send the
>>> code for review immediately, or should I send a link here on the
>>> mailing list first for some feedback/discussion?
>>>
>>> [SL] It'll be better to immediately send the code for review.
>>>
>>>
>>> Thank you,
>>> Daniele Venzano, Hoang Do and Vo Thanh Phuc
>>>
>>>
>>>
>>> [0] https://github.com/openstack/diskimage-builder
>>> [1] https://github.com/openstack/savanna-image-elements
>>>
>>> Please feel free to ping me if you need any help with gerrit or savanna
>>> internals.
>>>
>>> Thanks.
>>>
>>> --
>>> Sincerely yours,
>>> Sergey Lukjanov
>>> Savanna Technical Lead
>>> Mirantis Inc.
>>>
>>>

Re: [openstack-dev] [Savanna] Spark plugin status

2014-01-10 Thread Daniele Venzano

On 01/09/14 19:12, Matthew Farrellee wrote:

This is definitely great news!

+2 to the things Sergey mentioned below.

Additionally, will you fill out the blueprint or wiki w/ details that
will help others write integration tests for your plugin?


We have already implemented at least part of the integration tests for
Spark, mimicking the ones provided with the Vanilla plugin. The
Spark plugin works almost exactly like the Vanilla one: it can install a
datanode, namenode, Spark master or Spark worker and resize the cluster.

What kind of documentation is needed?



And, did you integrate (or have plans to integrate) Spark into the EDP
workflows in Horizon?


We would like to have that functionality. Currently we are limited by 
the lack of a Swift service in our cluster. We will have one test 
installation in a short while and then we will see. What is the status 
of the HDFS datasource? We are very interested in that, but I lost track 
of the development during the holidays.





On 01/09/2014 03:41 AM, Sergey Lukjanov wrote:

Hi,

I'm really glad to hear that!

Answers inlined.

Thanks.


On Thu, Jan 9, 2014 at 11:33 AM, Daniele Venzano
<daniele.venz...@eurecom.fr> wrote:

Hello,

we are finishing up the development of the Spark plugin for Savanna.
In the next few days we will deploy it on an OpenStack cluster with
real users to iron out the last few things. Hopefully next week we
will put the code on a public github repository in beta status.

[SL] Awesome! Could you please share some info about this installation,
if possible? For example: OpenStack cluster version and size, Savanna
version, expected Spark cluster sizes and lifecycle, etc.


You can find the blueprint here:
https://blueprints.launchpad.net/savanna/+spec/spark-plugin


There are two things we need to release, the VM image and the code
itself.
For the image we created one ourselves and for the code we used the
Vanilla plugin as a base.

[SL] You can use diskimage-builder [0] to prepare such images; we're
already using it to build images for the vanilla plugin [1].


We feel that our work could be interesting for others and we would
like to see it integrated in Savanna. What is the best way to
proceed?

[SL] Absolutely, it's a very interesting tool for data processing. IMO
the best way is to create a change request to savanna for code review
and discussion in gerrit; that will really be the most effective way to
collaborate. As for the best way to integrate with Savanna, we're
expecting to see it in the openstack/savanna repo, like the vanilla, HDP
and IDH (which will land soon) plugins.


We did not follow the Gerrit workflow until now because development
happened internally.
I will prepare the repo on github with git-review and reference the
blueprint in the commit. After that, would you prefer that I send the
code for review immediately, or should I send a link here on the
mailing list first for some feedback/discussion?

[SL] It'll be better to immediately send the code for review.


Thank you,
Daniele Venzano, Hoang Do and Vo Thanh Phuc




[0] https://github.com/openstack/diskimage-builder
[1] https://github.com/openstack/savanna-image-elements

Please feel free to ping me if you need any help with gerrit or savanna
internals.

Thanks.

--
Sincerely yours,
Sergey Lukjanov
Savanna Technical Lead
Mirantis Inc.




Re: [openstack-dev] [Savanna] Spark plugin status

2014-01-09 Thread Matthew Farrellee

This is definitely great news!

+2 to the things Sergey mentioned below.

Additionally, will you fill out the blueprint or wiki w/ details that 
will help others write integration tests for your plugin?


And, did you integrate (or have plans to integrate) Spark into the EDP 
workflows in Horizon?


Best,


matt

On 01/09/2014 03:41 AM, Sergey Lukjanov wrote:

Hi,

I'm really glad to hear that!

Answers inlined.

Thanks.


On Thu, Jan 9, 2014 at 11:33 AM, Daniele Venzano
<daniele.venz...@eurecom.fr> wrote:

Hello,

we are finishing up the development of the Spark plugin for Savanna.
In the next few days we will deploy it on an OpenStack cluster with
real users to iron out the last few things. Hopefully next week we
will put the code on a public github repository in beta status.

[SL] Awesome! Could you please share some info about this installation,
if possible? For example: OpenStack cluster version and size, Savanna
version, expected Spark cluster sizes and lifecycle, etc.


You can find the blueprint here:
https://blueprints.launchpad.net/savanna/+spec/spark-plugin


There are two things we need to release, the VM image and the code
itself.
For the image we created one ourselves and for the code we used the
Vanilla plugin as a base.

[SL] You can use diskimage-builder [0] to prepare such images; we're
already using it to build images for the vanilla plugin [1].


We feel that our work could be interesting for others and we would
like to see it integrated in Savanna. What is the best way to proceed?

[SL] Absolutely, it's a very interesting tool for data processing. IMO
the best way is to create a change request to savanna for code review
and discussion in gerrit; that will really be the most effective way to
collaborate. As for the best way to integrate with Savanna, we're
expecting to see it in the openstack/savanna repo, like the vanilla, HDP
and IDH (which will land soon) plugins.


We did not follow the Gerrit workflow until now because development
happened internally.
I will prepare the repo on github with git-review and reference the
blueprint in the commit. After that, would you prefer that I send the
code for review immediately, or should I send a link here on the
mailing list first for some feedback/discussion?

[SL] It'll be better to immediately send the code for review.


Thank you,
Daniele Venzano, Hoang Do and Vo Thanh Phuc




[0] https://github.com/openstack/diskimage-builder
[1] https://github.com/openstack/savanna-image-elements

Please feel free to ping me if you need any help with gerrit or savanna
internals.

Thanks.

--
Sincerely yours,
Sergey Lukjanov
Savanna Technical Lead
Mirantis Inc.




Re: [openstack-dev] [Savanna] Spark plugin status

2014-01-09 Thread Daniele Venzano

On 01/09/14 09:41, Sergey Lukjanov wrote:


On Thu, Jan 9, 2014 at 11:33 AM, Daniele Venzano
<daniele.venz...@eurecom.fr> wrote:

Hello,

we are finishing up the development of the Spark plugin for Savanna.
In the next few days we will deploy it on an OpenStack cluster with
real users to iron out the last few things. Hopefully next week we
will put the code on a public github repository in beta status.

[SL] Awesome! Could you please share some info about this installation,
if possible? For example: OpenStack cluster version and size, Savanna
version, expected Spark cluster sizes and lifecycle, etc.


As part of the Bigfoot project that is funding us 
(http://bigfootproject.eu/) we have a research OpenStack cluster with 6 
compute nodes, hopefully with more coming. The machines have 16 CPUs, 32 
with hyperthreading, and 128GB of RAM.
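
For a sense of scale, the aggregate capacity of the cluster described above can be worked out with a short sketch. It assumes all six compute nodes are identical, which the message implies but does not state; the class and function names are hypothetical and used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeNode:
    """Per-node figures quoted in the message (assumed identical on every node)."""
    physical_cpus: int = 16      # "16 CPUs"
    hardware_threads: int = 32   # "32 with hyperthreading"
    ram_gb: int = 128            # "128GB of RAM"


def cluster_totals(node: ComputeNode, count: int) -> dict:
    """Multiply the per-node figures by the number of compute nodes."""
    return {
        "physical_cpus": node.physical_cpus * count,
        "hardware_threads": node.hardware_threads * count,
        "ram_gb": node.ram_gb * count,
    }


# Six compute nodes, as stated in the message.
totals = cluster_totals(ComputeNode(), 6)
print(totals)
```

Under that assumption the six nodes provide 96 CPUs (192 hardware threads) and 768 GB of RAM in total, before any hypervisor overhead or overcommit settings, which the message does not describe.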


OpenStack is the Ubuntu cloud version (Grizzly 2013.1.4), but Horizon 
and Keystone are on the latest Havana branch versions. It uses KVM and 
the openvswitch plugin for networking.


For Savanna, we stayed with a version from git that was working for us, 
after 0.3, but now a couple of months old. Part of the work I need to do 
is merging with the current Savanna master branch.


We have five users that are interested in running Spark jobs and at 
least one has already been doing so on the Bigfoot platform with a 
cluster created by hand.
We will start with two of them and then let in the others. One will use 
a small cluster with 3 nodes, the other with about ten nodes.
We also plan to run a few tests with various sizes of clusters, mainly 
to measure performance in various conditions.



[SL] You can use diskimage-builder [0] to prepare such images; we're
already using it to build images for the vanilla plugin [1].


Yes, I had a quick look and from what I understand we will need to 
modify the scripts that build the images. We will make a separate change 
request for that.



[SL] Absolutely, it's a very interesting tool for data processing. IMO
the best way is to create a change request to savanna for code review
and discussion in gerrit; that will really be the most effective way to
collaborate. As for the best way to integrate with Savanna, we're
expecting to see it in the openstack/savanna repo, like the vanilla, HDP
and IDH (which will land soon) plugins.


Nice! I will contact you when I am ready to create the github repo, so 
that I do it right for the review process.


Thanks,
Daniele



Re: [openstack-dev] [Savanna] Spark plugin status

2014-01-09 Thread Sergey Lukjanov
Hi,

I'm really glad to hear that!

Answers inlined.

Thanks.


On Thu, Jan 9, 2014 at 11:33 AM, Daniele Venzano wrote:

> Hello,
>
> we are finishing up the development of the Spark plugin for Savanna.
> In the next few days we will deploy it on an OpenStack cluster with real
> users to iron out the last few things. Hopefully next week we will put the
> code on a public github repository in beta status.
>
[SL] Awesome! Could you please share some info about this installation,
if possible? For example: OpenStack cluster version and size, Savanna
version, expected Spark cluster sizes and lifecycle, etc.

>
> You can find the blueprint here:
> https://blueprints.launchpad.net/savanna/+spec/spark-plugin
>
> There are two things we need to release, the VM image and the code itself.
> For the image we created one ourselves and for the code we used the
> Vanilla plugin as a base.
>
[SL] You can use diskimage-builder [0] to prepare such images; we're
already using it to build images for the vanilla plugin [1].

>
> We feel that our work could be interesting for others and we would like to
> see it integrated in Savanna. What is the best way to proceed?
>
[SL] Absolutely, it's a very interesting tool for data processing. IMO the
best way is to create a change request to savanna for code review and
discussion in gerrit; that will really be the most effective way to
collaborate. As for the best way to integrate with Savanna, we're
expecting to see it in the openstack/savanna repo, like the vanilla, HDP and
IDH (which will land soon) plugins.

>
> We did not follow the Gerrit workflow until now because development
> happened internally.
> I will prepare the repo on github with git-review and reference the
> blueprint in the commit. After that, would you prefer that I send the code
> for review immediately, or should I send a link here on the mailing list
> first for some feedback/discussion?
>
[SL] It'll be better to immediately send the code for review.

>
> Thank you,
> Daniele Venzano, Hoang Do and Vo Thanh Phuc
>
>


[0] https://github.com/openstack/diskimage-builder
[1] https://github.com/openstack/savanna-image-elements

Please feel free to ping me if you need any help with gerrit or savanna
internals.

Thanks.

-- 
Sincerely yours,
Sergey Lukjanov
Savanna Technical Lead
Mirantis Inc.


[openstack-dev] [Savanna] Spark plugin status

2014-01-08 Thread Daniele Venzano

Hello,

we are finishing up the development of the Spark plugin for Savanna.
In the next few days we will deploy it on an OpenStack cluster with real 
users to iron out the last few things. Hopefully next week we will put 
the code on a public github repository in beta status.


You can find the blueprint here:
https://blueprints.launchpad.net/savanna/+spec/spark-plugin

There are two things we need to release, the VM image and the code itself.
For the image we created one ourselves and for the code we used the 
Vanilla plugin as a base.


We feel that our work could be interesting for others and we would like 
to see it integrated in Savanna. What is the best way to proceed?


We did not follow the Gerrit workflow until now because development 
happened internally.
I will prepare the repo on github with git-review and reference the 
blueprint in the commit. After that, would you prefer that I send the
code for review immediately, or should I send a link here on the
mailing list first for some feedback/discussion?


Thank you,
Daniele Venzano, Hoang Do and Vo Thanh Phuc
