Summary of IRC meeting in #cloudstack-meeting, Mon Jan 28 19:01:27 2013

ASF IRC Services Mon, 28 Jan 2013 12:13:38 -0800

Members present: edison_cs, topcloud, jburwell

----------------
Meeting summary:
----------------


1. Preface

IRC log follows:


# 1. Preface #
19:02:50 [edison_cs]: jburwell: about the storage stuff, I am trying to hook up 
the interfaces listed in my design document into mgt server, made a lot of 
changes to the existing mgt server code
19:02:59 [edison_cs]: I'll start a new branch based on javelin
19:03:27 [edison_cs]: maybe by looking at the code, we can understand what we 
are talking about 
19:05:12 [edison_cs]: I think we are doing the same thing: clean up current 
storage code, make it more extensible etc, and the API proposed by both of us 
are very similar. 
19:07:27 [edison_cs]: Alex and I both still quite understand, why the api needs 
to have input/output stream
19:07:34 [edison_cs]: not understand..
19:09:57 [jburwell]: the issue is how do we acquire data
19:10:12 [jburwell]: for example, downloading a template from a URL
19:10:21 [jburwell]: simply providing a URI on that operation is incomplete
19:10:27 [edison_cs]: I know, but that's at the provision side
19:10:42 [edison_cs]: what I am talking about is at the orchestration side, on 
the mgt server
19:10:45 [jburwell]: we need some way of saying, write the stream of data 
(input stream) to this location (URI)
19:10:57 [jburwell]: please define orchestration
19:11:13 [edison_cs]: if there is a create api coming in from api server into 
mgt server
19:11:28 [jburwell]: create what?
19:11:34 [edison_cs]: mgt server needs to find out proper storage provider to 
create volume
19:11:42 [edison_cs]: let's say create volume api
19:11:57 [edison_cs]: mgt server needs to maintain volume state in the db
19:12:04 [jburwell]: for create a volume, a stream may not be necessary
19:12:06 [edison_cs]: then call the proper storage provider to create that 
volume
19:12:13 [jburwell]: for other operations, streams are necessary
19:12:34 [edison_cs]: inside that storage provider, the implementation, can 
send a command to hypervisor resource
19:12:44 [edison_cs]: or it can directly call storage box api 
19:13:13 [jburwell]: I think it difficult to follow the conversation if we 
switch examples often
19:13:20 [jburwell]: I would like to stick to the download template example
19:13:27 [edison_cs]: ok
19:13:35 [jburwell]: because it exemplifies the issues I encountered 
implementing S3
19:13:50 [jburwell]: I am concerned with operations where data is actually 
being transferred
19:14:04 [edison_cs]: can you give an example of how create template works in 
your api?
19:14:12 [edison_cs]: how it integrates with mgt server
19:14:27 [jburwell]: I have posted pseudo code for that operation on the 
mailing list
19:15:34 [edison_cs]: as I said, I am ok with your pseudo code, as it's at the 
provision level
19:15:42 [jburwell]: http://markmail.org/message/adck5mkb3izg6khi
19:16:06 [jburwell]: it seems that provisioning is a logical operation, correct?
19:16:12 [edison_cs]: I don't care how provision works, I care more about 
how mgt server calls your s3 provider
19:16:44 [jburwell]: in this case, I expect the StorageDevice to be invoked in 
the SSVM
19:17:00 [jburwell]: but as I have said, the abstraction shouldn't care about 
where it is executed
19:17:12 [edison_cs]: but mgt server do need to know and manage s3 provider 
19:17:12 [jburwell]: the mgmt server should make that decision
19:17:34 [edison_cs]: for example, a create template api coming in
19:17:43 [jburwell]: it just needs to say, "I have an operation to perform and 
it will be executed {locally, ssvm}."
19:17:43 [edison_cs]: how the mgt server will call into your s3 provider
19:18:14 [edison_cs]: if there are multiple object storage provider
19:18:27 [edison_cs]: mgt server do need to select one  of them, right
19:18:42 [edison_cs]: mgt server need to allow you to register a provider 
19:18:44 [jburwell]: by virtue of the storable interface, the template itself 
will indicate which device it should be stored on
19:19:04 [jburwell]: I would like to keep registration of drivers/types out of 
this conversation
19:19:12 [edison_cs]: ooops
19:19:12 [jburwell]: as I see that as a wider discussion for CS
19:19:19 [edison_cs]: that's what all I am doing
19:19:19 [jburwell]: let's just assume that all of the types are registered ...
19:19:42 [edison_cs]: how to register a provider, how mgt server call into 
provider
19:19:57 [edison_cs]: how to manage provider in the database
19:20:04 [jburwell]: I think the registration of types for storage, 
hypervisors, network device, etc should be unified
19:20:15 [edison_cs]: that's what I mean by orchestration
19:20:27 [jburwell]: I am concerned with what operations the driver provides
19:20:42 [jburwell]: and ensuring that we can interact with block and object 
stores without the need of an intermediate file system
19:20:43 [edison_cs]: as I said, there only 5-7 operations a provider will 
provide
19:20:57 [edison_cs]: create/copy/delete
19:21:12 [jburwell]: read and list are also required
19:21:28 [edison_cs]: list is required
19:21:42 [edison_cs]: but the api is not input/output stream
19:21:57 [edison_cs]: for example, there is a s3 provider registered
19:22:12 [jburwell]: how would we read or write data without an 
input/outputstream?
19:22:12 [edison_cs]: which has an api called create(object sth)
19:22:29 [edison_cs]: when register template api coming
19:22:50 [edison_cs]: which specifies which provider to register with
19:22:59 [jburwell]: a template should have an association to the storage 
device on which it resides
19:23:12 [jburwell]: ditto with a volume, iso, and snapshot
19:23:14 [edison_cs]: registertemplate(string providerUUId, string 
url-of-template)
19:23:45 [edison_cs]: then mgt server will find out the corresponding provider's 
create api, then call it
19:24:04 [edison_cs]: inside provider's implementation, you can send a command 
to ssvm
19:24:12 [edison_cs]: or send a command to hypervisor host 
19:24:13 [jburwell]: but how will the template data actually be given to the 
device in that create?
19:24:19 [edison_cs]: to register the template
19:24:42 [edison_cs]: that depends on provider's implementation
19:24:42 [jburwell]: there is the registration of the template but there is 
also the acquisition of the associated data (i.e. the x mb file that is the 
actual template)
19:25:05 [jburwell]: with just a UUID string and a logical URI how is the 
storage provider going to acquire the data?
19:25:51 [edison_cs]: storage provider can parse the uri, something like, 
http://some-where/some-path
19:26:12 [edison_cs]: then grab that uri with s3 client api
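
The orchestration-side lookup edison_cs describes here (find the provider by 
its UUID, then call its create api) might be sketched in Java roughly as 
follows. This is only an illustration under assumptions: StorageProvider, 
ProviderRegistry, and the String return value are hypothetical names and 
choices, not CloudStack code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical provider hook; what create() does internally (command to the
// SSVM, to a hypervisor host, direct storage API call) is up to the provider.
interface StorageProvider {
    // Returns an opaque reference to the created object (an assumption here).
    String create(String templateUrl);
}

// Hypothetical registry kept by the management server.
class ProviderRegistry {
    private final Map<String, StorageProvider> providers = new HashMap<>();

    void register(String providerUuid, StorageProvider provider) {
        providers.put(providerUuid, provider);
    }

    // Mirrors registertemplate(providerUUId, url-of-template) from the log:
    // look up the provider, then delegate to its create api.
    String registerTemplate(String providerUuid, String templateUrl) {
        StorageProvider provider = providers.get(providerUuid);
        if (provider == null) {
            throw new IllegalArgumentException("unknown provider: " + providerUuid);
        }
        return provider.create(templateUrl);
    }

    public static void main(String[] args) {
        ProviderRegistry registry = new ProviderRegistry();
        // Two toy providers that only tag the URL they were asked to fetch.
        registry.register("s3-uuid", url -> "s3:" + url);
        registry.register("swift-uuid", url -> "swift:" + url);
        System.out.println(
            registry.registerTemplate("s3-uuid", "http://some-where/some-path"));
    }
}
```

The sketch leaves open the question jburwell raises next: how the template 
data itself reaches the device.
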
19:26:19 [jburwell]: so how does the storage provider know how CS will 
reference it later?
19:26:35 [jburwell]: we need an operation that says, "Write this data to this 
URI"
19:26:58 [edison_cs]: let me give an example about how that works
19:27:19 [jburwell]: i.e take the contents of the input stream and write it to 
the physical location represented by "/template/tmpl/2/200"
19:27:27 [edison_cs]: if mgt server wants download s3 url into a nfs storage
19:27:34 [jburwell]: with that type of interface
19:27:49 [jburwell]: we can unify almost all inter device transfers
19:28:07 [edison_cs]: the provider will have an api called: copy(dataobject 
src-something, dataobject dest-something)
19:28:34 [edison_cs]: src-something.touri will return a URI representing a s3 url
19:28:49 [edison_cs]: dest-something will represent a nfs://balala-/path
19:29:04 [edison_cs]: then inside provider's copy api
19:29:13 [edison_cs]: it will send a copy command to ssvm
19:29:27 [edison_cs]: inside ssvm, it will decode both of the two uris
19:29:42 [jburwell]: fundamentally, I do not think that a storage provider 
should know anything about the SSVM
19:29:42 [edison_cs]: download s3 uri to nfs
19:29:50 [jburwell]: that should be the management server's decision to 
determine where an operation will be executed
19:29:57 [edison_cs]: as I said, the provider is in the mgt server side
19:30:15 [jburwell]: so now S3 needs to know about every other possible driver?
19:30:27 [jburwell]: and every driver needs to implement a basic i/o stream 
copy operation?
19:30:28 [edison_cs]: it's hook, so provider can implement whatever strategy 
they want
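
The stream-based operation jburwell argues for ("write this data to this 
URI") could look roughly like the Java sketch below. It is a minimal 
illustration, not the API from either design document: 
StreamingStorageDevice, InMemoryDevice, and DeviceTransfer are hypothetical 
names, and the in-memory device only stands in for S3, NFS, or a block store.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; the names are illustrative, not CloudStack's API.
interface StreamingStorageDevice {
    // Write the stream's contents to the physical location the logical URI maps to.
    void write(URI logicalUri, InputStream data) throws IOException;

    // Open the object stored under the logical URI for reading.
    InputStream read(URI logicalUri) throws IOException;
}

// Toy in-memory device standing in for S3, NFS, a block store, and so on.
class InMemoryDevice implements StreamingStorageDevice {
    private final Map<URI, byte[]> objects = new HashMap<>();

    @Override
    public void write(URI logicalUri, InputStream data) throws IOException {
        objects.put(logicalUri, DeviceTransfer.drain(data));
    }

    @Override
    public InputStream read(URI logicalUri) throws IOException {
        byte[] bytes = objects.get(logicalUri);
        if (bytes == null) {
            throw new IOException("no object stored at " + logicalUri);
        }
        return new ByteArrayInputStream(bytes);
    }
}

class DeviceTransfer {
    // Generic copy between any two devices; neither device knows about the other.
    static void copy(StreamingStorageDevice src, StreamingStorageDevice dest, URI uri)
            throws IOException {
        try (InputStream in = src.read(uri)) {
            dest.write(uri, in);
        }
    }

    // Read a stream to the end and return its bytes.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StreamingStorageDevice source = new InMemoryDevice();
        StreamingStorageDevice target = new InMemoryDevice();
        URI uri = URI.create("/template/tmpl/2/200");

        byte[] payload = "template-bytes".getBytes(StandardCharsets.UTF_8);
        source.write(uri, new ByteArrayInputStream(payload));
        copy(source, target, uri);

        try (InputStream in = target.read(uri)) {
            System.out.println(new String(drain(in), StandardCharsets.UTF_8));
        }
    }
}
```

With read and write expressed as streams, the single copy helper covers every 
device pair, which is the point behind the question of whether S3 would 
otherwise need to know about every other driver.
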
19:31:19 [edison_cs]: I get your idea
19:31:42 [edison_cs]: during the download s3 url to nfs command at ssvm
19:32:04 [edison_cs]: you want nfs to be represented as an outputstream, right?
19:32:05 [jburwell]: we don't want to be downloading a template to NFS anymore
19:32:14 [edison_cs]: that's an example
19:32:28 [jburwell]: we want the template to go straight to S3 or a block store 
or whatever
19:32:37 [edison_cs]: I know, that's what I am doing
19:33:04 [jburwell]: so, yes, the streams are a handle to either a source or 
sink for reading/writing data
19:33:04 [edison_cs]: just give me few days, I'll remove that nfs secondary 
storage as much as possible
19:33:42 [jburwell]: an NFS device working as secondary storage will make sense 
in certain deployments
19:33:49 [edison_cs]: that's what I am saying, your api is at the provision 
level (which actually accesses data)
19:34:04 [edison_cs]: but my api is at the orchestration level
19:34:21 [edison_cs]: which doesn't need to access data
19:34:21 [jburwell]: my feeling is that both have to be addressed together
19:34:34 [jburwell]: I don't believe that we can design one without considering 
the other
19:35:04 [edison_cs]: as I said, maybe you can give me an example, how mgt 
server interact with your api?
19:35:12 [edison_cs]: during the register template api call
19:35:49 [edison_cs]: what will the register api look like
19:36:19 [edison_cs]: how does the mgt server know to call into s3 code
19:36:34 [jburwell]: when you say register, you mean the simple act of storing 
the template's configuration data in the database or also of downloading the 
data?
19:36:34 [edison_cs]: if there are multiple object storage registered
19:37:12 [jburwell]: when a template configuration is specified, the storage 
device on which it will reside is selected
19:37:19 [jburwell]: and specified as part of the template
19:37:27 [edison_cs]: how?
19:37:34 [jburwell]: it may be an auto-select process or manually specified
19:37:42 [edison_cs]: be specific about the register template api
19:37:52 [edison_cs]: it's encoded in the uri?
19:37:57 [jburwell]: noooo
19:38:13 [jburwell]: the uri is a generic reference for the template that is 
same across all storage devices
19:38:27 [jburwell]: template implements storable
19:38:30 [jburwell]: storable defines two methods
19:38:34 [edison_cs]: ok, then how mgt server knows it's from s3
19:38:35 [jburwell]: getURI() : URI
19:38:36 [edison_cs]: not from swift
19:38:50 [jburwell]: getStorageDevice() : StorageDevice
19:39:04 [jburwell]: when a template is registered, a storage device for that 
template must be specified
19:39:34 [jburwell]: my intention for the URI is that it is a logical value 
that uniquely and consistently references the object in CS
19:39:35 [edison_cs]: is it a uuid
19:39:36 [edison_cs]: ?
19:39:49 [jburwell]: it is responsibility of the storage device to map that to 
a physical location
19:40:04 [jburwell]: the UUID may be part of the URI, but it is not the uri
19:40:27 [edison_cs]: how the registertemplate api will look like?
19:40:34 [jburwell]: a template URI could be "/template/<UUID>" or 
"/template/<account_id>/<template_id>"
19:40:49 [jburwell]: registertemplate(Template aTemplate)
19:40:57 [jburwell]: where a template contains all of the properties to be 
created
19:40:57 [edison_cs]: I mean from user API
19:41:06 [jburwell]: user api?
19:41:20 [edison_cs]: yes, user/admin wants to register a template
19:41:27 [jburwell]: registerTemplate(Template aTemplate)
19:41:27 [edison_cs]: what's the api looks like?
19:41:42 [edison_cs]: through restful  api
19:41:57 [jburwell]: a JSON serialization of a Template instance
19:42:20 [jburwell]: so, it would include the name, url, storage_device_id, 
platform, public ...
19:42:42 [edison_cs]: ok, i see, it has storage_device_id
19:42:57 [edison_cs]: that's what I mean storage uuid 
19:43:04 [edison_cs]: in our current admin/user api
19:43:42 [jburwell]: the Template class would provide an implementation to 
render out the URI ...
19:43:57 [jburwell]: and internal operations that want to interact with the 
underlying storage
19:44:05 [jburwell]: could get the associated device via the getStorageDevice 
method
19:44:21 [jburwell]: and reference it logically via the getURI method
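
The Storable contract jburwell outlines (getURI() : URI and 
getStorageDevice() : StorageDevice) might be sketched in Java as below. Only 
those two method names come from the discussion; the getId() method, the 
SimpleStorageDevice class, the Template fields, and the "/template/<UUID>" 
layout chosen here are assumptions made for illustration.

```java
import java.net.URI;

// Hypothetical minimal device type; a real one would also expose I/O operations.
interface StorageDevice {
    String getId();
}

// The two methods named in the discussion.
interface Storable {
    // Logical reference that is the same across all storage devices.
    URI getURI();

    // The device on which this object is intended to live.
    StorageDevice getStorageDevice();
}

// Toy device identified only by an id (an assumption for this sketch).
class SimpleStorageDevice implements StorageDevice {
    private final String id;

    SimpleStorageDevice(String id) {
        this.id = id;
    }

    @Override
    public String getId() {
        return id;
    }
}

class Template implements Storable {
    private final String uuid;
    private final String name;
    private final StorageDevice device;

    Template(String uuid, String name, StorageDevice device) {
        this.uuid = uuid;
        this.name = name;
        this.device = device;
    }

    String getName() {
        return name;
    }

    @Override
    public URI getURI() {
        // One of the layouts suggested in the log: "/template/<UUID>".
        return URI.create("/template/" + uuid);
    }

    @Override
    public StorageDevice getStorageDevice() {
        return device;
    }

    public static void main(String[] args) {
        Template template =
            new Template("1234-abcd", "centos", new SimpleStorageDevice("s3-east"));
        System.out.println(template.getURI() + " on " + template.getStorageDevice().getId());
    }
}
```

In this reading, mapping the logical URI to a physical location stays inside 
the device, so callers never handle physical paths.
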
19:45:04 [edison_cs]: in a word, the registertemplate does need some way to 
identify which storage device to register with
19:45:12 [edison_cs]: either through a plain uuid
19:45:19 [jburwell]: most definitely
19:45:22 [edison_cs]: or through template class
19:45:29 [edison_cs]: ok,
19:45:36 [edison_cs]: then we agree on that
19:45:42 [edison_cs]: let's move on
19:45:42 [jburwell]: we need to know for template (or a volume or iso) on which 
storage device it is intended to live
19:45:49 [edison_cs]: how the mgt server will call your api
19:46:27 [edison_cs]: there is a registertemplate coming in
19:46:45 [edison_cs]: how the mgt server calls into your s3 api, which has 
input/output streams
19:47:05 [jburwell]: for this operation, templates are downloaded via the SSVM 
right?
19:47:05 [edison_cs]: does mgt server need to send a command to ssvm?
19:47:13 [edison_cs]: no
19:47:21 [edison_cs]: that's what I am doing
19:47:42 [edison_cs]: mgt server needs to be extended
19:47:44 [jburwell]: so templates will be downloaded in-line in the management 
server?
19:48:04 [edison_cs]: so that people can plug in their own code, to change the 
way storage is accessed
19:48:06 [edison_cs]: it can be in ssvm
19:48:13 [edison_cs]: it can be in hypervisor host
19:48:30 [edison_cs]: it can be in an external service
19:48:34 [edison_cs]: or whatever
19:48:44 [edison_cs]: currently, cloudstack enforce you to use ssvm
19:48:50 [edison_cs]: which is bad!
19:49:19 [edison_cs]: how to make the decision, is what I mean: orchestration
19:51:59 [jburwell]: and in my design proposal, a storage device is both 
serializable and stateless
19:52:12 [jburwell]: therefore, you access and use it in-line in the management 
server
19:52:20 [jburwell]: or put it on a async call and ship it to another process
19:52:27 [edison_cs]: I know, but it doesn't solve the problem, how to make the 
difference
19:52:37 [edison_cs]: how people can plug in code in the mgt server
19:52:50 [edison_cs]: to change the way we access data
19:52:57 [jburwell]: as I said before, the orchestration would consume 
StorageDevices
19:53:12 [jburwell]: a storage device doesn't know or care about orchestration
19:53:12 [edison_cs]: but, where is the orchestration
19:53:13 [jburwell]: it does as it is told
19:53:20 [jburwell]: in the layer above storage devices
19:53:27 [edison_cs]: who will write the orchestration
19:53:34 [edison_cs]: is it extensible
19:53:34 [jburwell]: orchestration should be common across all of CS
19:53:57 [edison_cs]: I know, that's what I am doing in the orchestration code
19:53:58 [jburwell]: storage devices provide the simple, composable operations 
that the orchestration engine uses
19:54:19 [edison_cs]: make the orchestration extensible
19:54:27 [jburwell]: in what way?
19:54:49 [edison_cs]: edison_cs: so that people can plug in their own code, to 
change the way storage is accessed
19:54:49 [edison_cs]: [7:48pm] edison_cs: it can be in ssvm
19:54:49 [edison_cs]: [7:48pm] edison_cs: it can be in hypervisor host
19:54:49 [edison_cs]: [7:48pm] edison_cs: it can be in an external service
19:54:49 [jburwell]: to my mind, the actual algorithms implemented by 
orchestration should be common
19:54:49 [edison_cs]: [7:48pm] edison_cs: or whatever
19:54:49 [edison_cs]: [7:48pm] edison_cs: currently, cloudstack enforce you to 
use ssvm
19:54:49 [edison_cs]: [7:48pm] edison_cs: which is bad!
19:54:49 [edison_cs]: [7:49pm] edison_cs: how to make the decision, is what I 
mean: orchestration
19:54:57 [edison_cs]: it's not common
19:55:15 [edison_cs]: people can write whatever algorithm they want
19:55:35 [edison_cs]: for example
19:55:42 [edison_cs]: if it's kvm
19:55:57 [jburwell]: plugging in new storage algorithms should be orthogonal 
(yes, I used that dreaded word) to the actual device interaction
19:56:12 [edison_cs]: but that's what I am doing
19:56:42 [edison_cs]: so I said before, we are doing different tasks
19:56:49 [edison_cs]: my part is on the orchestration
19:56:57 [edison_cs]: your part is at the provision level
19:57:14 [jburwell]: I understand that
19:57:19 [edison_cs]: without an extensible orchestration part, 
19:57:34 [edison_cs]: cloudstack can only work with one model
19:57:42 [jburwell]: I don't think that they can be designed completely 
separately
19:57:58 [jburwell]: we need to design both together and ensure that they 
integrate properly
19:58:07 [edison_cs]: ok, I agree
19:58:34 [edison_cs]: that's what I am thinking, I created a new branch
19:58:42 [edison_cs]: then show the orchestration part of code
19:59:07 [edison_cs]: then show, how to interact with provision code
19:59:57 [edison_cs]: is it ok for you?
20:01:13 [jburwell]: works for me
20:01:20 [jburwell]: let me know the name of the branch
20:01:21 [jburwell]: and we hack it together
20:01:35 [edison_cs]: ok, great!
20:02:13 [edison_cs]: haven't decided which branch I should branch off, as 
javelin's fortune is not decided yet
20:02:43 [edison_cs]: if javelin is merged into master, then I'll create a 
branch on master
20:03:05 [edison_cs]: but I'll let you know, after the branch is created
20:03:07 [jburwell]: you could just create a new branch from javelin
20:03:22 [jburwell]: and just reset the upstream branch once there is a clearer 
picture
20:05:15 [topcloud]: hey guys sorry i'm late here.
20:05:35 [topcloud]: had a meeting i had to go to.
20:05:44 [jburwell]: topcloud I think we are wrapping up
20:06:00 [jburwell]: next steps will be to further flesh out our discussion in 
code
20:06:35 [topcloud]: jburwell: sorry i missed it. what do you think? is 
everything cleared up or still not quite clear?
20:06:50 [jburwell]: topcloud I think we still have some things to work out
20:06:58 [jburwell]: and we are at the point where we need to cut some code
20:07:13 [jburwell]: I think we are syncing around the same target
20:07:44 [jburwell]: edison_cs: do you agree?
20:08:05 [edison_cs]: jburwell: yes
20:08:13 [jburwell]: sweet
20:08:23 [edison_cs]: how the orchestration interact with provision code
20:08:29 [edison_cs]: is something we can work out
20:08:35 [topcloud]: edison was explaining to me that there might have been 
some confusion on ssvm being required in orchestration.
20:08:43 [topcloud]: and it is now clear that it is not?
20:09:00 [topcloud]: jburwell: do you feel that clear? because that's very key 
to what we're trying to do.
20:09:13 [topcloud]: jburwell: and also do you agree that's a good idea.
20:09:13 [jburwell]: topcloud: we need write some code to validate
20:10:20 [topcloud]: cool.
20:10:43 [edison_cs]: ok, let's call it a meeting
20:10:50 [topcloud]: jburwell: after this talk, what's your opinion on the 
javelin merge into master?
