Members present: edison_cs, topcloud, jburwell

----------------
Meeting summary:
----------------
1. Preface

IRC log follows:

# 1. Preface #
19:02:50 [edison_cs]: jburwell: about the storage stuff, I am trying to hook up the inferfaces listed in my design document into mgt server, made a lot of change on the existing mgt server code
19:02:59 [edison_cs]: I'll start a new branch based on javelin
19:03:27 [edison_cs]: maybe by looking at the code, we can understand what we are talking about
19:05:12 [edison_cs]: I think we are doing the same thing: clean up current storage code, make it more extensible etc, and the API proposed by both of us are very similar.
19:07:27 [edison_cs]: Alex and I both still quite understand, why the api needs to have input/output stream
19:07:34 [edison_cs]: not understand..
19:09:57 [jburwell]: the issue is how do we acquire data
19:10:12 [jburwell]: for example, downloading a template from a URL
19:10:21 [jburwell]: simply providing a URI on that operation is incomplete
19:10:27 [edison_cs]: I know, but that's at the provision side
19:10:42 [edison_cs]: what I am talking about is at the orchestration side, on the mgt server
19:10:45 [jburwell]: we need some of saying writing the stream of data (input stream) to this location (URI)
19:10:57 [jburwell]: please define orchestration
19:11:13 [edison_cs]: if there is a create api coming in from api server into mgt server
19:11:28 [jburwell]: create what?
19:11:34 [edison_cs]: mgt server needs to find out proper storage provider to create volume
19:11:42 [edison_cs]: let's say create volume api
19:11:57 [edison_cs]: mgt server needs to maintain volume state in the db
19:12:04 [jburwell]: for create a volume, a stream may not be necessary
19:12:06 [edison_cs]: then call the proper storage provider to create that volume
19:12:13 [jburwell]: for other operations, streams are necessary
19:12:34 [edison_cs]: inside that storage provider, the implementation, can send a command to hypervisor resource
19:12:44 [edison_cs]: or it can directly call storage box api
19:13:13 [jburwell]: I think it difficult to follow the conversation if we switch examples often
19:13:20 [jburwell]: I would like to stick the the download template example
19:13:27 [edison_cs]: ok
19:13:35 [jburwell]: because it exemplifies the issues I encountered implementing S3
19:13:50 [jburwell]: I am concerned with operations where data is actually being transfered
19:14:04 [edison_cs]: can you gave an example, how create template work in your api?
19:14:12 [edison_cs]: how it integrate with mgt server
19:14:27 [jburwell]: I have posted pseudo code for that operation on the mailing list
19:15:34 [edison_cs]: as I said, I am ok with your pseduo code, as it's at the provision level
19:15:42 [jburwell]: http://markmail.org/message/adck5mkb3izg6khi
19:16:06 [jburwell]: it seems that provisioning is a logical operation, correct?
19:16:12 [edison_cs]: I don't care how provision work, but I more care about how mgt server call your s3 privider
19:16:44 [jburwell]: in this case, I expect the StorageDevice to be invoked in the SSVM
19:17:00 [jburwell]: but as I have said, the abstract shouldn't care about where it is executed
19:17:12 [edison_cs]: but mgt server do need to know and manage s3 provider
19:17:12 [jburwell]: the mgmt server should make that decision
19:17:34 [edison_cs]: for example, a create template api coming in
19:17:43 [jburwell]: it just needs to say, "I have an operation to perform and it will be executed {locally, ssvm}."
19:17:43 [edison_cs]: how the mgt server will call into your s3 provider
19:18:14 [edison_cs]: if there are multiple object storage provider
19:18:27 [edison_cs]: mgt server do need to select one of them, right
19:18:42 [edison_cs]: mgt server need to allow you to register a provider
19:18:44 [jburwell]: by virtue of the storable interface, the template itself will indicate which device it should be stored on
19:19:04 [jburwell]: I would like to keep registration of drivers/types out of this conversation
19:19:12 [edison_cs]: ooops
19:19:12 [jburwell]: as I see that as a wider discussion for CS
19:19:19 [edison_cs]: that's what all I am doing
19:19:19 [jburwell]: let's just assume that all of the types are registered ...
19:19:42 [edison_cs]: how to register a provider, how mgt server call into provider
19:19:57 [edison_cs]: how to mgt server provider in the database
19:20:04 [jburwell]: I think the registration of types for storage, hypervisors, network device, etc should be unified
19:20:15 [edison_cs]: that's what am I mean the orchestration
19:20:27 [jburwell]: I am concerned with what operations the driver provides
19:20:42 [jburwell]: and ensuring that we can interact with block and object stores without the need of an intermediate file system
19:20:43 [edison_cs]: as I said, there only 5-7 operations a provider will provide
19:20:57 [edison_cs]: create/copy/delete
19:21:12 [jburwell]: read and list are also required
19:21:28 [edison_cs]: list is required
19:21:42 [edison_cs]: but the api is not input/output stream
19:21:57 [edison_cs]: for example, there is a s3 provider registered
19:22:12 [jburwell]: how would we read or write data without an input/outputstream?
19:22:12 [edison_cs]: which has an api called create(object sth)
19:22:29 [edison_cs]: when register template api coming
19:22:50 [edison_cs]: which specifies which provider what to regsiter
19:22:59 [jburwell]: a template should have an association to the storage device on which it resides
19:23:12 [jburwell]: ditto with a volume, iso, and snapshot
19:23:14 [edison_cs]: registertemplate(strinng providerUUId, string url-of-template)
19:23:45 [edison_cs]: then mgt server will find out the correspoding privoder's creat api, then call it
19:24:04 [edison_cs]: inside provider's implemenation, you can send a command to ssvm
19:24:12 [edison_cs]: or send a command to hypervisor host
19:24:13 [jburwell]: but how will the template data actually be given to the device in that create?
19:24:19 [edison_cs]: to regsiter the template
19:24:42 [edison_cs]: that depends on provider's implementaion
19:24:42 [jburwell]: there is the registration of the template but there is also the acquisition of the associated data (i.e. the x mb file that is the actual template)
19:25:05 [jburwell]: with just a UUID string and a logical URI how is the storage provider going to acquire the data?
19:25:51 [edison_cs]: storage provier can parse the uri, something like, http://some-where/some-path
19:26:12 [edison_cs]: then grab that uri with s3 client api
19:26:19 [jburwell]: so how does the storage provider know how CS will reference it later?
19:26:35 [jburwell]: we need an operation that says, "Write this data to this URI"
19:26:58 [edison_cs]: let me gave an example about how that works
19:27:19 [jburwell]: i.e take the contents of the input stream and write it to the physical location represented by "/template/tmpl/2/200"
19:27:27 [edison_cs]: if mgt server wants download s3 url into a nfs storage
19:27:34 [jburwell]: with that type of interface
19:27:49 [jburwell]: we can unified almost all inter device transfer
19:27:49 [jburwell]: s
19:28:07 [edison_cs]: the provider will have an api called: copy(dataobject src-something, dataobject dest-something)
19:28:34 [edison_cs]: src-something.touri will return a URI represent a s3 url
19:28:49 [edison_cs]: dest-something will represent a nfs://balala-/path
19:29:04 [edison_cs]: then inside provider's copy api
19:29:13 [edison_cs]: it will send a copy command to ssvm
19:29:27 [edison_cs]: inside ssvm, it will decode both of the two uri=
19:29:42 [jburwell]: fundamentally, I do not think that a storage provider should know anything about the SSVM
19:29:42 [edison_cs]: download s3 uri to nfs
19:29:50 [jburwell]: that should be the management server's decision to determine where an operation will be executed
19:29:57 [edison_cs]: as I said, the provider is in the mgt server side
19:30:15 [jburwell]: so now S3 needs to know about every other possible driver?
19:30:27 [jburwell]: and every driver needs to implement a basic i/o stream copy operation?
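
Editor's sketch (not part of the log): jburwell's point at 19:26:35 through 19:30:27 is that if every device can read from and write to a logical URI through streams, one generic copy covers inter-device transfers and no driver needs to know about any other. The Java below is a minimal illustration under stated assumptions: `StorageDevice`, `InMemoryDevice`, and `Transfers` are hypothetical names chosen here, the in-memory map stands in for a real backend such as S3 or NFS, and none of this is the API either participant proposed on the mailing list.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical device contract; the method names are assumptions for this sketch.
interface StorageDevice {
    // Opens the data stored under a logical URI.
    InputStream read(URI logicalUri) throws IOException;

    // "Write this data to this URI": consumes the stream and stores its contents.
    void write(URI logicalUri, InputStream data) throws IOException;
}

// In-memory stand-in for a real backend; it maps logical URIs to bytes.
final class InMemoryDevice implements StorageDevice {
    private final Map<URI, byte[]> objects = new HashMap<>();

    @Override
    public InputStream read(URI logicalUri) throws IOException {
        byte[] bytes = objects.get(logicalUri);
        if (bytes == null) {
            throw new IOException("No object stored at " + logicalUri);
        }
        return new ByteArrayInputStream(bytes);
    }

    @Override
    public void write(URI logicalUri, InputStream data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // Drain the source stream in chunks until end of stream.
        while ((n = data.read(buffer)) != -1) {
            sink.write(buffer, 0, n);
        }
        objects.put(logicalUri, sink.toByteArray());
    }
}

final class Transfers {
    private Transfers() {}

    // Generic transfer: works for any pair of devices, neither knows the other's type.
    static void copy(StorageDevice src, StorageDevice dest, URI logicalUri) throws IOException {
        try (InputStream in = src.read(logicalUri)) {
            dest.write(logicalUri, in);
        }
    }
}
```

With this shape, a call such as `Transfers.copy(s3, nfs, URI.create("/template/tmpl/2/200"))` moves a template between two devices using only the shared stream contract. Where that call executes (management server, SSVM, or elsewhere) is left to the caller, which is the separation jburwell argues for.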
19:30:28 [edison_cs]: it's hook, so provider can implement whatever strategy they want
19:31:19 [edison_cs]: I get your idea
19:31:42 [edison_cs]: during the download s3 url to nfs command at ssvm
19:32:04 [edison_cs]: you want nfs can be represented as a outputstream, right?
19:32:05 [jburwell]: we don't want to be downloading a template to NFS anymore
19:32:14 [edison_cs]: that's an example
19:32:28 [jburwell]: we want the template to go straight to S3 or a block store or whatever
19:32:37 [edison_cs]: I know, that's what I am doing
19:33:04 [jburwell]: so, yes, the streams are an handle of either a source or sink for reading/writing data
19:33:04 [edison_cs]: just give me few days, I'll remove that nfs secondary storage as much as possible
19:33:42 [jburwell]: an NFS device working as secondary storage will make sense in certain deployments
19:33:49 [edison_cs]: that's what I am saying, your api, is at the provision level(which actually access darta)
19:33:51 [edison_cs]: data
19:34:04 [edison_cs]: but my api is at the orchestration level
19:34:21 [edison_cs]: which doesn't need to access data
19:34:21 [jburwell]: my feeling is that both have to be addressed together
19:34:34 [jburwell]: I don't believe that we can design one without considering the other
19:35:04 [edison_cs]: as I said, maybe you can give me an example, how mgt server interact with your api?
19:35:12 [edison_cs]: during the register template api call
19:35:49 [edison_cs]: what will the register api look lie
19:36:19 [edison_cs]: how the mgt server know call into s3 code
19:36:34 [jburwell]: when you say register, you mean the simple act of storing the template's configuration data in the database or also of downloading the data?
19:36:34 [edison_cs]: if there are mulitple object storage registered
19:37:12 [jburwell]: when a template configuration is specified, the storage device on which it will reside is selected
19:37:19 [jburwell]: and specified as part of the template
19:37:27 [edison_cs]: how?
19:37:34 [jburwell]: it may be an auto-select process or manually specified
19:37:42 [edison_cs]: be specific about the register template api
19:37:52 [edison_cs]: it's encoded in the uri?
19:37:57 [jburwell]: noooo
19:38:13 [jburwell]: the uri is a generic reference for the template that is same across all storage devices
19:38:27 [jburwell]: template implements storable
19:38:30 [jburwell]: storable defines two methods
19:38:34 [edison_cs]: ok, then how mgt server knows it's from s3
19:38:35 [jburwell]: getURI() : URI
19:38:36 [edison_cs]: not from swift
19:38:50 [jburwell]: getStorageDevice() : StorageDevice
19:39:04 [jburwell]: when a template is registered, a storage device for that template must be specified
19:39:34 [jburwell]: my intention for the URI is that it is a logical value that uniquely and consistently references the object in CS
19:39:35 [edison_cs]: is it a uuid
19:39:36 [edison_cs]: ?
19:39:49 [jburwell]: it is responsibility of the storage device to map that to a physical location
19:40:04 [jburwell]: the UUID may be part of the URI, but it is not the uri
19:40:27 [edison_cs]: how the registertemplate api will look like?
19:40:34 [jburwell]: a template URI could be "/template/<UUID>" or "/template/<account_id>/<template_id>
19:40:49 [jburwell]: registertemplate(Template aTemplate)
19:40:57 [jburwell]: where a template contains all of the properties to be created
19:40:57 [edison_cs]: I mean from user API
19:41:06 [jburwell]: user api?
19:41:20 [edison_cs]: yes, user/admin wants to register a template
19:41:27 [jburwell]: registerTemplate(Template aTemplate)
19:41:27 [edison_cs]: what's the api looks like?
19:41:42 [edison_cs]: through restful api
19:41:57 [jburwell]: a JSON serialization of a Template instance
19:42:20 [jburwell]: so, it would include the name, url, storage_device_id, platform, public ...
19:42:42 [edison_cs]: ok, i see, it has storage_device_id
19:42:57 [edison_cs]: that's what I mean storage uuid
19:43:04 [edison_cs]: in our current admin/user api
19:43:42 [jburwell]: the Template class would provide an implementation to render out the URI ...
19:43:57 [jburwell]: and internal operations that want to interact with the underlying storage
19:44:05 [jburwell]: could get the associated device via the getStorageDevice method
19:44:21 [jburwell]: and reference it logically via the getURI method
19:45:04 [edison_cs]: in a word, the registertemplate do need someway to identify, which storage device to register
19:45:12 [edison_cs]: either through a plain uuid
19:45:19 [jburwell]: most definately
19:45:22 [edison_cs]: or through template class
19:45:29 [edison_cs]: ok,
19:45:36 [edison_cs]: then we agree on that
19:45:42 [edison_cs]: let's move on
19:45:42 [jburwell]: we need to know for template (or a volume or iso) on which storage device it is intended to live
19:45:49 [edison_cs]: how the mgt server will call you api
19:46:27 [edison_cs]: there is a registertemplate coming in
19:46:45 [edison_cs]: how the mgt server call into your s3 api, which has input/outstream
19:47:05 [jburwell]: for this operation, templates are downloaded via the SSVM right?
19:47:05 [edison_cs]: does mgt server need to send a command to ssvm?
19:47:13 [edison_cs]: no
19:47:21 [edison_cs]: that's what I am doing
19:47:42 [edison_cs]: mgt server needs to be extended
19:47:44 [jburwell]: so templates will be downloaded in-line in the management server?
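
Editor's sketch (not part of the log): the storable contract jburwell describes between 19:38:27 and 19:44:21 can be written down in a few lines of Java. Only `getURI()`, `getStorageDevice()`, and the "/template/<account_id>/<template_id>" URI shape come from the discussion. The `getId()` method, the `SimpleDevice` class, and the constructor arguments are assumptions added so the example compiles, and the device's actual operations are deliberately left out.

```java
import java.net.URI;

// Hypothetical device handle; real read/write operations are omitted from this sketch.
interface StorageDevice {
    // Assumed identifier, playing the role of the storage_device_id mentioned in the log.
    String getId();
}

// The two methods jburwell lists for anything that lives on a storage device.
interface Storable {
    // Logical reference that stays the same across all storage devices.
    URI getURI();

    // The device on which this object is intended to live.
    StorageDevice getStorageDevice();
}

// Trivial device implementation used only to make the example self-contained.
final class SimpleDevice implements StorageDevice {
    private final String id;

    SimpleDevice(String id) {
        this.id = id;
    }

    @Override
    public String getId() {
        return id;
    }
}

// A template renders its own logical URI and carries its device association.
final class Template implements Storable {
    private final long accountId;
    private final long templateId;
    private final StorageDevice device;

    Template(long accountId, long templateId, StorageDevice device) {
        this.accountId = accountId;
        this.templateId = templateId;
        this.device = device;
    }

    @Override
    public URI getURI() {
        // One of the two URI shapes suggested in the log: /template/<account_id>/<template_id>
        return URI.create("/template/" + accountId + "/" + templateId);
    }

    @Override
    public StorageDevice getStorageDevice() {
        return device;
    }
}
```

The design choice illustrated is that the URI carries no device information. The management server learns which backend to use (S3 rather than Swift, in edison_cs's question) from `getStorageDevice()`, and mapping the logical URI to a physical location is left to that device.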
19:48:04 [edison_cs]: so that, people can plug into his own code, to change the way how access storage
19:48:06 [edison_cs]: it can be in ssvm
19:48:13 [edison_cs]: it can be in hypervisor host
19:48:30 [edison_cs]: it can be in an external service
19:48:34 [edison_cs]: or whatever
19:48:44 [edison_cs]: currently, cloudstack enforce you to use ssvm
19:48:50 [edison_cs]: which is bad!
19:49:19 [edison_cs]: how to make the decision, is what I mean: orchestration
19:51:59 [jburwell]: and in my design proposal, a storage device is both serializable and stateless
19:52:12 [jburwell]: therefore, you access and use it in-line in the management server
19:52:20 [jburwell]: or put it on a async call and ship it to another process
19:52:27 [edison_cs]: I know, but It doesn't sovle the problem, how to make the difference
19:52:37 [edison_cs]: how people can plug in code in the mgt server
19:52:50 [edison_cs]: to change the way we access data
19:52:57 [jburwell]: as I said before, the orchestration would consume StorageDevices
19:53:12 [jburwell]: a storage device doesn't know or care about orchestration
19:53:12 [edison_cs]: but, where is the orchestration
19:53:13 [jburwell]: it does as it it told
19:53:20 [jburwell]: in the layer above storage devices
19:53:27 [edison_cs]: who will write the orchestration
19:53:34 [edison_cs]: is it extensible
19:53:34 [jburwell]: orchestration should be common across all of CS
19:53:57 [edison_cs]: I know, that's what I am doing in the orchestration code
19:53:58 [jburwell]: storage devices provide the simple, composable operations that the orchestration engine uses
19:54:19 [edison_cs]: make the orchestration extensible
19:54:27 [jburwell]: in what way?
19:54:49 [edison_cs]: edison_cs: so that, people can plug into his own code, to change the way how access storage
19:54:49 [edison_cs]: [7:48pm] edison_cs: it can be in ssvm
19:54:49 [edison_cs]: [7:48pm] edison_cs: it can be in hypervisor host
19:54:49 [edison_cs]: [7:48pm] edison_cs: it can be in an external service
19:54:49 [jburwell]: to my mind, the actual algorithms implemented by orchestration should be common
19:54:49 [edison_cs]: [7:48pm] edison_cs: or whatever
19:54:49 [edison_cs]: [7:48pm] edison_cs: currently, cloudstack enforce you to use ssvm
19:54:49 [edison_cs]: [7:48pm] edison_cs: which is bad!
19:54:49 [edison_cs]: [7:49pm] edison_cs: how to make the decision, is what I mean: orchestration
19:54:57 [edison_cs]: it's not common
19:55:15 [edison_cs]: people can write whatever algoirthm they what
19:55:35 [edison_cs]: for example
19:55:42 [edison_cs]: if it's kvm
19:55:57 [jburwell]: plugging in new storage algorithms should be orthogonal (yes, I used that dreaded word) to the actual device interaction
19:56:12 [edison_cs]: but that's what I am doing
19:56:42 [edison_cs]: so I said before, we are doing different tasks
19:56:49 [edison_cs]: my part is on the orchestration
19:56:57 [edison_cs]: your part is at the provision level
19:57:14 [jburwell]: I understand that
19:57:19 [edison_cs]: without an extensible orchestration part,
19:57:34 [edison_cs]: cloudstack can only work with one model
19:57:42 [jburwell]: I don't think that they can designed completely separately
19:57:58 [jburwell]: we need to design both together and ensure that integration properly
19:58:07 [edison_cs]: ok, I agree
19:58:34 [edison_cs]: that's what I am thinking, I created a new branch
19:58:42 [edison_cs]: then show the orchestration part of code
19:59:07 [edison_cs]: then show, how to interact with provision code
19:59:57 [edison_cs]: is it ok for you?
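
Editor's sketch (not part of the log): both positions above can be pictured as one layering. A pluggable policy decides where an operation runs (edison_cs's "extensible orchestration"), and the operation itself knows nothing about that decision (jburwell's "a storage device doesn't know or care about orchestration"). The Java below is only an illustration. Every type name is hypothetical, the four execution sites are taken from the options listed at 19:48, and the KVM rule in the usage note is invented for the example, since the log never completes edison_cs's "if it's kvm" thought.

```java
// Candidate execution sites, taken from the options edison_cs lists in the log.
enum ExecutionSite { LOCAL, SSVM, HYPERVISOR_HOST, EXTERNAL_SERVICE }

// Hypothetical description of a storage operation awaiting placement.
final class StorageOperation {
    final String name;
    final String hypervisor;

    StorageOperation(String name, String hypervisor) {
        this.name = name;
        this.hypervisor = hypervisor;
    }
}

// The pluggable part: deployments supply their own placement decision.
interface PlacementStrategy {
    ExecutionSite choose(StorageOperation op);
}

// Orchestration layer sitting above the devices; it only decides and dispatches.
final class Orchestrator {
    private final PlacementStrategy strategy;

    Orchestrator(PlacementStrategy strategy) {
        this.strategy = strategy;
    }

    // Returns a description of the dispatch decision. A real system would ship
    // the operation to the chosen site instead of building a string.
    String dispatch(StorageOperation op) {
        ExecutionSite site = strategy.choose(op);
        return op.name + " -> " + site;
    }
}
```

As a usage example, a strategy such as `op -> "kvm".equals(op.hypervisor) ? ExecutionSite.HYPERVISOR_HOST : ExecutionSite.SSVM` could be passed to `new Orchestrator(...)`. Swapping the strategy changes where work runs without touching any device code, which is the integration point the two participants agree to work out in the new branch.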
20:01:13 [jburwell]: works for me
20:01:20 [jburwell]: let me know the name of the branch
20:01:21 [jburwell]: and we hack it together
20:01:35 [edison_cs]: ok, great!
20:02:13 [edison_cs]: haven't decide which branch I should branch off, as javelin's fortune is not decided yet
20:02:43 [edison_cs]: if javelin is merged into master, then I'll create a branch on master
20:03:05 [edison_cs]: but I'll let you know, after the branch is created
20:03:07 [jburwell]: you could just create a new branch from javelin
20:03:22 [jburwell]: and just reset the upstream branch once there is a clearer picture
20:05:15 [topcloud]: hey guys sorry i'm late here.
20:05:35 [topcloud]: had a meeting i had to go to.
20:05:44 [jburwell]: topcloud I think we are wrapping up
20:06:00 [jburwell]: next steps will be to further flesh out our discussion in code
20:06:35 [topcloud]: jburwell: sorry i missed it. what do you think? is everything cleared up or still not quite clear?
20:06:50 [jburwell]: topcloud I think we still have somethings to work out
20:06:58 [jburwell]: and we are at point where we need to cut some code
20:07:13 [jburwell]: I think we are syncing around the same target
20:07:44 [jburwell]: edison_cs: do you agree?
20:08:05 [edison_cs]: jburwell: yes
20:08:13 [jburwell]: sweet
20:08:23 [edison_cs]: how the orchestration interact with provision code
20:08:29 [edison_cs]: is something we can work out
20:08:35 [topcloud]: edison was explaining to me that there might have been some confusion on ssvm being required in orchestration.
20:08:43 [topcloud]: and it is now clear that it is not?
20:09:00 [topcloud]: jburwell: do you feel that clear? because that's very key to what we're trying to do.
20:09:13 [topcloud]: jburwell: and also do you agree that's a good idea.
20:09:13 [jburwell]: topcloud: we need write some code to validate
20:10:20 [topcloud]: cool.
20:10:43 [edison_cs]: ok, let's call it a meething
20:10:50 [topcloud]: jburwell: after this talk, what's your opinion on the javelin merge into master?