On Fri, Nov 11, 2016 at 1:09 PM, Sijie Guo <si...@apache.org> wrote:

> I liked this topic. A better name might be 'stream storage primitives', as
> we treat DL as a stream storage. Comments inline.
>
> On Wed, Nov 9, 2016 at 3:09 AM, Gerrit Sundaram <gerritsunda...@gmail.com>
> wrote:
>
> > As Sijie suggested in the other email thread, I started this email
> > thread for discussing the stream operation primitives.
> >
> > The stream operations that I am aware of that DL supports are
> >
> > * Open a distributedlog stream
> > * Delete a distributedlog stream
> > * List all the distributedlog streams under a namespace
> >
>
> Are you also looking for listing streams under a 'sub-namespace' (or
> streams that have a common prefix)? (Based on my understanding of your
> proposal, you might need this for a filesystem-like API?)
>

Yes. However, it seems DL is designed around a flat namespace that contains
only streams; there is no concept of a 'sub-namespace'. I can probably hack
around that by naming the streams in a filesystem path-like way. I am still
curious, though, whether you want to introduce any sort of naming hierarchy
within a namespace. For example, could there be a 'StreamSet', which is a set
of streams (like a directory in a filesystem, which has a list of children)?
A hierarchy like that would definitely simplify my work.

> > * Seal a distributedlog stream
> > * Truncate a distributedlog stream
> >
>
> Just to clarify this, the 'truncate' in DL trims the head of the
> stream, not the tail.
> The 'truncate' in the filesystem world truncates to a size of precisely
> *length* bytes; it truncates the tail.
>
> Let's make sure we have clarified it and are on the same page.
>

Yes, we are on the same page.

> >
> > I am looking for a more filesystem-like API. For example,
> >
> > * Get the status/attributes of a stream (like stat in filesystem)
> >
>
> +1 for stream status/attributes. I think we might actually already have
> this in DL, since in kestrel we use that for storing customized metadata.
> It might make sense to formalize it into 'stream status'.
>

Gotcha.

> > * Rename a stream
> >
>
> We've talked about this for a while. +1.
>
> > * Symlink a stream
> >
>
> Symlinking a stream is probably easy to do. +1. We've thought about that
> for having the flexibility to move a stream between different storage
> backends. Symlink would help with this.
>
> But a more fundamental thought here is symlinks for log segments. So when a
> symlinked stream is deleted, the underlying log segments might not be
> deleted until their link count decreases to zero.
>
> >
> > Other operations that I think might be useful:
> >
> > * Split/Fork a stream (it can be useful for dynamic data partitioning)
> >
>
> Splitting and forking a stream sounds interesting. But it sounds like a
> more high-level feature rather than a storage primitive. Actually, it might
> be a good separate discussion.
>
> > * Merge/Concat streams
> >
>
> I think there is already one outstanding jira for concatenating two DL
> streams. Jia and Arvind are working on that.
>
> https://issues.apache.org/jira/browse/DL-46
>

I will watch that jira.

> >
> > The above operations are based on my knowledge about DL. Feel free to add
> > more.
> >
> > - Gerrit
> >
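
For illustration only, here is a minimal Python sketch of three points raised
in this thread: emulating a 'sub-namespace' with path-like stream names in a
flat namespace, truncating the head of a stream rather than the tail, and
keeping log segments alive until their link count drops to zero. It is a toy
in-memory model under stated assumptions. `ToyNamespace` and every method name
on it are hypothetical and are not part of the DistributedLog API, and the
"symlink" here simply shares the segments that exist at link time.

```python
from collections import defaultdict


class ToyNamespace:
    """Hypothetical in-memory model of a flat namespace (not the DL API)."""

    def __init__(self):
        self.streams = {}              # stream name -> ordered list of segment ids
        self.segments = {}             # segment id -> list of records
        self.links = defaultdict(int)  # segment id -> number of streams using it
        self._next_id = 0

    def append(self, name, records):
        """Write records as one new log segment, creating the stream if needed."""
        seg = self._next_id
        self._next_id += 1
        self.segments[seg] = list(records)
        self.links[seg] = 1
        self.streams.setdefault(name, []).append(seg)

    def list_streams(self, prefix=""):
        """Emulate a 'sub-namespace' by filtering path-like names on a prefix."""
        return sorted(n for n in self.streams if n.startswith(prefix))

    def read(self, name):
        """Return all records of a stream, oldest first."""
        return [r for seg in self.streams[name] for r in self.segments[seg]]

    def truncate_head(self, name, keep_from):
        """Head truncation: drop the oldest segments and keep the tail."""
        dropped = self.streams[name][:keep_from]
        self.streams[name] = self.streams[name][keep_from:]
        self._unlink(dropped)

    def symlink(self, target, link_name):
        """Let link_name share target's current segments; bump link counts."""
        self.streams[link_name] = list(self.streams[target])
        for seg in self.streams[link_name]:
            self.links[seg] += 1

    def delete(self, name):
        """Remove the stream name; segments survive while still linked."""
        self._unlink(self.streams.pop(name))

    def _unlink(self, segs):
        """Free a segment only once no stream references it any more."""
        for seg in segs:
            self.links[seg] -= 1
            if self.links[seg] == 0:
                del self.links[seg]
                del self.segments[seg]


ns = ToyNamespace()
ns.append("logs/app/a", ["r1", "r2"])
ns.append("logs/app/a", ["r3"])
ns.append("metrics/c", ["y"])

under_logs = ns.list_streams("logs/")        # only names under the 'logs/' prefix
ns.symlink("logs/app/a", "logs/app/a-link")
ns.delete("logs/app/a")                      # segments stay: the link still uses them
ns.truncate_head("logs/app/a-link", 1)       # drops the oldest segment, keeps the tail
remaining = ns.read("logs/app/a-link")
```

The prefix filter is the "hack" version of a hierarchy: it needs no new
metadata, but every listing scans the whole flat namespace, which is one reason
a first-class 'StreamSet' might still be worth discussing.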