Re: question about DL namespace
+1 for the interface.

- KN

On Mon, Sep 12, 2016 at 5:46 PM, Jon Derrick wrote:
> public interface NamespaceResolver {
>
>     /** validate if the stream name is okay */
>     boolean validateStreamName(String streamName);
>
>     /** resolve the stream name into the location path of the metadata */
>     String resolveStreamPath(String streamName);
>
> }
Re: question about DL namespace
This sounds reasonable to me. Look forward to your contribution.

- Sijie

On Mon, Sep 12, 2016 at 2:46 AM, Jon Derrick wrote:
> I'd like to make a proposal by introducing a `NamespaceResolver`.
>
> A namespace resolver will be added to the namespace metadata binding and
> loaded via reflection.
>
> Any thoughts? I will send out a pull request soon.
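A minimal sketch of what "loaded via reflection" could mean in practice, assuming the binding stores a fully qualified class name and the resolver has a no-argument constructor. Only the `NamespaceResolver` interface comes from the proposal; `NamespaceResolvers.load` and `FlatNamespaceResolver` are hypothetical names used for illustration and are not DistributedLog code.

```java
/** Small demo; declared first so the file can be run directly with `java`. */
class ResolverLoadingDemo {
    public static void main(String[] args) throws ReflectiveOperationException {
        // Assumption: in a real setup this name would be read from the
        // namespace metadata binding instead of being hard-wired here.
        String configured = FlatNamespaceResolver.class.getName();
        NamespaceResolver resolver = NamespaceResolvers.load(configured);
        System.out.println(resolver.resolveStreamPath("my-stream")); // /my-stream
    }
}

/** The interface as proposed in the thread. */
interface NamespaceResolver {
    /** validate if the stream name is okay */
    boolean validateStreamName(String streamName);

    /** resolve the stream name into the location path of the metadata */
    String resolveStreamPath(String streamName);
}

/** Hypothetical flat resolver: every stream sits directly under the root. */
class FlatNamespaceResolver implements NamespaceResolver {
    @Override
    public boolean validateStreamName(String streamName) {
        return streamName != null && !streamName.isEmpty() && !streamName.contains("/");
    }

    @Override
    public String resolveStreamPath(String streamName) {
        if (!validateStreamName(streamName)) {
            throw new IllegalArgumentException("Invalid stream name: " + streamName);
        }
        return "/" + streamName;
    }
}

/** Hypothetical loader that instantiates a resolver from its class name. */
final class NamespaceResolvers {
    private NamespaceResolvers() {}

    static NamespaceResolver load(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        // Fail early with a clear message if the class is not a resolver.
        if (!NamespaceResolver.class.isAssignableFrom(cls)) {
            throw new IllegalArgumentException(className + " does not implement NamespaceResolver");
        }
        return cls.asSubclass(NamespaceResolver.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }
}
```

Checking `isAssignableFrom` before instantiating keeps a misconfigured binding from constructing an unrelated class.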
Re: question about DL namespace
Sijie, thank you for your comments.

I'd like to make a proposal by introducing a `NamespaceResolver`.

What does a namespace resolver do? A namespace resolver resolves a log stream name into a metadata location path, so that DL knows where to locate the metadata of a log stream. The resolver is also responsible for validating the stream name and managing the hierarchy of streams.

The NamespaceResolver interface would look like this:

    public interface NamespaceResolver {

        /** validate if the stream name is okay */
        boolean validateStreamName(String streamName);

        /** resolve the stream name into the location path of the metadata */
        String resolveStreamPath(String streamName);

    }

So a filesystem-like namespace resolver will accept only absolute file-like paths as stream names, and a kafka-like namespace resolver (what Khurrum mentioned) will probably accept names like '/'.

A namespace resolver will be added to the namespace metadata binding and loaded via reflection.

Any thoughts? I will send out a pull request soon.

- jd

On Tue, Aug 23, 2016 at 9:07 AM, Khurrum Nasim wrote:
> +1 for supporting different types of namespaces.
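A minimal sketch of a filesystem-like resolver behind the proposed interface. Only `NamespaceResolver` and its two methods come from the proposal; the class name `FileSystemNamespaceResolver`, the `rootPath` constructor argument, and the exact validation rules (absolute path, no empty segments, no `.` or `..`) are assumptions made for illustration.

```java
import java.util.regex.Pattern;

/** Small demo; declared first so the file can be run directly with `java`. */
class FileSystemResolverDemo {
    public static void main(String[] args) {
        NamespaceResolver resolver = new FileSystemNamespaceResolver("/dl/ns");
        System.out.println(resolver.validateStreamName("/logs/app1")); // true
        System.out.println(resolver.validateStreamName("logs/app1"));  // false: not absolute
        System.out.println(resolver.resolveStreamPath("/logs/app1"));  // /dl/ns/logs/app1
    }
}

/** The interface as proposed in the thread. */
interface NamespaceResolver {
    /** validate if the stream name is okay */
    boolean validateStreamName(String streamName);

    /** resolve the stream name into the location path of the metadata */
    String resolveStreamPath(String streamName);
}

/** Hypothetical resolver that accepts only absolute, file-like stream names. */
class FileSystemNamespaceResolver implements NamespaceResolver {
    // One or more non-empty segments, each preceded by '/', no trailing slash.
    private static final Pattern ABSOLUTE_PATH = Pattern.compile("(/[^/]+)+");

    // Assumed metadata root under which all stream paths are placed.
    private final String rootPath;

    FileSystemNamespaceResolver(String rootPath) {
        this.rootPath = rootPath;
    }

    @Override
    public boolean validateStreamName(String streamName) {
        if (streamName == null || !ABSOLUTE_PATH.matcher(streamName).matches()) {
            return false;
        }
        // Reject relative segments so a name cannot escape the root.
        for (String segment : streamName.substring(1).split("/")) {
            if (segment.equals(".") || segment.equals("..")) {
                return false;
            }
        }
        return true;
    }

    @Override
    public String resolveStreamPath(String streamName) {
        if (!validateStreamName(streamName)) {
            throw new IllegalArgumentException("Invalid stream name: " + streamName);
        }
        return rootPath + streamName;
    }
}
```

Keeping validation and resolution in the same object means the hierarchy rules live in one place, which is the point of the proposal.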
Re: question about DL namespace
On Thu, Aug 18, 2016 at 2:30 AM, Sijie Guo wrote:
> If you would like to support a filesystem-like namespace, I would suggest
> adding a namespace type on the metadata binding, so it can support different
> types of namespaces. Does that meet your requirements?

+1 for supporting different types of namespaces. I want to organize a kafka topic in the following format:

    namespace/topic/partitions      : storing all the partitions
    namespace/topic/partitions/N    : storing the given partition `N`
    namespace/topic/subscriptions   : storing all the subscriptions
    namespace/topic/subscriptions/S : storing the information of subscription `S`

Both `namespace/topic/partitions/N` and `namespace/topic/subscriptions/S` are DL streams.

So it would be easier for me to manage the streams if I can customize the namespace layout.

- KN
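The layout above can be sketched as a small helper that builds the stream names. `TopicLayout` and its method names are hypothetical; only the path shapes (`namespace/topic/partitions/N` and `namespace/topic/subscriptions/S`) come from the message.

```java
/** Small demo; declared first so the file can be run directly with `java`. */
class TopicLayoutDemo {
    public static void main(String[] args) {
        TopicLayout layout = new TopicLayout("my-namespace", "orders");
        System.out.println(layout.partitionStream(3));          // my-namespace/orders/partitions/3
        System.out.println(layout.subscriptionStream("audit")); // my-namespace/orders/subscriptions/audit
    }
}

/** Hypothetical helper that builds names following the topic layout. */
final class TopicLayout {
    private final String base; // "<namespace>/<topic>"

    TopicLayout(String namespace, String topic) {
        this.base = requireSegment(namespace) + "/" + requireSegment(topic);
    }

    /** Parent path holding all partitions of the topic. */
    String partitionsPath() {
        return base + "/partitions";
    }

    /** Name of the DL stream backing partition n. */
    String partitionStream(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("Partition must be non-negative: " + n);
        }
        return partitionsPath() + "/" + n;
    }

    /** Parent path holding all subscriptions of the topic. */
    String subscriptionsPath() {
        return base + "/subscriptions";
    }

    /** Name of the DL stream storing the state of subscription s. */
    String subscriptionStream(String s) {
        return subscriptionsPath() + "/" + requireSegment(s);
    }

    // A segment must be non-empty and must not contain the path separator.
    private static String requireSegment(String s) {
        if (s == null || s.isEmpty() || s.contains("/")) {
            throw new IllegalArgumentException("Invalid path segment: " + s);
        }
        return s;
    }
}
```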
Re: question about DL namespace
Jon,

Sorry for the late response. This is a very good question. Comments inline.

Sijie

On Monday, August 15, 2016, Jon Derrick wrote:

> Hello all,
>
> I read the distributed log code closely. I found that the DL namespace is a
> flat namespace. There will be a potential issue if there are a lot of
> streams created under a same namespace. I am very curious what are the
> thoughts behind that. Here are some questions:
>
> - How many streams that a namespace can support?

The maximum number of streams we have had for a single namespace is more than 30k. But yup, you are right. It is limited by the number of children that a znode can have.

> It seems to be bound with
> the limitation on the number of children that a zookeeper znode can have.
> What's the maximum number of logs do you guys have?
> - Why not choose a tree representation? Then it might be easier to organize
> streams. For example, if I want to use multiple dl streams as partitions, I
> can just easily organize them together under same znode.

We don't want DL to focus on partitions. We let applications decide how to partition, so we chose a simple way to start. However, I don't think it has to be just a flat namespace. You probably already noticed that there is another namespace implementation to support hierarchy.

If you would like to support a filesystem-like namespace, I would suggest adding a namespace type on the metadata binding, so it can support different types of namespaces. Does that meet your requirements?

> - Also if it is a tree-like namespace, it might be easier to implement a
> filesystem over the streams. Each file can be backed by one dl stream. In
> that way, I can also use DL as long term storage.
>
> Any thoughts? Appreciate your comments.
>
> --
> - jderrick