Jon,

Sorry for the late response. This is a very good question. Comments inline.
Sijie

On Monday, August 15, 2016, Jon Derrick <jonathan.derri...@gmail.com> wrote:

> Hello all,
>
> I read the distributed log code closely. I found that the DL namespace is a
> flat namespace. There will be a potential issue if there are a lot of
> streams created under a same namespace. I am very curious what are the
> thoughts behind that. Here are some questions:
>
> - How many streams that a namespace can support?

The largest number of streams we have had in a single namespace is more than 30k. But yes, you are right: it is limited by the number of children that a znode can have.

> It seems to be bound with
> the limitation on the number of children that a zookeeper znode can have.
> What's the maximum number of logs do you guys have?
> - Why not choose a tree representation? Then it might be easier to organize
> streams. For example, if I want to use multiple dl streams as partitions, I
> can just easily organize them together under same znode.

We don't want DL to focus on partitions; we let applications decide how to partition. So we chose a simple way to start. However, I don't think it has to be just a flat namespace. You have probably already noticed that there is another namespace implementation that supports hierarchy. If you would like to support a filesystem-like namespace, I would suggest adding a namespace type to the metadata binding, so that it can support different types of namespaces. Does that meet your requirements?

> - Also if it is a tree-like namespace, it might be easier to implement a
> filesystem over the streams. Each file can be backed by one dl stream. In
> that way, I can also use DL as long term storage.
>
> Any thoughts? Appreciate your comments.
>
>
> --
> - jderrick
>