Hi Thomas et al., I'm still working through some issues on the wiki, but I've updated much of it to address some of Thomas' concerns.
One area I have simplified is the write process, though I'm not sure my line of thinking is best. Assume the data store receives a request to write blob ID 12345, and that there are two delegate blob stores, D1 and D2. Assume D1 has higher priority than D2, that D2 already contains blob ID 12345 (meaning the blob is already stored in D2), and that no restriction disallows the blob from being stored in D1. Since D1 has higher priority than D2, blob ID 12345 should be stored in D1, but instead it is stored in D2.

Originally I thought the system should look for a match for blob ID 12345, which it would find in D2; the question then is whether it should update the last access time of blob ID 12345 in D2, or instead write to D1, which is the proper location for it. After Thomas' comments I wonder if it should write to D1. Future reads would get the most recently written version, so the system would behave consistently. If that were to happen, I assume the copy of blob ID 12345 in D2 would become unreferenced and eventually garbage collected. Is that true? Would we consider it appropriate to just let it be garbage collected? Or would it be better to simply update the last access time of blob ID 12345 in D2 and continue referencing it from that location?

-MR

On August 15, 2017 at 4:06:56 PM, Matt Ryan (o...@mvryan.org) wrote:

Hi Thomas,

After emailing I saw that you also provided comments inline on the wiki. I'll work through those and reply back on-list when I think I have addressed them. Thanks for doing that as well!

-MR

On August 15, 2017 at 2:01:04 PM, Matt Ryan (o...@mvryan.org) wrote:

Hi Thomas,

Thank you for taking the time to offer a review. I've been going through the suggested readings and will continue to do so. Some comments inline below.
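To make the trade-off concrete, here is a minimal sketch of the "write to D1 anyway" option. All names here (Delegate, writeAndFindStaleCopies) are invented for illustration, not Oak API: the write goes to the highest-priority delegate that may store the blob, and any copy left in a lower-priority delegate becomes unreferenced, i.e. a garbage-collection candidate.

```java
import java.util.*;

// Hypothetical sketch, not Oak code: models writing to the highest-priority
// eligible delegate and reporting which lower-priority delegates are left
// holding a now-unreferenced copy of the blob.
public class CompositeWriteSketch {

    static class Delegate {
        final String name;
        final Set<String> storedBlobIds;
        Delegate(String name, Set<String> storedBlobIds) {
            this.name = name;
            this.storedBlobIds = storedBlobIds;
        }
    }

    // Writes blobId to the first (highest-priority) delegate, then returns
    // the names of other delegates whose existing copy is now stale.
    static List<String> writeAndFindStaleCopies(String blobId, List<Delegate> byPriority) {
        Delegate target = byPriority.get(0); // no storage restrictions in this example
        target.storedBlobIds.add(blobId);
        List<String> stale = new ArrayList<>();
        for (Delegate d : byPriority) {
            if (d != target && d.storedBlobIds.contains(blobId)) {
                stale.add(d.name);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        Delegate d1 = new Delegate("D1", new HashSet<>());                        // higher priority, empty
        Delegate d2 = new Delegate("D2", new HashSet<>(Arrays.asList("12345"))); // already has the blob
        List<String> stale = writeAndFindStaleCopies("12345", Arrays.asList(d1, d2));
        System.out.println(stale); // prints [D2]
    }
}
```

The alternative option (update the access time in D2 and keep referencing it there) would skip the write entirely, at the cost of the blob living in a lower-priority location indefinitely.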
On August 15, 2017 at 12:25:54 AM, Thomas Mueller (muel...@adobe.com.invalid) wrote:

Hi,

It is important to understand which operations are available in the JCR API and the DataStore API, and the concept of revisions we use for Oak. For example:

* The DataStore API doesn't support updating a binary.

This is of course true. The interface supports only an "addRecord()" capability to put a blob into the data store. The javadoc there clearly anticipates that the record may already exist: "If the same stream already exists in another record, then that record is returned instead of creating a new one." Implementations handle the details of what happens when the blob already exists. For example, the "write()" method in the S3Backend class clearly distinguishes between the two cases, since the AWS SDK handles an update differently from a create:
https://svn.apache.org/repos/asf/jackrabbit/oak/trunk/oak-blob-cloud/src/main/java/org/apache/jackrabbit/oak/blob/cloud/s3/S3Backend.java

It is still the case that, from the data store's point of view, there is no difference between the two, so it doesn't support a distinction. The original data store concept can take this approach because it has only one place for the data to go. The composite blob store has more than one place the data could go, so I believe the data could end up in a delegate blob store that is not the first blob store the data would otherwise be written to. What should happen in that case? I assumed we should try to find a match first, and prefer updating to creating new. I'm not sure exactly how that would happen, though, since the name only matches if the content hash is the same (barring a collision), and otherwise it's a new blob anyway.

* A node might have multiple revisions.
* In the Oak revision model, you can't update a reference of an old revision. Does the data store even know about this?
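For reference, the addRecord() dedup behavior described above can be sketched as a content-addressed map (this class and its method bodies are invented for illustration, not Oak's implementation): records are keyed by content hash, so adding a stream whose content already exists simply returns the existing record.

```java
import java.io.*;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.*;

// Hedged sketch of a content-addressed store: re-adding identical content
// returns the same record ID instead of creating a new record.
public class ContentAddressedStoreSketch {
    private final Map<String, byte[]> records = new HashMap<>();

    // Returns the record ID; identical content always yields the same ID.
    public String addRecord(InputStream stream) {
        byte[] content;
        try {
            content = readAll(stream);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String id = hexSha256(content);
        records.putIfAbsent(id, content); // existing record is reused, not replaced
        return id;
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
        return out.toByteArray();
    }

    private static String hexSha256(byte[] data) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-256").digest(data))
                sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ContentAddressedStoreSketch store = new ContentAddressedStoreSketch();
        String a = store.addRecord(new ByteArrayInputStream("hello".getBytes()));
        String b = store.addRecord(new ByteArrayInputStream("hello".getBytes()));
        System.out.println(a.equals(b)); // prints true: same content, same record
    }
}
```

This is also why "prefer updating to creating new" is slippery: with content-hash naming, a match only ever exists for identical content, so there is nothing to update except metadata like the last access time.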
I assumed this was all handled at a higher level, and that once the data store is told to add a record it has already been determined that the write is okay, even if it turns out that the stream being written already exists somewhere.

* The JCR API allows creating binaries without nodes via ValueFactory (so it's not possible to apply storage filters at that time).

I admit I'm unsure how this might ever work. Maybe it would have to be solved by curation later.

What you didn't address is how to read if there are multiple possible storage locations, so I assume you didn't think about that case. In my view, this should be supported. You might want to read up on LSM trees for how to do that: using Bloom filters, for example.

I'll look at what I wrote again and try to make it clearer. I did think about it, at least if we are talking about the same thing. What I had in mind was that a read would go through each delegate in the specified order and attempt to find a match, returning the first match found. The order and the algorithm used would depend on the traversal strategy implementation. It would be possible to use LSM trees, and I can see how they would be used, although I wonder whether that would be overkill when I'd expect most real-world uses of the composite data store to have two or three delegates at most. What do you think? Is that what you were referring to, or are you talking about what to do if a blob exists in more than one location at the same time? Or something else entirely? I'm not sure I understand what you are referring to.

Thanks again for the review.

-MR
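The read traversal described above can be sketched as follows (names invented, not Oak API): walk the delegates in priority order and return the first match. A per-delegate membership hint, standing in for the Bloom filters used by LSM trees, lets the reader skip delegates that definitely do not hold the blob; with only two or three delegates this optimization may well be unnecessary.

```java
import java.util.*;

// Hedged sketch of a priority-ordered read path across delegate blob stores.
public class CompositeReadSketch {

    static class Delegate {
        final String name;
        final Map<String, byte[]> blobs;
        Delegate(String name, Map<String, byte[]> blobs) {
            this.name = name;
            this.blobs = blobs;
        }
        // A real Bloom filter answers "maybe present" / "definitely absent";
        // this exact-set stand-in simply never gives false positives.
        boolean mightContain(String blobId) { return blobs.containsKey(blobId); }
    }

    // Returns the first match found, traversing delegates in priority order.
    static Optional<byte[]> read(String blobId, List<Delegate> byPriority) {
        for (Delegate d : byPriority) {
            if (!d.mightContain(blobId)) continue; // skip without touching storage
            byte[] blob = d.blobs.get(blobId);
            if (blob != null) return Optional.of(blob);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Delegate d1 = new Delegate("D1", new HashMap<>());
        Map<String, byte[]> m = new HashMap<>();
        m.put("12345", "payload".getBytes());
        Delegate d2 = new Delegate("D2", m);
        System.out.println(read("12345", Arrays.asList(d1, d2)).isPresent()); // prints true
    }
}
```

Note that "return the first match" also answers the duplicate-copy question implicitly: if a blob exists in both D1 and D2, the higher-priority copy wins and the other is never observed by readers.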