Re: new name for the multiplexing node store
Hi,

On Fri, 2017-05-05 at 07:18 -0600, Matt Ryan wrote:
> I was wondering about this also WRT federated data store. If the intent
> and effect of both are the same ("both" meaning what is currently called
> the "multiplexing node store" and the proposed (and in-progress)
> "federated data store"), it seems they should use a similar naming
> convention at least.
>
> WDYT? Does that make it more confusing or less confusing?

I think the high-level intent is the same for both - compose a single {Data,Node}Store out of multiple sub-stores. The mechanisms might be different though, as the NodeStore is hierarchical in nature, while the BlobStore blob ids are opaque.

Also I still maintain :-) that federated blob stores will work well individually as they have no overall hierarchy to respect, while the multiplexed node stores will have to be composed to create a meaningful image.

Robert

> -MR
>
> On Fri, May 5, 2017 at 6:10 AM, Julian Sedding wrote:
>
> > Hi Tomek
> >
> > In all related discussions the term "mount" appears a lot. So why not
> > Mounting NodeStore? The module could be "oak-store-mount".
> >
> > Regards
> > Julian
> >
> > On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek wrote:
> > > Hello oak-dev,
> > >
> > > the multiplexing node store has been recently extracted from the
> > > oak-core into a separate module and I’ve used it as an opportunity to
> > > rename the thing. The name I suggested is Federated Node Store. Robert
> > > doesn’t agree it’s the right name, mostly because the “partial” node
> > > stores, creating the combined (multiplexing / federated) one, are not
> > > usable on their own and stores only a part of the overall repository
> > > content.
> > >
> > > Our arguments in their full lengths can be found in the OAK-6136 (last
> > > 3-4 comments), so there’s no need to repeat them here. We wanted to ask
> > > you for opinion about the name. We kind of agree that the “multiplexing”
> > > is not the best choice - can you suggest something else or maybe you
> > > think that “federated” is good enough?
> > >
> > > Thanks for the feedback.
> > >
> > > Regards,
> > > Tomek
> > >
> > > --
> > > Tomek Rękawek | Adobe Research | www.adobe.com
> > > reka...@adobe.com
Re: A federated data store
I put together a very crude initial POC which can be seen at [0]. This simply allows a FileDataStore to be used as a delegate data store and the FederatedDataStore to be used in Oak as the primary data store. The approach is simply that the FederatedDataStore has information about the delegates (one primary and zero or more secondaries) and can defer all actions to the appropriate delegate. The goal of this POC was to determine if this simple idea could possibly work.

I'm simply doing an internal mapping from a simple data store name to a fully qualified class name, and then using reflection to create the data store. This prevents coupling between the FederatedDataStore and other data stores but also limits it to only work with supported data store delegates.

One question I have with this has to do with basic correctness of approach. Is it acceptable to create the data store objects directly (e.g. OakCachingFDS), or should the service be going through OSGi to create other data store service objects instead (e.g. FileDataStoreService)? I have a concern that creating service objects may mean OSGi limits me to a single service, whereas if we create the data store objects directly we could have a number of them. For example, multiple S3DataStore objects, each with a different bucket for different purposes. But I'm not sure if that limitation on service objects really exists. Thoughts?

[0] - https://github.com/mattvryan/jackrabbit-oak/tree/federated-data-store/oak-blob-federated/src/main/java/org/apache/jackrabbit/oak/blob/federated

-MR

On Thu, Apr 20, 2017 at 12:20 PM, Matt Ryan wrote:
> Hi,
>
> I'm looking at the possibility of creating a new kind of data store, let's
> call it a federated data store, and wanted to see what everyone thinks
> about this.
>
> The basic idea is that the federated data store would allow for more than
> one data store to be configured for an Oak instance. Oak would then be
> able to choose which data store to use based on a number of criteria, like
> file size, JCR path, node type, existence of a node property, a node
> property value, or other items, or a combination of items. In my thinking
> these are defined in configuration so the federated data store would know
> how to select which data store is used to store which binary.
>
> I think this is a step towards UC14 - Hierarchical BlobStore in [0]. Once
> the federated data store was implemented we should be able to support UC14
> with little work. I can also foresee other possible capabilities it could
> offer, such as storing blobs for different node types in different data
> stores, or choosing from a few different data stores based on geographic
> location (UC2 in [0]).
>
> In my mind we could add capability to DataStoreBlobStore.writeStream()
> where the decision is made whether to write a stream to the data store
> delegate or put it in-memory. Instead we could defer the decision directly
> to the delegate, adding a method to the appropriate interface (BlobStore or
> GarbageCollectibleBlobStore) to handle this decision, and default the
> decision in AbstractBlobStore to be based on the record size (which is the
> current behavior, except currently that decision is made in
> DataStoreBlobStore IIUC). All other existing data stores should then
> behave the same. But in the case of the federated data store this decision
> would be more involved, selecting the right data store based on
> configuration.
>
> The federated data store would need to exist independent of other data
> stores, so figuring out how to create those data stores without having a
> code dependency would be a challenge to figure out.
>
> Please let me know what you think, is my idea about the implementation
> flawed, is there a better way to accomplish this, what concerns are there
> about it, etc. I'd like to brainstorm with the list something that can
> work in this area and then I'll create a ticket for it. Or I can create
> the ticket, and we can have the discussion in the ticket. Let me know
> which is best.
>
> [0] - https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase
>
> - Matt Ryan
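
For readers following the thread, the name-to-class mapping with reflective instantiation described in the POC message could be sketched roughly as below. This is a minimal illustration and not the code in the linked branch: `DelegateFactory`, its method names, and the short names passed to it are hypothetical, and JDK classes stand in for data store classes so the sketch runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: maps short delegate names to fully qualified class
 * names and creates instances reflectively, so the factory has no
 * compile-time dependency on the classes it creates.
 */
final class DelegateFactory {
    // Only names registered here are supported, which mirrors the
    // "supported delegates only" limitation mentioned in the message.
    private final Map<String, String> classNames = new HashMap<>();

    void register(String shortName, String className) {
        classNames.put(shortName, className);
    }

    Object create(String shortName) {
        String className = classNames.get(shortName);
        if (className == null) {
            throw new IllegalArgumentException("Unsupported delegate: " + shortName);
        }
        try {
            // Requires a no-argument constructor on the target class.
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }
}
```

Each call to `create` returns a new object, so several delegates of the same class can coexist in this sketch. Whether the equivalent is possible when going through OSGi services is the open question raised in the message.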
Re: A federated data store
On Fri, Apr 21, 2017 at 7:20 AM, Davide Giannella wrote:
> On 20/04/2017 19:30, Matt Ryan wrote:
> > I misremembered above when I was describing a possible implementation. I
> > was thinking we'd add a method to the delegate, but that would be added to
> > the DataStore interface, obviously (not BlobStore or
> > GarbageCollectibleBlobStore). Likewise, the default implementation would
> > exist in AbstractDataStore (not AbstractBlobStore).
>
> I like the idea overall and I'm not familiar with the DS codebase so
> what I'm saying can be wrong.
>
> If I think about the idea without knowing the current implementation I
> would expect some sort of API which allows for the Visitor pattern to be
> leveraged. In this way in an OSGi environment we could simply pull in
> all the Visitor services and act and in plain java it will be more
> around the repository construction/configuration.

Davide, thanks for the suggestion of using the Visitor pattern. I spent a fair bit of time over the past couple of weeks researching the Visitor pattern again and thinking about how it would apply. I am not opposed to using that or any other relevant design pattern (I'm generally a fan). But I'm struggling to see how the Visitor pattern would work here, so maybe you can help me see what you had in mind.

From [0] there is an image of a sequence diagram for the visitor pattern [1] that is essentially taken right out of the GoF "Design Patterns" book. Looking at the sequence diagram and trying to map it to this problem:

- I believe the class labeled "xx:Composite" would be the FederatedDataStore (some class within this component).
- I believe the classes labeled "anA:ConcreteA" and "aB:ConcreteB" would be delegate data stores, e.g. FileDataStore, S3DataStore, or something like that.
- I believe the class labeled "v:ConcreteVisitorType1" is ... ???

That's where I get stuck - I can't figure out what the delegated data stores would be visiting.

In the GoF "Design Patterns" book for the Visitor Pattern under "Applicability" (page 333):

- Bullet one says use the Visitor when "an object structure contains many classes of objects with differing interfaces". Shouldn't be the case here - all the data store delegates should be able to be treated pretty much the same.
- Bullet two says use the Visitor when "many distinct and unrelated operations need to be performed on an object structure, and you want to avoid 'polluting' their classes with these operations." I don't think this applies either - the operations are slightly different in implementation but similar in purpose, and are not unrelated; we don't need to perform many operations but rather select which one is right; we actually do want to 'pollute' their classes with the operations, because it is within those classes where the logic to do the operation is contained.

Can you help me see what you had in mind? I think I'm missing it.

[0] - http://www.ghytred.com/ShowArticle.aspx?VisitorPattern
[1] - http://www.ghytred.com/images/visitor2.jpg

-MR
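
To make the role mapping in the message above concrete, here is the bare GoF Visitor structure written out with the three roles from the sequence diagram. Everything in it is hypothetical: `CompositeStore`, `FileDelegate`, `S3Delegate`, and `NameCollectingVisitor` are stand-ins and not Oak types, and the visitor's behaviour (collecting a label per delegate) is an arbitrary placeholder, since what a visitor would actually do here is exactly the open question.

```java
import java.util.ArrayList;
import java.util.List;

// Visitor interface: one method per concrete element type (double dispatch).
interface DelegateVisitor {
    void visitFile(FileDelegate d);
    void visitS3(S3Delegate d);
}

// Element interface: each delegate accepts a visitor.
interface Delegate {
    void accept(DelegateVisitor v);
}

// Plays the "anA:ConcreteA" role.
final class FileDelegate implements Delegate {
    public void accept(DelegateVisitor v) { v.visitFile(this); }
}

// Plays the "aB:ConcreteB" role.
final class S3Delegate implements Delegate {
    public void accept(DelegateVisitor v) { v.visitS3(this); }
}

// Plays the "xx:Composite" role: forwards the visitor to every delegate.
final class CompositeStore {
    private final List<Delegate> delegates = new ArrayList<>();

    void add(Delegate d) { delegates.add(d); }

    void accept(DelegateVisitor v) {
        for (Delegate d : delegates) {
            d.accept(v);
        }
    }
}

// Plays the "v:ConcreteVisitorType1" role with placeholder behaviour.
final class NameCollectingVisitor implements DelegateVisitor {
    final List<String> seen = new ArrayList<>();

    public void visitFile(FileDelegate d) { seen.add("file"); }
    public void visitS3(S3Delegate d) { seen.add("s3"); }
}
```

Note that the visitor interface needs one method per delegate type. That pays off when the element types have differing interfaces, which is the applicability condition the message argues does not hold for data store delegates.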
Re: Intent to backport to 1.6: OAK-5641
Hi,

I'm not sure what the latest is here, but isn't this regarded as bad practice? Maintenance branches have always been subsets of the subsequent branch: 1.0 < 1.2 < 1.4 < 1.6 and so on? Any 1.6 release should contain *all* of 1.4 (give or take a week or two until the next stable release). Has this changed in the meantime?

alex

On Fri, May 5, 2017 at 12:26 PM, Julian Reschke wrote:
> ...change was already applied to 1.4, so we really ought to have it in 1.6
> as well.
Re: when to close bugs
Hi,

This might be a side effect of the many maintenance branches we have. Based on the current process, releasing any of them would mark the issue as 'closed', but I think it should be fine as long as the people doing the releases don't lose track of these issues.

For example branch 1.2 might have 2 open issues which get closed by a 1.4 release, and if someone were to only check 'open issues' against the 1.2 branch, they'd be wrongly assuming that there is no reason to release 1.2 as there's nothing fixed.

Otherwise confusion wrt. release version dates can be easily cleared by looking at the versions tab in JIRA, right? [0]

best,
alex

[0] https://issues.apache.org/jira/browse/OAK/?selectedTab=com.atlassian.jira.jira-projects-plugin:versions-panel

On Fri, May 5, 2017 at 12:30 PM, Julian Reschke wrote:
> Hi there,
>
> we currently close bugs when a release is made which fixes the bug.
>
> This can lead to somewhat strange effects, if a stable release is made
> before a release from trunk happened.
>
> For instance, OAK-5641 was marked closed when 1.4.14 was released, but
> it's not actually fixed in any unstable release from trunk.
>
> Best regards, Julian
Expiring the META/repository-* markers used by MarkSweepGarbageCollector ?
Hi,

When a client connects only temporarily to a SharedS3DataStore (for example) and then goes away, the META/repository-* marker created by SegmentNodeStoreService is not removed. This causes MarkSweepGarbageCollector to abort with a "not all repositories have marked references available" message.

Do people see an issue with adding an expiration time to those META/repository-* markers? MarkSweepGarbageCollector can then ignore expired markers, considering them to belong to long gone clients.

I suppose that the expiration time can be stored as data in the marker blob, and it would have to be refreshed periodically by the client, unless configured to never expire.

I can provide a patch for that but wanted to first check for any issues that I overlooked, as I'm not familiar with that code. What do people think?

-Bertrand
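
The expiration idea in the message above could look roughly like the following. This is a sketch under stated assumptions and not existing MarkSweepGarbageCollector or SharedS3DataStore API: the class `MarkerExpiry`, the choice to store the expiry as epoch milliseconds in the marker payload, and the convention that 0 means "never expires" are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for reading and writing an expiry time in a marker's data. */
final class MarkerExpiry {
    /** Assumed convention: a stored value of 0 means the marker never expires. */
    static final long NEVER = 0L;

    /** Encodes the expiry time (epoch millis) as the marker payload. */
    static byte[] encode(long expiresAtMillis) {
        return Long.toString(expiresAtMillis).getBytes(StandardCharsets.UTF_8);
    }

    /** Payload a client would write on each periodic refresh. */
    static byte[] refresh(long nowMillis, long ttlMillis) {
        return encode(nowMillis + ttlMillis);
    }

    /** True if the marker has an expiry time and that time has been reached. */
    static boolean isExpired(byte[] markerData, long nowMillis) {
        String text = new String(markerData, StandardCharsets.UTF_8).trim();
        long expiresAt = Long.parseLong(text);
        return expiresAt != NEVER && nowMillis >= expiresAt;
    }
}
```

A collector following this scheme would skip markers for which `isExpired` returns true before checking that every remaining repository has marked references available. A client that stops refreshing would drop out after one TTL.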
Re: new name for the multiplexing node store
I was wondering about this also WRT federated data store. If the intent and effect of both are the same ("both" meaning what is currently called the "multiplexing node store" and the proposed (and in-progress) "federated data store"), it seems they should use a similar naming convention at least.

WDYT? Does that make it more confusing or less confusing?

-MR

On Fri, May 5, 2017 at 6:10 AM, Julian Sedding wrote:
> Hi Tomek
>
> In all related discussions the term "mount" appears a lot. So why not
> Mounting NodeStore? The module could be "oak-store-mount".
>
> Regards
> Julian
>
> On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek wrote:
> > Hello oak-dev,
> >
> > the multiplexing node store has been recently extracted from the
> > oak-core into a separate module and I’ve used it as an opportunity to
> > rename the thing. The name I suggested is Federated Node Store. Robert
> > doesn’t agree it’s the right name, mostly because the “partial” node
> > stores, creating the combined (multiplexing / federated) one, are not
> > usable on their own and stores only a part of the overall repository
> > content.
> >
> > Our arguments in their full lengths can be found in the OAK-6136 (last
> > 3-4 comments), so there’s no need to repeat them here. We wanted to ask
> > you for opinion about the name. We kind of agree that the “multiplexing”
> > is not the best choice - can you suggest something else or maybe you
> > think that “federated” is good enough?
> >
> > Thanks for the feedback.
> >
> > Regards,
> > Tomek
> >
> > --
> > Tomek Rękawek | Adobe Research | www.adobe.com
> > reka...@adobe.com
BUILD FAILURE: Jackrabbit Oak - Build # 259 - Failure
The Apache Jenkins build system has built Jackrabbit Oak (build #259)

Status: Failure

Check console output at https://builds.apache.org/job/Jackrabbit%20Oak/259/ to view the results.

Changes:
[chetanm] OAK-6176 - Service to provide access to async indexer state

[chetanm] OAK-6176 - Service to provide access to async indexer state
-- Add a method to construct lastIndexedTo property
-- Add a method to check if given name is async name or not

[chetanm] OAK-6176 - Service to provide access to async indexer state
Expose the async indexer name as part of IndexStatsMBean interface itself in addition to service property

Test results:
1 tests failed.
FAILED: org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore2Datasources

Error Message:
Service of type interface org.apache.jackrabbit.oak.spi.state.NodeStore was found. Expression: (sr == null). Values: sr = [org.apache.jackrabbit.oak.spi.state.NodeStore, org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore, org.apache.jackrabbit.oak.spi.state.Clusterable]

Stack Trace:
java.lang.AssertionError: Service of type interface org.apache.jackrabbit.oak.spi.state.NodeStore was found. Expression: (sr == null). Values: sr = [org.apache.jackrabbit.oak.spi.state.NodeStore, org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore, org.apache.jackrabbit.oak.spi.state.Clusterable]
	at org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore2Datasources(DocumentNodeStoreConfigTest.groovy:110)
Re: new name for the multiplexing node store
Hi Tomek

In all related discussions the term "mount" appears a lot. So why not Mounting NodeStore? The module could be "oak-store-mount".

Regards
Julian

On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek wrote:
> Hello oak-dev,
>
> the multiplexing node store has been recently extracted from the oak-core
> into a separate module and I’ve used it as an opportunity to rename the
> thing. The name I suggested is Federated Node Store. Robert doesn’t agree
> it’s the right name, mostly because the “partial” node stores, creating the
> combined (multiplexing / federated) one, are not usable on their own and
> stores only a part of the overall repository content.
>
> Our arguments in their full lengths can be found in the OAK-6136 (last 3-4
> comments), so there’s no need to repeat them here. We wanted to ask you for
> opinion about the name. We kind of agree that the “multiplexing” is not the
> best choice - can you suggest something else or maybe you think that
> “federated” is good enough?
>
> Thanks for the feedback.
>
> Regards,
> Tomek
>
> --
> Tomek Rękawek | Adobe Research | www.adobe.com
> reka...@adobe.com
new name for the multiplexing node store
Hello oak-dev,

the multiplexing node store has been recently extracted from the oak-core into a separate module and I’ve used it as an opportunity to rename the thing. The name I suggested is Federated Node Store. Robert doesn’t agree it’s the right name, mostly because the “partial” node stores, creating the combined (multiplexing / federated) one, are not usable on their own and store only a part of the overall repository content.

Our arguments in their full lengths can be found in the OAK-6136 (last 3-4 comments), so there’s no need to repeat them here. We wanted to ask you for your opinion about the name. We kind of agree that the “multiplexing” is not the best choice - can you suggest something else or maybe you think that “federated” is good enough?

Thanks for the feedback.

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
when to close bugs
Hi there,

we currently close bugs when a release is made which fixes the bug.

This can lead to somewhat strange effects, if a stable release is made before a release from trunk happened.

For instance, OAK-5641 was marked closed when 1.4.14 was released, but it's not actually fixed in any unstable release from trunk.

Best regards, Julian
Intent to backport to 1.6: OAK-5641
...change was already applied to 1.4, so we really ought to have it in 1.6 as well.
Intent to backport to 1.6/1.4/1.2/1.0: OAK-5667
https://issues.apache.org/jira/browse/OAK-5667 (dead code)
Re: MongoMK failover behaviour.
Hi,

On 04/05/17 16:56, Justin Edelson wrote:
>> Hmm, depending on the Oak version, this may also be caused by OAK-5528.
>> The current fix versions are 1.4.15 and 1.6.0.
>
> Would this show up in thread dumps? Based on the description, it seems like
> it should.

Not necessarily. In OAK-5528 the lease update thread goes into performLeaseCheck which will do a 5x1sec retry loop. So if the thread dump is taken during that time one would see it - if taken afterwards not.

Cheers,
Stefan
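
A bounded retry loop of the kind described above (five attempts, one second apart) might be shaped like this. It is a generic sketch and not the OAK-5528 code: `BoundedRetry`, its parameters, and the condition being retried are hypothetical. It only illustrates why a thread dump shows the thread inside the loop for a few seconds at most and shows nothing once the loop has returned.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical bounded retry: check, sleep, repeat, then give up. */
final class BoundedRetry {
    /**
     * Runs the check up to `attempts` times, sleeping between failed
     * attempts. Returns true as soon as the check succeeds, false if
     * every attempt fails or the thread is interrupted.
     */
    static boolean retry(BooleanSupplier check, int attempts, long sleepMillis) {
        for (int i = 0; i < attempts; i++) {
            if (check.getAsBoolean()) {
                return true;
            }
            try {
                // While sleeping here, the thread is visible in a thread dump.
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

With `attempts = 5` and `sleepMillis = 1000`, the window in which a dump would catch the thread in the loop is about five seconds.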
[ANNOUNCE] Apache Jackrabbit 2.15.2 released
The Apache Jackrabbit community is pleased to announce the release of Apache Jackrabbit 2.15.2. The release is available for download at:

    https://jackrabbit.apache.org/jcr/downloads.html#v2.15

See the full release notes below for details about this release:

Release Notes -- Apache Jackrabbit -- Version 2.15.2

Introduction
------------

This is Apache Jackrabbit(TM) 2.15.2, a fully compliant implementation of the Content Repository for Java(TM) Technology API, version 2.0 (JCR 2.0) as specified in the Java Specification Request 283 (JSR 283).

Apache Jackrabbit 2.15.2 is an unstable release cut directly from Jackrabbit trunk, with a focus on new features and other improvements. For production use we recommend the latest stable 2.14.x release.

Changes in Jackrabbit 2.15.2
----------------------------

Bug

    [JCR-4118] - RepositoryChecker creates invalid node names
    [JCR-4121] - ConcurrentModificationException in InternalVersionHistoryImpl.fixLegacy()
    [JCR-4133] - fix javadoc problems that are errors with JDK8

Task

    [JCR-4112] - Require Java 8
    [JCR-4119] - Upgrade httpcomponents/httpmime to 4.5.3
    [JCR-4122] - align parent pom references with Oak
    [JCR-4127] - update to latest apache parent pom (18)
    [JCR-4128] - update maven plugins and require Maven 3.2.1
    [JCR-4129] - get rid of unused org.json dependency

In addition to the above-mentioned changes, this release contains all the changes included up to the Apache Jackrabbit 2.14.x release.

For more detailed information about all the changes in this and other Jackrabbit releases, please see the Jackrabbit issue tracker at

    https://issues.apache.org/jira/browse/JCR

Release Contents
----------------

This release consists of a single source archive packaged as a zip file. The archive can be unpacked with the jar tool from your JDK installation. See the README.txt file for instructions on how to build this release.

The source archive is accompanied by SHA1 and MD5 checksums and a PGP signature that you can use to verify the authenticity of your download. The public key used for the PGP signature can be found at https://svn.apache.org/repos/asf/jackrabbit/dist/KEYS.

About Apache Jackrabbit
-----------------------

Apache Jackrabbit is a fully conforming implementation of the Content Repository for Java Technology API (JCR). A content repository is a hierarchical content store with support for structured and unstructured content, full text search, versioning, transactions, observation, and more.

For more information, visit http://jackrabbit.apache.org/

About The Apache Software Foundation
------------------------------------

Established in 1999, The Apache Software Foundation provides organizational, legal, and financial support for more than 140 freely-available, collaboratively-developed Open Source projects. The pragmatic Apache License enables individual and commercial users to easily deploy Apache software; the Foundation's intellectual property framework limits the legal exposure of its 3,800+ contributors.

For more information, visit http://www.apache.org/

Trademarks
----------

Apache Jackrabbit, Jackrabbit, Apache, the Apache feather logo, and the Apache Jackrabbit project logo are trademarks of The Apache Software Foundation.