Re: Access azure segments metadata in a case-insensitive way
Hi Aravindo,

I’m in favour of merging the patch. I think being strict in what we write and tolerant in what we read is a good thing. Please create an OAK issue and ping me and Andrei Dulceanu, so we can merge it.

Regards,
Tomek

--
Tomek Rękawek | ASF committer | www.apache.org
tom...@apache.org

> On 23 Jan 2020, at 11:08, Aravindo Wingeier wrote:
>
> Hi devs,
>
> We use azcopy to copy segments from one Azure blob container to another for testing. There is a bug in the current version of azcopy (10.3.3), which makes all metadata keys start with a capital letter - "type" becomes "Type". As a consequence, the current implementation cannot find the segments in the Azure blob storage.
>
> The azcopy issue was already reported [1] in 2018; I am contacting MS directly to follow up on it. As an alternative, we currently use azcopy version 7, which is much slower and has reliability issues.
>
> I have little hope that azcopy will be fixed soon, so I suggest a patch to oak-segment-azure that would be backward compatible and ignore the case of the keys when reading metadata. See the patch draft at [2].
>
> What do you think is the best way forward?
>
> Best regards,
>
> Aravindo Wingeier
>
> [1]: https://github.com/Azure/azure-storage-azcopy/issues/113
> [2]: https://github.com/apache/jackrabbit-oak/pull/173
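The case-insensitive read described above can be sketched roughly like this (class and method names are hypothetical, not the actual oak-segment-azure API; the real patch is in the linked PR):

```java
import java.util.HashMap;
import java.util.Map;

public class CaseInsensitiveMetadata {

    // Look up a blob metadata key ignoring case, preferring an exact match
    // so well-formed metadata is unaffected (strict in what we write,
    // tolerant in what we read).
    public static String getMetadata(Map<String, String> metadata, String key) {
        String exact = metadata.get(key);
        if (exact != null) {
            return exact;
        }
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            if (e.getKey().equalsIgnoreCase(key)) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("Type", "data"); // azcopy 10.3.3 capitalises the key
        System.out.println(getMetadata(meta, "type")); // prints "data" despite the case change
    }
}
```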
Intent to backport OAK-8124
Hello,

I’d like to backport OAK-8124 to 1.10 and 1.8. This patch adds the security-related commit hooks, which were missing for the partial oak->oak migration in oak-upgrade.

Regards,
Tomek

--
Tomek Rękawek | ASF committer | www.apache.org
tom...@apache.org
Intent to backport: OAK-7540 to 1.8.x
Hello,

The unique indices may sometimes break on the Composite Node Store. Vikas fixed this in OAK-7540. I'd like to backport the fix to the 1.8 branch, following a user request.

Regards,
Tomek

--
Tomek Rękawek | ASF committer | www.apache.org
tom...@apache.org
Intent to backport: OAK-7686 and OAK-7687
Hello,

The issues in the subject fix incorrect behaviour of oak-upgrade when a partial migration is done (e.g. only /content/site is being migrated). In this case, a full reindexing is triggered after starting the target repository. I plan to backport the issues to the 1.8 and 1.6 branches, as we have an Oak 1.6 user who was hit by the described problem.

Regards,
Tomek

--
Tomek Rękawek | ASF committer | www.apache.org
tom...@apache.org
Re: Decide if a composite node store setup expose multiple checkpoint mbeans
Hello Vikas,

I think there was a similar case, described in OAK-5309 (multiple instances of the RevisionGCMBean). We introduced an extra property there - “role” - which can be used to differentiate the mbeans. It’s similar to option 2 in your email. An empty role means that the mbean is related to the “main” node store, while a non-empty one is only used for the partial node stores gathered together by the CNS. Maybe we can use a similar approach here?

Regards,
Tomek

--
Tomek Rękawek | ASF committer | www.apache.org
tom...@apache.org

> On 5 Jul 2018, at 23:59, Vikas Saurabh wrote:
>
> Hi,
>
> We recently discovered OAK-7610 [0], where ActiveDeletedBlobCollectorMBeanImpl got confused due to multiple implementations of CheckpointMBean being exposed in composite node store setups (since OAK-6315 [1], which implemented the checkpoint bean for the composite node store).
>
> For the time being, we are going to avoid that confusion by changing ActiveDeletedBlobCollectorMBeanImpl to keep on returning the oldest checkpoint timestamp if all CheckpointMBean implementations report the same oldest checkpoint timestamp. But that "work-around" only works currently because the composite node store uses the global node store to list checkpoints to get the oldest timestamp... the approach is incorrect in general, as there's no such guarantee.
>
> So, here's the question for the discussion: how should the situation be handled correctly? Afaict, there are a few options (in decreasing order of my preference):
> 1. there's only a single checkpoint mbean exposed (that implies that mounted node store services need to "know" that they are mounted stores and hence shouldn't expose their own bean)
> 2. composite node store's CheckpointMBean implementation can expose some metadata (say, implement a marker interface) - discovering such an implementation can mean "use this implementation for repository-level functionality"
> 3. keep the work-around to be implemented in OAK-7610 [0] but document (ensure??) the assumption that "all implementations would have the same oldest checkpoint timestamp"
>
> Would love to get some feedback.
>
> [0]: https://issues.apache.org/jira/browse/OAK-7610
> [1]: https://issues.apache.org/jira/browse/OAK-7315
>
> Thanks,
> Vikas
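The “role” convention from OAK-5309 can be sketched as follows (the class, the `jmx.objectname` value, and the role strings are illustrative; only the empty-vs-non-empty role semantics mirror the proposal above):

```java
import java.util.Hashtable;

public class MBeanRoleExample {

    // Build the OSGi service registration properties for a checkpoint mbean.
    // An empty role marks the "main" node store's bean; a non-empty role
    // marks a bean belonging to one of the stores mounted by the
    // composite node store.
    public static Hashtable<String, Object> mbeanProperties(String role) {
        Hashtable<String, Object> props = new Hashtable<>();
        props.put("jmx.objectname",
                "org.apache.jackrabbit.oak:name=checkpoint,role=" + role);
        props.put("role", role);
        return props;
    }

    // A client that needs the repository-level bean (such as
    // ActiveDeletedBlobCollectorMBeanImpl) would filter on the empty role
    // instead of guessing among multiple CheckpointMBean instances.
    public static boolean isMainStoreBean(Hashtable<String, Object> props) {
        return "".equals(props.get("role"));
    }

    public static void main(String[] args) {
        System.out.println(isMainStoreBean(mbeanProperties("")));           // true
        System.out.println(isMainStoreBean(mbeanProperties("mount-libs"))); // false
    }
}
```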
Intent to backport OAK-7335
Hi,

I’m planning to backport OAK-7335 to 1.6.x. It’ll make the oak-upgrade more permissive when migrating nodes with long names (details in the issue).

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
Re: Azure Segment Store
Hi Ian,

> On 5 Mar 2018, at 17:47, Ian Boston <i...@tfd.co.uk> wrote:
>
> I assume that the patch deals with the 50K limit[1] to the number of blocks per Azure Blob store?

As far as I understand, it’s the limit that applies to the number of blocks in a single blob. A block is a single write. Since the segments are immutable (written at once), we don’t need to worry about this limit for the segments. It’s a different case for the journal file - a single commit leads to a single append, which adds a block. However, the patch takes care of this by creating journal.log.001, .002, etc. when we’re close to the limit [1].

Regards,
Tomek

[1] https://github.com/trekawek/jackrabbit-oak/blob/OAK-6922/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/AzureJournalFile.java#L37

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
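The rollover behaviour described above can be sketched as follows (class, method, and constant names are hypothetical; the actual logic lives in AzureJournalFile, linked above):

```java
public class JournalRollover {

    // Azure block blobs allow at most 50,000 committed blocks; each journal
    // append commits one block, so a fresh journal file must be started
    // before the limit is reached.
    static final int MAX_BLOCKS_PER_BLOB = 50_000;

    // Given the current journal blob name and its committed block count,
    // return the blob name the next append should go to.
    public static String nextJournalName(String current, int blockCount) {
        if (blockCount < MAX_BLOCKS_PER_BLOB) {
            return current;
        }
        // roll over: "journal.log.001" -> "journal.log.002"
        int dot = current.lastIndexOf('.');
        int suffix = Integer.parseInt(current.substring(dot + 1));
        return String.format("%s.%03d", current.substring(0, dot), suffix + 1);
    }

    public static void main(String[] args) {
        System.out.println(nextJournalName("journal.log.001", 10));     // journal.log.001
        System.out.println(nextJournalName("journal.log.001", 50_000)); // journal.log.002
    }
}
```

Segments never hit this path because each segment blob is written in one shot; only the append-heavy journal needs the rollover.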
Azure Segment Store
Hello,

I prepared a prototype for the Azure-based Segment Store, which allows persisting all the SegmentMK-related resources (segments, journal, manifest, etc.) on a remote service, namely the Azure Blob Storage [1]. The whole description of the approach, data structures, etc., as well as the patch, can be found in OAK-6922. It uses the extension points introduced in OAK-6921.

While it’s still experimental code, I’d like to commit it to trunk rather sooner than later. The patch is already pretty big and I’d like to avoid developing it “privately” on my own branch. It’s a new, optional Maven module, which doesn’t change any existing behaviour of Oak or SegmentMK. The only change it makes externally is adding a few exports to oak-segment-tar, so it can use the SPI introduced in OAK-6921. We may narrow these exports to a single package if you think it’d be good for encapsulation.

There’s a related issue, OAK-7297, which introduces a new fixture for benchmarks and ITs. After merging it, all the Oak integration tests pass on the Azure Segment Store.

Looking forward to your feedback.

Regards,
Tomek

[1] https://azure.microsoft.com/en-us/services/storage/blobs/

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
Intent to backport: OAK-6878
Hello,

I plan to backport OAK-6878 to the 1.6 branch today, so it’ll be included in the Monday release. It allows setting the S3DataStore configuration fields using a properties file. It was requested by a customer - without this patch it’s impossible to set the cacheSize for the S3 migration.

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
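Such a properties file might look roughly like this (a sketch only: of the keys below, only cacheSize is mentioned in the mail; the others are typical S3DataStore-style settings, and the authoritative key names are defined by the S3DataStore documentation):

```
# Hypothetical S3DataStore properties file for an oak-upgrade S3 migration
accessKey=...
secretKey=...
s3Bucket=my-bucket
s3Region=us-east-1
cacheSize=17179869184
```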
Re: [CompositeDataStore] How to properly create delegate data stores?
Hi Matt,

> On 24 Oct 2017, at 21:54, Matt Ryan <o...@mvryan.org> wrote:
>
> It is still unclear to me how this works in terms of configuration files, and how this would work for the CompositeDataStore. This is how I believe it would work for two FileDataStores in the composite:
>
> FDS config 1:
>
> path=datastore/ds1
> role=local1
>
> FDS config 2:
>
> path=datastore/ds2
> role=local2
>
> CompositeDataStore config:
>
> local1:readOnly=false
> local2:readOnly=true
>
> Something like that anyway.

Yes, I’d see something like this too.

> My questions then are: How do we store both FileDataStore configuration files when both have the same PID? What is the file name for each one? And how do they associate with the FileDataStoreFactory?

For the factory services we use suffixes for the config files:

org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStoreFactory-local1.cfg
org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStoreFactory-local2.cfg
org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStoreFactory-other.cfg

OSGi knows that the […].FileDataStoreFactory is a factory and creates as many instances as needed, binding the provided configurations.

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
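Putting the two halves of the thread together, the on-disk layout would look roughly like this (file contents taken from Matt's example above; the comment lines just name each file and are not part of it):

```
# org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStoreFactory-local1.cfg
path=datastore/ds1
role=local1

# org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStoreFactory-local2.cfg
path=datastore/ds2
role=local2
```

Each file name carries the factory PID plus a unique suffix, so Configuration Admin treats each file as a separate factory configuration bound to the same FileDataStoreFactory.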
Intent to backport OAK-6604 and OAK-6611 to 1.6
Hi,

I plan to backport these two issues. They improve the S3 resilience in oak-upgrade by using the newer version of S3DataStore and waiting until all the uploads are finished.

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
Re: [CompositeDataStore] How to properly create delegate data stores?
Hello Matt,

Please find my replies inlined.

> On 4 Oct 2017, at 00:13, Matt Ryan <o...@mvryan.org> wrote:
>
>> 1. Create new BlobStoreProvider interface, with just one method: getBlobStore().
>> 2. Modify all the existing blob store services, adding an optional “role” property (any string).
>
> One concern I have with this approach is that if we want a data store to be usable as a CompositeDataStore delegate, that data store has to make specific provisions to do this. My thinking was that it would be preferable to have the CompositeDataStore contain as much of the logic as possible. Ideally a data store should work as a delegate without having to make any changes to the data store itself. (Not sure if we can achieve this, but…)

Could you elaborate on what kind of provisioning is required for the delegates? From what I understand, you didn’t plan to rely on OSGi to get the delegate data stores, but to initialise all of them in the CompositeDataStore (“contain as much of the logic as possible”). I’m not sure this is the right approach. It means that the composite data store has to depend on every existing blob store and know its internals. If something changes in any blob store, the composite data store has to be updated as well. For data stores with a rich configuration (S3DataStore) this may get quite complex.

On the other hand, the OSGi-based approach makes the whole thing simpler, less coupled, extensible and easier to maintain. The CompositeDataStore doesn’t need to know any concrete implementation; it only relies on the BlobStore interface. OSGi will take care of providing the already-configured delegates.

>> 3. If the data store service is configured with this role, it should register the BlobStoreProvider service rather than a normal BlobStore.
>> 4. The CompositeDataStoreService should be configured with a list of blob store roles it should wait for.
>> 5. The CompositeDataStoreService has a MANDATORY_MULTIPLE @Reference of type BlobStoreProvider.
>> 6. Once (a) the CompositeDataStoreService is activated and (b) all the blob store providers are there, it’ll register a BlobStore service, which will be picked up by the node store.
>
> I have concerns about this part also. Which blob store providers should the CompositeDataStoreService wait for?
>
> For example, should it wait for S3DataStore? If yes, and if the installation doesn’t use the S3 connector, that provider will never show up, and therefore the CompositeDataStoreService would never get registered. If it doesn’t wait for S3DataStore but the installation does use S3DataStore, what happens if that bundle is unloaded?

As above, the CompositeDataStore won’t wait for any particular implementations, but for the BlobStoreProviders configured with the appropriate roles. It knows the role list, so it can tell when all the roles are in place.

For instance, we can configure the CompositeDataStore with the following role list: local1, local2, shared. Now, in OSGi we’re configuring two FileDataStores, named “local1” and “local2”, and also an S3DataStore named “shared”. The CompositeDataStore will be notified about all the data store registrations and, as soon as the three data stores are in place, it can carry on with its initialisation.

> Wouldn’t this approach require that every possible data store that can be a blob store provider for the composite be included in each installation that wants to use the CompositeDataStore?

No. The CompositeDataStore will only reference the BlobStoreProvider interface, not the actual implementations. It’ll even be possible for a customer to implement a completely new blob store and use it as a delegate (as long as they implement the BlobStoreProvider). Not that we expect customers to do that, but this kind of decoupling makes it easier to work on the Oak codebase.
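A plain-Java sketch of steps 1 and 4-6 above, with the OSGi binding simulated by direct calls (all names here mirror the email, not a real Oak API; in OSGi, `bind` would be the @Reference bind method):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CompositeDelegates {

    // Step 1: the single-method provider interface.
    public interface BlobStoreProvider {
        Object getBlobStore(); // would return an org.apache.jackrabbit.oak BlobStore
    }

    private final Set<String> requiredRoles;
    private final Map<String, BlobStoreProvider> bound = new HashMap<>();

    // Step 4: the composite is configured with the role list it waits for.
    public CompositeDelegates(Set<String> requiredRoles) {
        this.requiredRoles = requiredRoles;
    }

    // Steps 5-6: called once per registered BlobStoreProvider; returns true
    // when every required role is bound, i.e. when the composite may register
    // its own BlobStore service for the node store to pick up.
    public boolean bind(String role, BlobStoreProvider provider) {
        if (requiredRoles.contains(role)) {
            bound.put(role, provider);
        }
        return bound.keySet().containsAll(requiredRoles);
    }

    public static void main(String[] args) {
        Set<String> roles = new HashSet<>();
        roles.add("local1");
        roles.add("local2");
        roles.add("shared");
        CompositeDelegates composite = new CompositeDelegates(roles);
        composite.bind("local1", () -> null);
        composite.bind("local2", () -> null);
        // Only once the last role arrives is the composite ready to register.
        System.out.println(composite.bind("shared", () -> null)); // true
    }
}
```

Because the composite only sees the BlobStoreProvider interface and a role string, it stays decoupled from FileDataStore, S3DataStore, or any customer-supplied implementation.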
Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
Re: oak-upgrade blocking the 1.7.1 release
Hi,

In OAK-6306 Davide provided a workaround for the broken javadocs (using a different Lucene version for the javadoc generation only). We can’t upgrade the Lucene version used by oak-upgrade, because it’d break Jackrabbit 2 support - and we need to have that working in this module.

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com

> On 6 Jun 2017, at 09:23, Julian Reschke <julian.resc...@gmx.de> wrote:
>
> On 2017-06-06 09:09, Alex Parvulescu wrote:
>> I'm not convinced. If you look at the javadoc error, it is exactly like the one from OAK-6150 (blocking the 1.7.0 release at that time) that seemed to magically go away.
>
> Actually, I meant to make that a question.
>
> Davide, you did the 1.7.0 release, right? Do you recall how you got past the error?
>
> We may want to do this for 1.7.1 again, but in the mid term we need to fix this somehow...
>
> Best regards, Julian
backporting OAK-6294
Hi,

I’d like to backport OAK-6294 to 1.6 and 1.4 before Monday, so it’ll be included in Oak 1.4.16. It fixes an NPE reported by a customer.

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com
Re: [VOTE] Release Apache Jackrabbit 2.14.1
Hi Julian,

> On 29 May 2017, at 11:37, Julian Reschke <resc...@apache.org> wrote:
>
> [ ] +1 Release this package as Apache Jackrabbit 2.14.1

+1

Regards,
Tomek

--
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com