Re: new name for the multiplexing node store

2017-05-05 Thread Robert Munteanu
Hi,

On Fri, 2017-05-05 at 07:18 -0600, Matt Ryan wrote:
> I was wondering about this also WRT federated data store.  If the intent
> and effect of both are the same ("both" meaning what is currently called
> the "multiplexing node store" and the proposed (and in-progress)
> "federated data store"), it seems they should use a similar naming
> convention at least.
> 
> WDYT?  Does that make it more confusing or less confusing?

I think the high-level intent is the same for both - compose a single
{Data,Node}Store out of multiple sub-stores.

The mechanisms might be different though, as the NodeStore is
hierarchical in nature, while the BlobStore blob ids are opaque.

Also I still maintain :-) that federated blob stores will work well
individually as they have no overall hierarchy to respect, while the
multiplexed node stores will have to be composed to create a meaningful
image.

Robert

> 
> -MR
> 
> On Fri, May 5, 2017 at 6:10 AM, Julian Sedding wrote:
> 
> > Hi Tomek
> > 
> > In all related discussions the term "mount" appears a lot. So why not
> > Mounting NodeStore? The module could be "oak-store-mount".
> > 
> > Regards
> > Julian
> > 
> > 
> > On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek wrote:
> > > Hello oak-dev,
> > > 
> > > the multiplexing node store has been recently extracted from the
> > > oak-core into a separate module and I’ve used it as an opportunity to
> > > rename the thing. The name I suggested is Federated Node Store. Robert
> > > doesn’t agree it’s the right name, mostly because the “partial” node
> > > stores, creating the combined (multiplexing / federated) one, are not
> > > usable on their own and store only a part of the overall repository
> > > content.
> > > 
> > > Our arguments in their full lengths can be found in the OAK-6136 (last
> > > 3-4 comments), so there’s no need to repeat them here. We wanted to ask
> > > you for opinion about the name. We kind of agree that the “multiplexing”
> > > is not the best choice - can you suggest something else or maybe you
> > > think that “federated” is good enough?
> > > 
> > > Thanks for the feedback.
> > > 
> > > Regards,
> > > Tomek
> > > 
> > > --
> > > Tomek Rękawek | Adobe Research | www.adobe.com
> > > reka...@adobe.com
> > > 



Re: A federated data store

2017-05-05 Thread Matt Ryan
I put together a very crude initial POC which can be seen at [0].  This
simply allows a FileDataStore to be used as a delegate data store and the
FederatedDataStore to be used in Oak as the primary data store.

The approach is simply that the FederatedDataStore has information about
the delegates (one primary and zero or more secondaries) and can defer all
actions to the appropriate delegate.  The goal of this POC was to determine
if this simple idea could possibly work.  I'm simply doing an internal
mapping from a simple data store name to a fully qualified class name, and
then using reflection to create the data store.  This prevents coupling
between the FederatedDataStore and other data stores but also limits it to
only work with supported data store delegates.
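To make the name-to-class mapping concrete, here is a minimal sketch of the idea, not the code from the POC branch.  The class name DelegateFactory and the entries in the map are assumptions made up for illustration; JDK collection classes stand in for real data store classes so that the sketch runs without Oak on the classpath.

```java
import java.util.Map;

public class DelegateFactory {
    // Hypothetical mapping from a short configuration name to a fully
    // qualified class name. JDK classes stand in for real data stores.
    private static final Map<String, String> SUPPORTED = Map.of(
            "list", "java.util.ArrayList",
            "set", "java.util.HashSet");

    // Creates a delegate by its short name using reflection. Names missing
    // from the map are rejected, which is the "supported delegates only"
    // limitation described above.
    public static Object create(String name) {
        String className = SUPPORTED.get(name);
        if (className == null) {
            throw new IllegalArgumentException("Unsupported delegate: " + name);
        }
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        // No compile-time reference to the created class is needed here.
        System.out.println(create("list").getClass().getName());
    }
}
```

The caller only ever sees the short name, so there is no compile-time dependency on the delegate classes, at the cost of keeping the map up to date.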

One question I have with this has to do with basic correctness of
approach.  Is it acceptable to create the data store objects directly (e.g.
OakCachingFDS), or should the service be going through OSGi to create other
data store service objects instead (e.g. FileDataStoreService)?

I have a concern that creating service objects may mean OSGi limits me to a
single service, whereas if we create the data store objects directly we
could have a number of them.  For example, multiple S3DataStore objects,
each with a different bucket for different purposes.  But I'm not sure if
that limitation on service objects really exists.

Thoughts?


[0] -
https://github.com/mattvryan/jackrabbit-oak/tree/federated-data-store/oak-blob-federated/src/main/java/org/apache/jackrabbit/oak/blob/federated


-MR

On Thu, Apr 20, 2017 at 12:20 PM, Matt Ryan  wrote:

> Hi,
>
> I'm looking at the possibility of creating a new kind of data store, let's
> call it a federated data store, and wanted to see what everyone thinks
> about this.
>
> The basic idea is that the federated data store would allow for more than
> one data store to be configured for an Oak instance.  Oak would then be
> able to choose which data store to use based on a number of criteria, like
> file size, JCR path, node type, existence of a node property, a node
> property value, or other items, or a combination of items.  In my thinking
> these are defined in configuration so the federated data store would know
> how to select which data store is used to store which binary.
>
> I think this is a step towards UC14 - Hierarchical BlobStore in [0].  Once
> the federated data store was implemented we should be able to support UC14
> with little work.  I can also foresee other possible capabilities it could
> offer, such as storing blobs for different node types in different data
> stores, or choosing from a few different data stores based on geographic
> location (UC2 in [0]).
>
> In my mind we could add capability to DataStoreBlobStore.writeStream()
> where the decision is made whether to write a stream to the data store
> delegate or put it in-memory.  Instead we could defer the decision directly
> to the delegate, adding a method to the appropriate interface (BlobStore or
> GarbageCollectibleBlobStore) to handle this decision, and default the
> decision in AbstractBlobStore to be based on the record size (which is the
> current behavior, except currently that decision is made in
> DataStoreBlobStore IIUC).  All other existing data stores should then
> behave the same.  But in the case of the federated data store this decision
> would be more involved, selecting the right data store based on
> configuration.
>
> The federated data store would need to exist independent of other data
> stores, so figuring out how to create those data stores without having a
> code dependency would be a challenge to figure out.
>
>
> Please let me know what you think, is my idea about the implementation
> flawed, is there a better way to accomplish this, what concerns are there
> about it, etc.  I'd like to brainstorm with the list something that can
> work in this area and then I'll create a ticket for it.  Or I can create
> the ticket, and we can have the discussion in the ticket.  Let me know
> which is best.
>
>
> [0] - https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase
>
>
> - Matt Ryan
>

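As a rough illustration of the configuration-driven selection described in the quoted proposal, the sketch below evaluates an ordered list of rules and falls back to the primary store.  All names here (StoreSelector, BlobInfo, the store names) are hypothetical and not part of Oak; only two of the listed criteria (file size and JCR path) are modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class StoreSelector {
    // Hypothetical description of a binary that is about to be written.
    public static final class BlobInfo {
        public final String jcrPath;
        public final long size;

        public BlobInfo(String jcrPath, long size) {
            this.jcrPath = jcrPath;
            this.size = size;
        }
    }

    // A rule pairs a condition with the name of the delegate to use.
    private static final class Rule {
        final Predicate<BlobInfo> condition;
        final String storeName;

        Rule(Predicate<BlobInfo> condition, String storeName) {
            this.condition = condition;
            this.storeName = storeName;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final String primary;

    public StoreSelector(String primary) {
        this.primary = primary;
    }

    // Rules are evaluated in the order in which they were added.
    public StoreSelector when(Predicate<BlobInfo> condition, String storeName) {
        rules.add(new Rule(condition, storeName));
        return this;
    }

    // The first matching rule wins; otherwise the primary store is used.
    public String select(BlobInfo blob) {
        for (Rule rule : rules) {
            if (rule.condition.test(blob)) {
                return rule.storeName;
            }
        }
        return primary;
    }

    public static void main(String[] args) {
        StoreSelector selector = new StoreSelector("file")
                .when(b -> b.size > 10L * 1024 * 1024, "s3-large")
                .when(b -> b.jcrPath.startsWith("/content/archive/"), "s3-archive");
        System.out.println(selector.select(new BlobInfo("/content/a.png", 2048)));
    }
}
```

Other criteria from the proposal (node type, property existence or value) would be further predicates over a richer BlobInfo.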

Re: A federated data store

2017-05-05 Thread Matt Ryan
On Fri, Apr 21, 2017 at 7:20 AM, Davide Giannella  wrote:

> On 20/04/2017 19:30, Matt Ryan wrote:
> > I misremembered above when I was describing a possible implementation.  I
> > was thinking we'd add a method to the delegate, but that would be added to
> > the DataStore interface, obviously (not BlobStore or
> > GarbageCollectibleBlobStore).  Likewise, the default implementation would
> > exist in AbstractDataStore (not AbstractBlobStore).
>
> I like the idea overall and I'm not familiar with the DS codebase so
> what I'm saying can be wrong.
>
> If I think about the idea without knowing the current implementation I
> would expect some sort of API which allows for the Visitor pattern to be
> leveraged. In this way in an OSGi environment we could simply pull in
> all the Visitor services and act and in plain java it will be more
> around the repository construction/configuration.
>
>
Davide, thanks for the suggestion of using the Visitor pattern.

I spent a fair bit of time over the past couple of weeks researching the
Visitor pattern again and thinking about how it would apply.  I am not
opposed to using that or any other relevant design pattern (I'm generally a
fan).  But I'm struggling to see how the Visitor pattern would work here,
so maybe you can help me see what you had in mind.

From [0] there is an image of a sequence diagram for the visitor pattern
[1] that is essentially taken right out of the GoF "Design Patterns" book.
Looking at the sequence diagram and trying to map it to this problem:
-  I believe the class labeled "xx:Composite" would be the
FederatedDataStore (some class within this component).
-  I believe the classes labeled "anA:ConcreteA" and "aB:ConcreteB" would
be delegate data stores, e.g. FileDataStore, S3DataStore, or something like
that.
-  I believe the class labeled "v:ConcreteVisitorType1" is ... ???

That's where I get stuck - I can't figure out what the delegated data
stores would be visiting.
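For reference, the roles in that sequence diagram can be written down as the sketch below.  It only shows the shape of the pattern: the type names are stand-ins, and the NameCollector visitor is a made-up operation, since what a real visitor would do here is exactly the open question.

```java
import java.util.ArrayList;
import java.util.List;

public class VisitorSketch {
    // Visitor: one method per concrete element type.
    public interface StoreVisitor {
        void visitFile(FileStore store);
        void visitS3(S3Store store);
    }

    // Element: accepts a visitor and dispatches to the matching method.
    public interface Store {
        void accept(StoreVisitor visitor);
    }

    // Stand-in for "anA:ConcreteA".
    public static final class FileStore implements Store {
        public void accept(StoreVisitor visitor) { visitor.visitFile(this); }
    }

    // Stand-in for "aB:ConcreteB".
    public static final class S3Store implements Store {
        public void accept(StoreVisitor visitor) { visitor.visitS3(this); }
    }

    // Stand-in for "xx:Composite": forwards the visitor to every delegate.
    public static final class Federated implements Store {
        private final List<Store> delegates = new ArrayList<>();

        public Federated add(Store store) {
            delegates.add(store);
            return this;
        }

        public void accept(StoreVisitor visitor) {
            for (Store store : delegates) {
                store.accept(visitor);
            }
        }
    }

    // Stand-in for "v:ConcreteVisitorType1": a made-up operation that
    // records which kinds of store it was dispatched to.
    public static final class NameCollector implements StoreVisitor {
        public final List<String> names = new ArrayList<>();

        public void visitFile(FileStore store) { names.add("file"); }
        public void visitS3(S3Store store) { names.add("s3"); }
    }

    public static void main(String[] args) {
        NameCollector collector = new NameCollector();
        new Federated().add(new FileStore()).add(new S3Store()).accept(collector);
        System.out.println(collector.names);
    }
}
```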

In the GoF "Design Patterns" book for the Visitor Pattern under
"Applicability" (page 333):
-  Bullet one says use the Visitor when "an object structure contains many
classes of objects with differing interfaces".  Shouldn't be the case here
- all the data store delegates should be able to be treated pretty much the
same.
-  Bullet two says use the Visitor when "many distinct and unrelated
operations need to be performed on an object structure, and you want to
avoid 'polluting' their classes with these operations."  I don't think this
applies either - the operations differ slightly in implementation but are
similar in purpose, and are not unrelated; we don't need to perform many
operations but rather select which one is right; we actually do want to
'pollute' their classes with the operations, because it is within those
classes where the logic to do the operation is contained.

Can you help me see what you had in mind?  I think I'm missing it.


[0] - http://www.ghytred.com/ShowArticle.aspx?VisitorPattern
[1] - http://www.ghytred.com/images/visitor2.jpg


-MR


Re: Intent to backport to 1.6: OAK-5641

2017-05-05 Thread Alex Parvulescu
Hi,

I'm not sure what the latest is here, but isn't this regarded as bad
practice?
Maintenance branches have always been subsets of the subsequent branch: 1.0
< 1.2 < 1.4 < 1.6 and so on? Any 1.6 release should contain *all* of 1.4
(give or take a week or two until the next stable release).
Has this changed in the meantime?


alex






On Fri, May 5, 2017 at 12:26 PM, Julian Reschke 
wrote:

> ...change was already applied to 1.4, so we really ought to have it in 1.6
> as well.
>


Re: when to close bugs

2017-05-05 Thread Alex Parvulescu
Hi,

This might be a side effect of the many maintenance branches we have. Based
on the current process, releasing any of them would mark the issue as
'closed', but I think it should be fine as long as the people doing the
releases don't lose track of these issues.
For example branch 1.2 might have 2 open issues which get closed by a 1.4
release, and if someone were to only check 'open issues' against the 1.2
branch, they'd be wrongly assuming that there is no reason to release 1.2
as there's nothing fixed.

Otherwise confusion wrt. release version dates can be easily cleared by
looking at the versions tab in JIRA, right? [0]

best,
alex

[0]
https://issues.apache.org/jira/browse/OAK/?selectedTab=com.atlassian.jira.jira-projects-plugin:versions-panel








On Fri, May 5, 2017 at 12:30 PM, Julian Reschke 
wrote:

> Hi there,
>
> we currently close bugs when a release is made which fixes the bug.
>
> This can lead to somewhat strange effects, if a stable release is made
> before a release from trunk happened.
>
> For instance, OAK-5641 was marked closed when 1.4.14 was released, but
> it's not actually fixed in any unstable release from trunk.
>
>
> Best regards, Julian
>


Expiring the META/repository-* markers used by MarkSweepGarbageCollector ?

2017-05-05 Thread Bertrand Delacretaz
Hi,

When a client connects only temporarily to a SharedS3DataStore (for
example) and then goes away, the META/repository-* marker created by
SegmentNodeStoreService is not removed.

This causes MarkSweepGarbageCollector to abort with a "not all
repositories have marked references available" message.

Do people see an issue with adding an expiration time to those
META/repository-* markers?

MarkSweepGarbageCollector can then ignore expired markers, considering
them to belong to long gone clients.

I suppose that the expiration time can be stored as data in the marker
blob, and it would have to be refreshed periodically by the client,
unless configured to never expire.
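A minimal sketch of what I have in mind, with made-up names: a plain map stands in for the META/repository-* records (repository id mapped to the expiration time read from the marker blob), and none of this is existing MarkSweepGarbageCollector code.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MarkerExpiry {
    // Sentinel expiration for markers configured to never expire.
    public static final Instant NEVER = Instant.MAX;

    // Returns the ids of repositories whose markers have not expired at
    // 'now'. The collector would require marked references only from these.
    public static List<String> liveRepositories(Map<String, Instant> markers, Instant now) {
        return markers.entrySet().stream()
                .filter(e -> e.getValue().isAfter(now))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // A connected client periodically pushes its expiration time forward.
    public static void refresh(Map<String, Instant> markers, String repositoryId,
                               Instant now, long ttlSeconds) {
        markers.put(repositoryId, now.plusSeconds(ttlSeconds));
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2017-05-05T12:00:00Z");
        Map<String, Instant> markers = new LinkedHashMap<>();
        markers.put("long-gone-client", now.minusSeconds(3600)); // expired
        markers.put("permanent", NEVER);                         // never expires
        refresh(markers, "active-client", now, 600);             // just refreshed
        System.out.println(liveRepositories(markers, now));
    }
}
```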

I can provide a patch for that but wanted to first check for any
issues that I overlooked, as I'm not familiar with that code.

What do people think?

-Bertrand


Re: new name for the multiplexing node store

2017-05-05 Thread Matt Ryan
I was wondering about this also WRT federated data store.  If the intent
and effect of both are the same ("both" meaning what is currently called
the "multiplexing node store" and the proposed (and in-progress) "federated
data store"), it seems they should use a similar naming convention at least.

WDYT?  Does that make it more confusing or less confusing?

-MR

On Fri, May 5, 2017 at 6:10 AM, Julian Sedding  wrote:

> Hi Tomek
>
> In all related discussions the term "mount" appears a lot. So why not
> Mounting NodeStore? The module could be "oak-store-mount".
>
> Regards
> Julian
>
>
> On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek  wrote:
> > Hello oak-dev,
> >
> > the multiplexing node store has been recently extracted from the
> > oak-core into a separate module and I’ve used it as an opportunity to
> > rename the thing. The name I suggested is Federated Node Store. Robert
> > doesn’t agree it’s the right name, mostly because the “partial” node
> > stores, creating the combined (multiplexing / federated) one, are not
> > usable on their own and store only a part of the overall repository
> > content.
> >
> > Our arguments in their full lengths can be found in the OAK-6136 (last
> > 3-4 comments), so there’s no need to repeat them here. We wanted to ask you
> > for opinion about the name. We kind of agree that the “multiplexing” is not
> > the best choice - can you suggest something else or maybe you think that
> > “federated” is good enough?
> >
> > Thanks for the feedback.
> >
> > Regards,
> > Tomek
> >
> > --
> > Tomek Rękawek | Adobe Research | www.adobe.com
> > reka...@adobe.com
> >
>


BUILD FAILURE: Jackrabbit Oak - Build # 259 - Failure

2017-05-05 Thread Apache Jenkins Server
The Apache Jenkins build system has built Jackrabbit Oak (build #259)

Status: Failure

Check console output at https://builds.apache.org/job/Jackrabbit%20Oak/259/ to 
view the results.

Changes:
[chetanm] OAK-6176 - Service to provide access to async indexer state

[chetanm] OAK-6176 - Service to provide access to async indexer state

-- Add a method to construct lastIndexedTo property
-- Add a method to check if given name is async name or not

[chetanm] OAK-6176 - Service to provide access to async indexer state

Expose the async indexer name as part of IndexStatsMBean interface itself
in addition to service property

 

Test results:
1 tests failed.
FAILED:  org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore2Datasources

Error Message:
Service of type interface org.apache.jackrabbit.oak.spi.state.NodeStore was found. Expression: (sr == null). Values: sr = [org.apache.jackrabbit.oak.spi.state.NodeStore, org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore, org.apache.jackrabbit.oak.spi.state.Clusterable]

Stack Trace:
java.lang.AssertionError: Service of type interface org.apache.jackrabbit.oak.spi.state.NodeStore was found. Expression: (sr == null). Values: sr = [org.apache.jackrabbit.oak.spi.state.NodeStore, org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore, org.apache.jackrabbit.oak.spi.state.Clusterable]
at org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore2Datasources(DocumentNodeStoreConfigTest.groovy:110)

Re: new name for the multiplexing node store

2017-05-05 Thread Julian Sedding
Hi Tomek

In all related discussions the term "mount" appears a lot. So why not
Mounting NodeStore? The module could be "oak-store-mount".

Regards
Julian


On Fri, May 5, 2017 at 1:39 PM, Tomek Rekawek  wrote:
> Hello oak-dev,
>
> the multiplexing node store has been recently extracted from the oak-core 
> into a separate module and I’ve used it as an opportunity to rename the 
> thing. The name I suggested is Federated Node Store. Robert doesn’t agree 
> it’s the right name, mostly because the “partial” node stores, creating the 
> combined (multiplexing / federated) one, are not usable on their own and 
> store only a part of the overall repository content.
>
> Our arguments in their full lengths can be found in the OAK-6136 (last 3-4 
> comments), so there’s no need to repeat them here. We wanted to ask you for 
> opinion about the name. We kind of agree that the “multiplexing” is not the 
> best choice - can you suggest something else or maybe you think that 
> “federated” is good enough?
>
> Thanks for the feedback.
>
> Regards,
> Tomek
>
> --
> Tomek Rękawek | Adobe Research | www.adobe.com
> reka...@adobe.com
>


new name for the multiplexing node store

2017-05-05 Thread Tomek Rekawek
Hello oak-dev,

the multiplexing node store has been recently extracted from the oak-core into 
a separate module and I’ve used it as an opportunity to rename the thing. The 
name I suggested is Federated Node Store. Robert doesn’t agree it’s the right 
name, mostly because the “partial” node stores, creating the combined 
(multiplexing / federated) one, are not usable on their own and store only a 
part of the overall repository content.

Our arguments in their full lengths can be found in the OAK-6136 (last 3-4 
comments), so there’s no need to repeat them here. We wanted to ask you for 
opinion about the name. We kind of agree that the “multiplexing” is not the 
best choice - can you suggest something else or maybe you think that 
“federated” is good enough?

Thanks for the feedback.

Regards,
Tomek

-- 
Tomek Rękawek | Adobe Research | www.adobe.com
reka...@adobe.com





when to close bugs

2017-05-05 Thread Julian Reschke

Hi there,

we currently close bugs when a release is made which fixes the bug.

This can lead to somewhat strange effects, if a stable release is made 
before a release from trunk happened.


For instance, OAK-5641 was marked closed when 1.4.14 was released, but 
it's not actually fixed in any unstable release from trunk.



Best regards, Julian


Intent to backport to 1.6: OAK-5641

2017-05-05 Thread Julian Reschke
...change was already applied to 1.4, so we really ought to have it in 
1.6 as well.


Intent to backport to 1.6/1.4/1.2/1.0: OAK-5667

2017-05-05 Thread Julian Reschke

https://issues.apache.org/jira/browse/OAK-5667

(dead code)


Re: MongoMK failover behaviour.

2017-05-05 Thread Stefan Egli
Hi,

On 04/05/17 16:56, "Justin Edelson"  wrote:

>>Hmm, depending on the Oak version, this may also be caused by OAK-5528.
>> The current fix versions are 1.4.15 and 1.6.0.
>>
>
>Would this show up in thread dumps? Based on the description, it seems like
>it should.

Not necessarily. In OAK-5528 the lease update thread goes into
performLeaseCheck which will do a 5x1sec retry loop. So if the thread dump
is taken during that time one would see it - if taken afterwards not.
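The general shape of such a bounded retry loop is sketched below to show why the timing of the thread dump matters.  This is not the Oak code: the class and method names are made up, and whether the real loop sleeps after the last attempt is not something this sketch claims to reproduce.

```java
import java.util.function.BooleanSupplier;

public class BoundedRetry {
    // Tries 'check' up to 'attempts' times, sleeping 'delayMillis' between
    // failed attempts. A thread dump taken while the thread sleeps here shows
    // it inside this method; once the loop has exited, it no longer does.
    public static boolean retry(BooleanSupplier check, int attempts, long delayMillis) {
        for (int i = 1; i <= attempts; i++) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (i < attempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // A "5x1sec" loop would be retry(check, 5, 1000); a short delay is
        // used here so that the example finishes quickly.
        boolean ok = retry(() -> ++calls[0] == 3, 5, 10);
        System.out.println(ok + " after " + calls[0] + " attempts");
    }
}
```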

Cheers,
Stefan




[ANNOUNCE] Apache Jackrabbit 2.15.2 released

2017-05-05 Thread Julian Reschke

The Apache Jackrabbit community is pleased to announce the release of
Apache Jackrabbit 2.15.2. The release is available for download at:

https://jackrabbit.apache.org/jcr/downloads.html#v2.15

See the full release notes below for details about this release:

Release Notes -- Apache Jackrabbit -- Version 2.15.2

Introduction


This is Apache Jackrabbit(TM) 2.15.2, a fully compliant implementation of the
Content Repository for Java(TM) Technology API, version 2.0 (JCR 2.0) as
specified in the Java Specification Request 283 (JSR 283).

Apache Jackrabbit 2.15.2 is an unstable release cut directly from
Jackrabbit trunk, with a focus on new features and other
improvements. For production use we recommend the latest stable 2.14.x
release.

Changes in Jackrabbit 2.15.2


Bug

[JCR-4118] - RepositoryChecker creates invalid node names
[JCR-4121] - ConcurrentModificationException in InternalVersionHistoryImpl.fixLegacy()
[JCR-4133] - fix javadoc problems that are errors with JDK8

Task

[JCR-4112] - Require Java 8
[JCR-4119] - Upgrade httpcomponents/httpmime to 4.5.3
[JCR-4122] - align parent pom references with Oak
[JCR-4127] - update to latest apache parent pom (18)
[JCR-4128] - update maven plugins and require Maven 3.2.1
[JCR-4129] - get rid of unused org.json dependency

In addition to the above-mentioned changes, this release contains
all the changes included up to the Apache Jackrabbit 2.14.x release.

For more detailed information about all the changes in this and other
Jackrabbit releases, please see the Jackrabbit issue tracker at

https://issues.apache.org/jira/browse/JCR

Release Contents


This release consists of a single source archive packaged as a zip file.
The archive can be unpacked with the jar tool from your JDK installation.
See the README.txt file for instructions on how to build this release.

The source archive is accompanied by SHA1 and MD5 checksums and a PGP
signature that you can use to verify the authenticity of your download.
The public key used for the PGP signature can be found at
https://svn.apache.org/repos/asf/jackrabbit/dist/KEYS.

About Apache Jackrabbit
---

Apache Jackrabbit is a fully conforming implementation of the Content
Repository for Java Technology API (JCR). A content repository is a
hierarchical content store with support for structured and unstructured
content, full text search, versioning, transactions, observation, and
more.

For more information, visit http://jackrabbit.apache.org/

About The Apache Software Foundation


Established in 1999, The Apache Software Foundation provides organizational,
legal, and financial support for more than 140 freely-available,
collaboratively-developed Open Source projects. The pragmatic Apache License
enables individual and commercial users to easily deploy Apache software;
the Foundation's intellectual property framework limits the legal exposure
of its 3,800+ contributors.

For more information, visit http://www.apache.org/

Trademarks
--

Apache Jackrabbit, Jackrabbit, Apache, the Apache feather logo, and the Apache
Jackrabbit project logo are trademarks of The Apache Software Foundation.