Dear users of Apache Jackrabbit Oak,
Java 8 will be needed for future versions of Jackrabbit Oak 1.4 and 1.6,
starting with Apache Jackrabbit Oak 1.4.27 and 1.6.21.
For details, see https://issues.apache.org/jira/browse/OAK-9294
This change is needed in order to allow upgrading to Apache Solr
Hi,
I intend to backport the fix for OAK-9184 to the 1.8 and 1.22 branches. The
risk should be very limited.
Let me know if you have any concerns.
Regards,
Thomas
https://issues.apache.org/jira/browse/OAK-9184
e also
https://jackrabbit.apache.org/oak/docs/roadmap.html
We shouldn't do any backports to that branch anymore.
Regards
Marcel
On 05.06.20, 17:30, "Thomas Mueller" wrote:
Hi,
I intend to backport the fix for OAK-9065 to the 1.8, 1.10, and 1.22
branches.
Hi,
I intend to backport the fix for OAK-9065 to the 1.8, 1.10, and 1.22 branches.
The risk should be very limited.
Let me know if you have any concerns.
Regards,
Thomas
https://issues.apache.org/jira/browse/OAK-9065
Hi,
We are not sure yet if the property was 600 million characters long. It might
have been only 1 million. Sure, we need to investigate this and log an issue
about this. But we need a generic solution.
I think we should log a warning for strings larger than 100'000 characters. At
some point
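A minimal sketch of such a warning, assuming a hypothetical helper class (LargeStringCheck is not an existing Oak class) and plain java.util.logging; the threshold is the 100'000 characters suggested above:

```java
import java.util.logging.Logger;

// Hypothetical helper, not part of Oak: warns when a string value is very large.
final class LargeStringCheck {
    private static final Logger LOG =
            Logger.getLogger(LargeStringCheck.class.getName());

    // Threshold taken from the suggestion above (100'000 characters).
    static final int WARN_THRESHOLD = 100_000;

    // Logs a warning if the value is longer than the threshold.
    // Returns true if a warning was logged, false otherwise.
    static boolean warnIfLarge(String propertyName, String value) {
        if (value != null && value.length() > WARN_THRESHOLD) {
            LOG.warning("Property " + propertyName
                    + " has a very large string value: "
                    + value.length() + " characters");
            return true;
        }
        return false;
    }
}
```

Where exactly such a check would be called in Oak is left open here.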
Hi,
Yes, this definitely looks like a bug... Could you file a Jira issue please?
Regards,
Thomas
On 22.03.19, 13:12, "Vikas Saurabh" wrote:
That sounds like a bug to me. Would love to hear Thomas Mueller's thoughts
too though.
--Vikas
(sent from mobile)
On Fri
Hi,
> Wouldn't it make sense to introduce a query option ala [1] to disable
> read/memory limits for one particular query?
It's possible, but my fear is that people would use the option in their queries
too often...
> OAK-6875 does not always have the desired effect (for sure there is some
>
Hi,
I think we should discuss this. Right now, we use some of the beans like
"global static" singletons. This might be a mistake, but it's like that. Now by
introducing a second bean, this "contract" breaks. It's kind of like breaking
backward compatibility...
Regards,
Thomas
On
Hi
I would like to backport https://issues.apache.org/jira/browse/OAK-7437.
Please let me know if you have any concern/objection.
Regards,
Thomas
+1
On 04.04.18, 10:23, "Tommaso Teofili" wrote:
Hi all,
In the context of creating an (abstract) implementation for Oak full text
indexes [1], I'd like to create a new module called _oak-search_.
Such module will contain:
- implementation
I want to backport OAK-7152 to all maintenance branches. The fix is simple and
low risk.
Regards,
Thomas
+1
On 15.01.18, 09:47, "Marcel Reutegger" wrote:
Hi,
I will backport OAK-7152 to all maintenance branches. The fix is trivial
and very low risk because the method currently simply does not return.
Regards
Marcel
Hi,
> Upgrading lucene to version 6 would probably warrant using 2.0, but that's
> not ready yet for 1.8?
No, it's not yet ready for 1.8.
Regards,
Thomas
I vote for 1.8. I don't see any big changes that would justify version 2.0. The
modularization (moving code around) is an ongoing process, I don't think this
is "fixed", and shouldn't have a big impact on users.
> Please vote on releasing this package as Apache Jackrabbit Oak 1.6.4.
+1
Thomas
eview. I’ve been going through
the suggested readings and will continue to do so.
Some comments inline below.
On August 15, 2017 at 12:25:54 AM, Thomas Mueller
(muel...@adobe.com.invalid)
wrote:
Hi,
It is important to understand which operations a
/index.html
* https://en.wikipedia.org/wiki/Content-addressable_storage
* https://en.wikipedia.org/wiki/Log-structured_merge-tree
Regards,
Thomas
On 15.08.17, 08:00, "Thomas Mueller" <muel...@adobe.com> wrote:
Hi,
I read your wiki update, and this caught my eye:
Hi,
I read your wiki update, and this caught my eye:
> If a match is found, the write is treated as an update; if no match is
> found, the write is treated as a create.
In the DataStore, there is no such thing as an update. There are only the
following operations:
* write
* read
* delete,
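To illustrate why there is no "update" in a content-addressed store, here is a toy in-memory sketch. It is not Oak's DataStore API; the class name, the method names, and the choice of SHA-256 are assumptions for illustration only. Because the identifier is derived from the content, changed content simply gets a new identifier:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Toy in-memory content-addressed store; illustrative only, not Oak's API.
final class ToyContentStore {
    private final Map<String, byte[]> records = new HashMap<>();

    // write: the identifier is the hash of the content, so writing the same
    // bytes twice returns the same identifier and keeps a single record.
    String write(byte[] data) {
        String id = hex(digest(data));
        records.putIfAbsent(id, data.clone());
        return id;
    }

    // read: returns a copy of the stored bytes, or null if the id is unknown.
    byte[] read(String id) {
        byte[] data = records.get(id);
        return data == null ? null : data.clone();
    }

    // delete: removes the record; returns true if it existed.
    boolean delete(String id) {
        return records.remove(id) != null;
    }

    // Number of distinct records currently stored.
    int size() {
        return records.size();
    }

    private static byte[] digest(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Writing modified content is just another write that yields a different identifier; the old record stays until it is deleted.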
+1
On 12.07.17, 06:29, "Chetan Mehrotra" wrote:
OAK-5899
+1
On 13.07.17, 13:22, "Julian Reschke" wrote:
https://issues.apache.org/jira/browse/OAK-5827
"Don't use SHA-1 for new DataStore binaries"
(security related and in trunk since March)
Best regards, Julian
Hi,
OAK-6359 prevents complex queries from using 100% CPU (in an almost
endless loop) / eventually running out of memory. The fix is simple and already
tested in trunk. There is a feature flag that allows switching to the old
behaviour.
Regards,
Thomas
Hi,
On 10.07.17, 11:18, "Bertrand Delacretaz" wrote:
> Throw an exception maybe? BinaryNotAvailableAtThisTime, including an
> ETA for availability. The application can then decide how to handle
>that.
Bertrand, this is exactly what I have suggested in two
Hi,
> (a) the implementation of an automatism is not *quite* what they need/want
> (b) they want to be able to manually select (or more likely override)
> whether a file can be archived
Well, behind the scenes, we anyway need a way to move entries to / from cold
storage. But in my view,
+1 Release this package as Apache Jackrabbit Oak 1.7.3
Hi,
> a property on the node, e.g. "archiveState=toArchive"
I wonder if we _can_ easily write to the version store? Also, some nodetypes
don't allow such properties? It might need to be a hidden property, but then
you can't use the JCR API. Or maintain this data in a "shadow" structure (not
> From my perspective as an Oak user I would like to have control on that.
> It would be nice for Oak to make *suggestions* about moving things to
> cold storage, but there might be application constraints that need to
> be accounted for.
That sounds reasonable. What would be the "API" for this?
Hi,
I guess you talk about Amazon Glacier. Did you know about "Expedited
retrievals" by the way?
https://aws.amazon.com/about-aws/whats-new/2016/11/access-your-amazon-glacier-data-in-minutes-with-new-retrieval-options/
- it looks like it's more than just "slow" + "fast".
About deciding which
Hi,
Right now, there is only one nodetype index. So, if you add a nodetype / mixin
to that index (as you know the list of nodetypes / mixins is a multi-valued
property), then you need to reindex that index. Which needs to read all the
nodes.
The alternative would be to have multiple nodetype
Hi,
I'd like to backport OAK-6391 to the maintenance branches. The query result
getSize() method is often used, and it is important that the result is as
accurate as possible (even though the spec allows returning -1).
Regards,
Thomas
+1
On 08.06.17, 11:29, "Tommaso Teofili" wrote:
Hi all,
I'd like to backport the fix for a bug in LMSEstimator [1] (LMSEstimator is
used by oak-solr-core to estimate the no. of entries in the index without
issuing a query to Solr) until branch 1.2
Ah, same as Alex!
On 06.06.17, 18:06, "Alex Parvulescu" wrote:
[X] +1 Release this package as Apache Jackrabbit Oak 1.7.1
I had a transient error on
'ActiveDeletedBlobCollectorTest.multiThreadedCommits:230' but it went away
on the second
FYI I got a test failure in oak-lucene,
ActiveDeletedBlobCollectorTest.multiThreadedCommits, see comment in OAK-2808.
This is not a vote. I guess it happens just on my machine, no need to
block the release.
+1 (fixing tests is always good)
On 31.05.17, 12:01, "Julian Reschke" wrote:
https://issues.apache.org/jira/browse/OAK-5612
(test case improvement)
+1
On 27.04.17, 11:29, "Julian Reschke" wrote:
https://issues.apache.org/jira/browse/OAK-5652
(test dependency)
Severity:
* Affects generated queries (generated using a query builder tool).
* The workaround is to _not_ use path restrictions in the query, which slows
down the query.
* Only affects new queries.
* Failure is during the parse phase.
The risk of fixing:
* The change is limited to "union"
Hi,
> I would prefer to stay aligned with Maven boundaries as much as possible
> as this simplifies bug reporting for parties not deeply involved with
> Oak very much.
Actually, I don't think that's a problem. I wouldn't expect such a person to
specify any module (logical or maven).
> Most
Hi,
I think the main question is, what do we use the Jira component for. Right now,
I don't use it. Do we want to use it for statistics, or to be able to "monitor"
or "group" issues by group? Depending on that, we can use "Maven" module
boundaries, or "Logical" module boundaries. For example,
Hi,
Yes, it's safe to disable. Actually it's a good idea to disable, or at least
change so that only few nodetypes are indexed (for example
oak:QueryIndexDefinition nodes, or other config nodes).
Regards,
Thomas
On 29.03.17, 20:00, "Alex Benenson" wrote:
Hi all
Could you post the index definition please?
From: Ancona Francesco
Reply-To: "oak-dev@jackrabbit.apache.org"
Date: Thursday, 23 March 2017 at 15:19
To: "oak-dev@jackrabbit.apache.org"
Cc: Diquigiovanni
Hi,
>So we can implement a "paginated tree traversal"
Yes, I think that's a first step, something for oak-core which can be
re-used in multiple places. It might make sense to also create a JCR
version, for other use cases.
Regards,
Thomas
nyone to create a
pair of PDFs that hash to the same SHA-1 sum given two distinct images
with some pre-conditions."
Regards,
Thomas
On 24/02/17 08:12, "Thomas Mueller" <muel...@adobe.com> wrote:
>Hi,
>
>A SHA-1 collision has been published:
>https://www.
Hi,
My suggestion is to _not_ support "resumable" operations on a large tree,
but instead don't use large operations. But I wouldn't call my solution
"sharding", but more "bit-by-bit reindexing". Some more details: For
indexing (especially synchronous property indexes) I suggest to do the
Hi,
A SHA-1 collision has been published:
https://www.schneier.com/blog/archives/2017/02/sha-1_collision.html
https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
Our FileDataStore and S3DataStore use SHA-1. For new binaries, we should use
(for example) SHA-256.
Right
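As a rough illustration of switching the digest algorithm, the following sketch computes a hex content digest of a stream with the JDK's MessageDigest. The class and method names are hypothetical and this is not the FileDataStore or S3DataStore code; only the algorithm names "SHA-1" and "SHA-256" are standard JDK identifiers:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative helper (hypothetical name): hex digest of a binary stream.
final class BinaryDigest {

    // Reads the stream to the end and returns the lowercase hex digest.
    // The algorithm is a JDK name such as "SHA-1" or "SHA-256".
    static String hexDigest(InputStream in, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest()) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A SHA-1 digest is 40 hex characters and a SHA-256 digest is 64, so identifiers created with the two algorithms can be told apart by length.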
Hi,
>I like Marcel's proposal for "enforcing" use of mixin on parent node to
>indicate that it can have a child node of 'oak:index'. So we can
>leverage mixin 'mix:indexable' (OAK-3725) to mark such parent nodes
>(like root) and IndexUpdate would only look for 'oak:index' node if
>current node has
Hi,
For "oak:index" of type oak:QueryIndexDefinition, what about a hidden
property ":hasOakIndex" = true. That would be
NodeBuilder.hasProperty(":hasOakIndex").
Regards,
Thomas
On 22/02/17 12:57, "Chetan Mehrotra" wrote:
>We have some CommitEditors in Oak which
Hi,
>No I actually meant getting individual time-out values (or a scaling
>factor for time-outs) from CIHelper. That class already provides the
>means to skip tests based on where they are running. So it should be
>relatively straight forward to have it supply scaling factors for
>time-outs in a
Hi,
I assume with (b) you mean: change tests to use loops, combined with very
high timeouts. Example:
Before:
    save();
    Thread.sleep(1000);
    assertTrue(abc());
After:
    save();
    for (int i = 0; !abc() && i < 600; i++) {
        Thread.sleep(100);
    }
    assertTrue(abc());
The
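The polling pattern from the example could be factored into a small reusable helper. This is only a sketch; the class name TestWait and the method waitFor are hypothetical and not existing Oak test utilities:

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper: poll a condition instead of sleeping a fixed time.
final class TestWait {

    // Checks the condition every pollMillis until it becomes true or
    // timeoutMillis has elapsed. Returns the final state of the condition.
    static boolean waitFor(BooleanSupplier condition,
                           long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                // Timed out: check one last time before giving up.
                return condition.getAsBoolean();
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
        return true;
    }
}
```

With this, the "After" variant would become assertTrue(TestWait.waitFor(() -> abc(), 60000, 100)): on a fast machine the test returns as soon as the condition holds, and on a slow machine it can take up to the full timeout.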
Hi,
For re-indexing, there are two problems actually:
* Indexing can take multiple days, so resume would be nice
* For synchronous indexes, indexing creates a large commit, which is
problematic (especially for MongoDB)
To solve both problems ("kill two birds with one stone"), we could instead
try
[X] +1 Release this package as Apache Jackrabbit Oak 1.6.0
Regards,
Thomas
Hi,
I think within a major version of Oak (1.4.x, 1.6.x), there should be no
backward-incompatible data format changes.
If there are changes, then trying to start with an old version (1.2.x)
should fail. It might be possible to open the repository in read-only
mode; for that, then a "read" and a
Hi,
>And option #4 - donate some computing capacity to run some dedicated
>Jenkins slaves for Oak.
I don't think it's a hardware problem. The problem seems to be turnaround
times from the Apache infra *team*: they seem to be overloaded. It's not
just with Jenkins, see for example:
[X] +1 Release this package as Apache Jackrabbit Oak 1.2.21
Hi,
There are two "extreme" cases, and both are used and work fine (please
nobody says "it's a joke", and "monolithic" is worse):
* "Monolithic": Linux, Apache Lucene, and so on: one version for everything
* "Fine grained": Apache Sling: separate, independent versions for
everything
(actually
>
>and using a different release
>cycle for oak-segment-tar is not a problem.
Sorry, I wanted to write "and using a different release cycle for
oak-segment-tar *created new problems*"
nto
>their own independently released bundles. We should split oak-run in
>different CLI utility modules, so that every implementation can take
>better care of their own utilities. Oak is not a pet project and we
>have to admit that its current level of complexity doesn't allow us to
>use
Hi,
>The release process in Oak is a joke.
I don't think it's a joke.
> Releasing every two weeks by
>using version numbers as counters just for the sake of it is
>embarrassing.
Why? It's simple.
> I don't even know how many releases of our parent POM we
>have, every one of them equal to the
Hi,
> could adding an oak-core-api with independent lifecycle solve the
>situation?
"All problems in computer science can be solved by another level of
indirection"
I would prefer if we get oak-segment-tar in line with the rest of oak
(release it at the same time and so on). I understand, there
Hi,
I would prefer C-T-R (commit, then review), because it reduces
bureaucracy. Except for changes just before a major release (when there is
little time to undo or change things).
+1 to [REVIEW] emails. In my view, this should include new configuration
and new features. Basically all
eryIndexDefinition on propertyNames
>jcr:primaryType and declaringNodeTypes rep:ACL solved the issue.
>I presumed there already was an index for all the existing
>jcr:primaryTypes :), not that you have to specifically have them in the
>declaringNodeTypes
>
>Thanks!
>Roy
>> On 18 Oct 2016,
Hi,
> I really don't see the reason why this could be such a hard query
Who said it's a hard query? :-)
Is the problem performance, or is the problem that you get an exception?
If the problem is performance, then you need an index on the node type
rep:ACL.
If the problem is the exception:
Hi,
This is an old problem, but never solved. See OAK-1150.
Regards,
Thomas
On 17/10/16 16:08, "Chetan Mehrotra" wrote:
>Hi Team,
>
>While doing some benchmarks I realized that default setup is
>configured to index *all* nodetypes. In InitialContent the nodetype
Hi,
>
>Currently I am under the impression that we have no knowledge of what
>*might* break, with varying opinions on the matter. Maybe we should
>find out what *does* break.
I don't think it's possible to easily find out. Customer code might expect
the current behavior, and might silently
Hi,
I agree with Julian, I think making nt:resource unreferenceable
(hardcoding some "magic" in Oak) would lead to hard-to-find bugs and
problems.
> So whatever solution we pick, there is a risk that existing code fails.
Yes. But I think if we create a new nodetype, at least it would be
[X] +1 Release this package as Apache Jackrabbit Oak 1.2.20
Hi,
Sorry typo in "type", wanted to write "typo":
>I thought even in Jackrabbit 2.x, the "test" was assumed to be a type and
>automatically converted to "@test"...
Should read:
I thought even in Jackrabbit 2.x, the "test" was assumed to be a typo ...
Regards,
Thomas
[X] +1 Release this package as Apache Jackrabbit Oak 1.5.12
Hi,
I thought even in Jackrabbit 2.x, the "test" was assumed to be a type and
automatically converted to "@test"... Maybe I'm wrong.
What should work (for both Jackrabbit 2.x and Oak) is using
"test/@jcr:primaryType" instead of "test". So:
/jcr:root//*[test/@jcr:primaryType]
Hi,
Possibly the binary is downloaded from S3 in this case. We have seen
similar performance issues with datastore GC when using the S3 datastore.
It should be possible to verify this with full thread dumps. Plus we would
see where exactly the download occurs. Maybe it is checking the length or
Hi,
>I agree it conflicts conceptually with MVCC. However: is there an actual
>problem with the auto-refresh behaviour?
Yes. For example with queries. If changes are made while iterating over
the result of a query, the current behavior is problematic. Example code
(simplified):
RowIterator
Hi,
I'm sorry this feature is not available. You would need to set a property
"childCount" explicitly.
Could you explain the use case please?
Regards,
Thomas
On 20/07/16 15:48, "Milan Milanov" wrote:
>Hello there,
>
>I'm trying to order some nodes by how many child nodes
Hi,
> I still don't believe that Oak is the right place to implement these
>solutions.
What would be the right place then? The Oak user can store the path of the
file as a string, but he would lose some features (garbage collection for
example).
>Every use case you outlined requires Oak to
Hi,
I would keep the "oak-segment-*" name, so that it's clear what it is based
on. So:
-1 oak-local-store
-1 oak-embedded-store
+1 oak-segment-*
Within the oak-segment-* options, I don't have a preference.
Regards,
Thomas
On 25/04/16 16:46, "Michael Dürig" wrote:
>
+1 Release this package as Apache Jackrabbit Oak 1.4.0
Hi,
Sure, there is a performance advantage (for both the persistent cache and
the Lucene index cache). But how much exactly depends on the use case.
You forgot the "persistent cache" by the way.
When restoring, you need to ensure that the local cache is not newer than
the remote (MongoDB), and
Hi,
I'm not in favour of this, as it breaks links, and I don't see a clear
improvement. I'm more in favour of incremental, small changes.
> an easier way to add/update documentation about oak specific features.
Sorry I don't understand, what is the problem with the current approach?
Is the menu
Hi,
I also always deploy the whole site with maven.
Regards,
Thomas
On 20/01/16 10:16, "Davide Giannella" <dav...@apache.org> wrote:
>On 20/01/2016 08:14, Thomas Mueller wrote:
>> ...
>>> This means that if we add a feature over there we have to update
Hi,
Could we get rid of unused stuff? Like Hadoop (7 MB!). Do we need Solr
(2.3 MB), Tika, Zookeeper, Jetty, H2 (the SQL part)? Do we need the
Jackrabbit remoting stuff? I guess we need Groovy (4 MB) and Lucene (4 MB).
Of those 50MB, just 8% is Oak, and the rest is dependencies.
Regards,
Hi,
I think the main difference between Oak and Sling is, AFAIK, that Sling is
"forward only", and does not maintain branches, and does not backport
things.
In Oak, we add new features in trunk (changing the API), and backport some
of those features, and not necessarily all of them, and not
"Out of heap space"
On 27/11/15 18:54, "Travis CI" wrote:
>Build Update for apache/jackrabbit-oak
>-
>
>Build: #6972
>Status: Broken
>
>Duration: 420 seconds
>Commit: 3f083e0134aca930ed44bdb5a19ccff9794aef1f (trunk)
>Author: Julian Reschke
Hi,
I will disable those tests. Even though the unit tests always worked for
me, I couldn't get UDP to work with two "real" repositories (two
processes).
Regards,
Thomas
On 24/11/15 13:33, "Francesco Mari" wrote:
>broadcastUDP
Hi,
I would say initializing the collections (or tables when using a
relational database) is not expected to be done concurrently. Maybe we can
somehow prevent that in Oak (patches are welcome!), or we document that
this is not supported.
Regards,
Thomas
On 26/10/15 18:36, "Navarro, Gabriela
OK, I think we (kind of) agree on how to ensure important indexes are
available.
>>Additionally, for "synchronous" indexes (property index and so on), I
>>would like to always create and reindex them asynchronously by default,
OK, I see that large branches are a problem.
Instead of using
Hi,
If an index provider is (temporarily) not available, the
MissingIndexProviderStrategy resets the index so it is re-indexed. This is a
problem (OAK-2024, OAK-2203, OAK-2429, OAK-3325, OAK-3366, OAK-3505, OAK-3512,
OAK-3513), because re-indexing is slow and runs in one transaction. It can also cause
Hi,
> some fix missing in the 1.2 branch?
You are right, it looks like the cause is OAK-3432. This is fixed in the
trunk but not in the branch, because I thought it doesn't affect the
branch (because the branch doesn't contain OAK-3234). But, now I see it
also affects the branch.
Regards,
Hi,
Is this a 32-bit or 64-bit JVM?
Could you try
ulimit -v unlimited
See
http://stackoverflow.com/questions/8892143/error-when-opening-a-lucene-index-map-failed
and possibly
http://stackoverflow.com/questions/11683850/how-much-memory-could-vm-use-in-linux
Regards,
Thomas
From:
ged by my property index. How could i fix this
>node count supposed to be stored into a ":count" property ?
>
>Regards
>
>2015-09-23 12:15 GMT+02:00 Thomas Mueller <muel...@adobe.com>:
>
>> Hi,
>>
>> I think you are hitting OAK-2852.
>>
>
Hi,
Yes, makes sense... what about getIndexCostInfo?
Regards,
Thomas
On 24/09/15 07:41, "Chetan Mehrotra" wrote:
>Hi Thomas,
>
>On Wed, Sep 23, 2015 at 6:51 PM, wrote:
>> /**
>> + * Get the index cost. The query must already be
Hi,
I think you are hitting OAK-2852.
Regards,
Thomas
On 23/09/15 11:42, "Thomas Mueller" <muel...@adobe.com> wrote:
>Hi,
>
>Do you have a node called /oak:index/counter ? Out of the box, it should
>be there (with a recent version of Oak). That is the approximate
Hi,
Do you have a node called /oak:index/counter ? Out of the box, it should
be there (with a recent version of Oak). That is the approximate counter
index that is used to estimate how many nodes to traverse. As a
workaround, you probably have to re-index that one manually. I wonder why
that
Hi,
I will change the documentation to "Oak does not index _as_much_ content
by default as does Jackrabbit 2".
Regards,
Thomas
On 21/09/15 10:11, "Michael Lemler" wrote:
>Oak
>does not index content by default as does Jackrabbit 2
Hi,
Which version of Oak do you use?
Could you get the estimated node count for the root node, and for this
index? To get that, for example use the NodeCounter JMX bean
(NodeCounterMBean), getEstimatedChildNodeCounts("/", 2) and
getEstimatedChildNodeCounts("/oak:index", 3).
Regards,
Thomas
On
Hi,
Could someone please update OAK-3169 with a link to the new issue, or the
resolution?
Regards,
Thomas
On 24/08/15 08:46, "Davide Giannella" wrote:
>On 22/08/2015 19:59, Manfred Baedke wrote:
>> Hi,
>>
>> OAK-3169 caused inconsistencies that currently have to be repaired
[X] +1 Release this package as Apache Jackrabbit Oak 1.3.5
On 01/09/15 15:49, "Julian Reschke" wrote:
>On 2015-09-01 15:30, Davide Giannella wrote:
>> ...
>> [ ] +1 Release this package as Apache Jackrabbit Oak 1.3.5
>> [ ] -1 Do not release this package
Hi,
I wonder what the team thinks about using final for variables and
parameters. In Oak, so far we didn't use it a lot. This question has come up
with OAK-3148. The patch uses final for variables, but not for parameters.
Lately, I have seen some code (I forgot where) that uses final for
Hi,
- I know that the variable "state" created at the beginning of the method
is the same one I can access at the end.
For short methods, it's easy to see this, by reading the code. In my view
easier than using final, which makes the code less readable.
For large methods,... well you should
Hi,
I thought about SegmentStore compaction and made a few slides:
http://www.slideshare.net/ThomasMueller12/multi-store-compaction
Feedback is welcome! The idea is at quite an early stage, so if you don't
understand or agree with some items, I'm to blame.
Regards,
Thomas
Hi,
AFAIK your tests re. restart have been done from within an OSGi
container (AEM) restarting the repository bundle.
Actually I restarted the JVM. But I probably didn't use the very latest
version of Oak, checkpoints may also have played a role in my case (there
were 2 checkpoints; probably
On 28.8.15 10:05 , Thomas Mueller wrote:
Hi,
I thought about SegmentStore compaction and made a few slides:
http://www.slideshare.net/ThomasMueller12/multi-store-compaction
Feedback is welcome! The idea is at quite an early stage, so if you
don't understand or agree with some items, I'm
Hi,
The DocumentStore doesn't really know the path, it only knows the key, and
if the key is hashed you can't calculate the path.
There are some options:
(a) Each document that has a hashed path as the key also has a path
property (with the real path). You could use that (cache it, read it if
Hi,
I have nothing against modularization, I'm just against modularization =
create many many Maven projects. I prefer modularization *within* one
project. Why can't we do that instead?
Ideally you have a "root" project, e.g.
/oak
/security
/api
/implementationA
/implementationB