[
https://issues.apache.org/jira/browse/JCRVLT-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719725#comment-17719725
]
Thomas Mueller commented on JCRVLT-683:
---
> Well in this case I don't think it is necess
[
https://issues.apache.org/jira/browse/JCRVLT-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719719#comment-17719719
]
Thomas Mueller edited comment on JCRVLT-683 at 5/5/23 7:49 AM:
---
> Ev
[
https://issues.apache.org/jira/browse/JCRVLT-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719719#comment-17719719
]
Thomas Mueller commented on JCRVLT-683:
---
> Every code relying on that bug can be easily fi
[
https://issues.apache.org/jira/browse/JCRVLT-496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290024#comment-17290024
]
Thomas Mueller commented on JCRVLT-496:
---
> Should index definitions be allowed by default or j
Dear users of Apache Jackrabbit Oak,
Java 8 will be needed for future versions of Jackrabbit Oak 1.4 and 1.6,
starting with Apache Jackrabbit Oak 1.4.27 and 1.6.21.
For details, see https://issues.apache.org/jira/browse/OAK-9294
This change is needed in order to allow upgrading to Apache Solr
Hi,
I intend to backport the fix for OAK-9184 to the 1.8 and 1.22 branches. The
risk should be very limited.
Let me know if you have any concerns.
Regards,
Thomas
https://issues.apache.org/jira/browse/OAK-9184
e also
https://jackrabbit.apache.org/oak/docs/roadmap.html
We shouldn't do any backports to that branch anymore.
Regards
Marcel
On 05.06.20, 17:30, "Thomas Mueller" wrote:
Hi,
I intend to backport the fix for OAK-9065 to the 1.8, 1.10, and 1.22
branches.
Hi,
I intend to backport the fix for OAK-9065 to the 1.8, 1.10, and 1.22 branches.
The risk should be very limited.
Let me know if you have any concerns.
Regards,
Thomas
https://issues.apache.org/jira/browse/OAK-9065
Hi,
We are not sure yet if the property was 600 million characters long. It might
have been only 1 million. Sure, we need to investigate this and log an issue
about this. But we need a generic solution.
I think we should log a warning for strings larger than 100'000 characters. At
some point
Hi,
Yes, this definitely looks like a bug... Could you file a Jira issue please?
Regards,
Thomas
On 22.03.19, 13:12, "Vikas Saurabh" wrote:
That sounds like a bug to me. Would love to hear Thomas Mueller's thoughts
too though.
--Vikas
(sent from mobile)
On Fri
Hi,
> Wouldn't it make sense to introduce a query option ala [1] to disable
> read/memory limits for one particular query?
It's possible, but my fear is that people would use the option in their queries
too often...
> OAK-6875 does not always have the desired effect (for sure there is some
>
Hi,
I think we should discuss this. Right now, we use some of the beans like
"global static" singletons. This might be a mistake, but it's like that. Now by
introducing a second bean, this "contract" breaks. It's kind of like breaking
backward compatibility...
Regards,
Thomas
On
Hi
I would like to backport https://issues.apache.org/jira/browse/OAK-7437.
Please let me know if you have any concern/objection.
Regards,
Thomas
Hi,
This is about Jackrabbit 2.x; I'm working on the Oak query + indexing
implementation so I don't have much insight there. Would you be interested in
using Oak?
However, I guess it would make sense to create Jira issues, and provide patches
and test cases, if you are interested in improving
+1
On 04.04.18, 10:23, "Tommaso Teofili" wrote:
Hi all,
In the context of creating an (abstract) implementation for Oak full text
indexes [1], I'd like to create a new module called _oak-search_.
Such module will contain:
- implementation
I want to backport OAK-7152 to all maintenance branches. The fix is simple and
low risk.
Regards,
Thomas
+1
On 15.01.18, 09:47, "Marcel Reutegger" wrote:
Hi,
I will backport OAK-7152 to all maintenance branches. The fix is trivial
and very low risk because the method currently simply does not return.
Regards
Marcel
Hi,
> Upgrading lucene to version 6 would probably warrant using 2.0, but that's
> not ready yet for 1.8?
No, it's not yet ready for 1.8.
Regards,
Thomas
I vote for 1.8. I don't see any big changes that would justify version 2.0. The
modularization (moving code around) is an ongoing process, I don't think this
is "fixed", and shouldn't have a big impact on users.
> Please vote on releasing this package as Apache Jackrabbit Oak 1.6.4.
+1
Thomas
eview. I’ve been going through
the suggested readings and will continue to do so.
Some comments inline below.
On August 15, 2017 at 12:25:54 AM, Thomas Mueller
(muel...@adobe.com.invalid)
wrote:
Hi,
It is important to understand which operations a
/index.html
* https://en.wikipedia.org/wiki/Content-addressable_storage
* https://en.wikipedia.org/wiki/Log-structured_merge-tree
Regards,
Thomas
On 15.08.17, 08:00, "Thomas Mueller" <muel...@adobe.com> wrote:
Hi,
I read your wiki update, and this caught my eye:
Hi,
I read your wiki update, and this caught my eye:
> If a match is found, the write is treated as an update; if no match is
> found, the write is treated as a create.
In the DataStore, there is no such thing as an update. There are only the
following operations:
* write
* read
* delete,
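The operations listed above can be illustrated with a small in-memory store. This is a minimal sketch only, not the Jackrabbit DataStore API: the class `InMemoryDataStore`, its method signatures, and the choice of SHA-256 for the identifier are assumptions made for illustration. It shows why there is no "update": the identifier is derived from the content, so writing the same bytes again yields the same record.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Jackrabbit or Oak.
public class InMemoryDataStore {

    private final Map<String, byte[]> records = new HashMap<>();

    /** Stores the content and returns its identifier, derived from the content itself. */
    public String write(byte[] content) {
        String id = identifierOf(content);
        // Writing identical content again maps to the same identifier; nothing is modified.
        records.putIfAbsent(id, content.clone());
        return id;
    }

    /** Returns a copy of the content, or null if the identifier is unknown. */
    public byte[] read(String id) {
        byte[] content = records.get(id);
        return content == null ? null : content.clone();
    }

    /** Removes the record; returns true if it existed. */
    public boolean delete(String id) {
        return records.remove(id) != null;
    }

    // Assumption: SHA-256 as the content hash, encoded as 64 hex characters.
    private static String identifierOf(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            return String.format("%064x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

Changed content always gets a different identifier, so a caller that wants to "update" a binary writes the new content and points to the new identifier.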
+1
On 12.07.17, 06:29, "Chetan Mehrotra" wrote:
OAK-5899
+1
On 13.07.17, 13:22, "Julian Reschke" wrote:
https://issues.apache.org/jira/browse/OAK-5827
"Don't use SHA-1 for new DataStore binaries"
(security related and in trunk since March)
Best regards, Julian
Hi,
OAK-6359 prevents complex queries from using 100% CPU (in an almost
endless loop) and eventually running out of memory. The fix is simple and already
tested in trunk. There is a feature flag that allows switching to the old
behaviour.
Regards,
Thomas
Hi,
On 10.07.17, 11:18, "Bertrand Delacretaz" wrote:
> Throw an exception maybe? BinaryNotAvailableAtThisTime, including an
> ETA for availability. The application can then decide how to handle
>that.
Bertrand, this is exactly what I have suggested in two
Hi,
> (a) the implementation of an automatism is not *quite* what they need/want
> (b) they want to be able to manually select (or more likely override)
whether a file can be archived
Well, behind the scenes, we anyway need a way to move entries to / from cold
storage. But in my view,
+1 Release this package as Apache Jackrabbit Oak 1.7.3
Hi,
> a property on the node, e.g. "archiveState=toArchive"
I wonder if we _can_ easily write to the version store? Also, some nodetypes
don't allow such properties? It might need to be a hidden property, but then
you can't use the JCR API. Or maintain this data in a "shadow" structure (not
> From my perspective as an Oak user I would like to have control on that.
> It would be nice for Oak to make *suggestions* about moving things to
> cold storage, but there might be application constraints that need to
> be accounted for.
That sounds reasonable. What would be the "API" for this?
Hi,
I guess you talk about Amazon Glacier. Did you know about "Expedited
retrievals" by the way?
https://aws.amazon.com/about-aws/whats-new/2016/11/access-your-amazon-glacier-data-in-minutes-with-new-retrieval-options/
- it looks like it's more than just "slow" + "fast".
About deciding which
Hi,
Right now, there is only one nodetype index. So, if you add a nodetype / mixin
to that index (as you know the lists of nodetypes / mixins is a multi-valued
property), then you need to reindex that index. Which needs to read all the
nodes.
The alternative would be to have multiple nodetype
Hi,
I'd like to backport OAK-6391 to the maintenance branches. The query result
getSize() method is often used, and it is important that the result is as
accurate as possible (even though the spec allows returning -1).
Regards,
Thomas
+1
On 08.06.17, 11:29, "Tommaso Teofili" wrote:
Hi all,
I'd like to backport the fix for a bug in LMSEstimator [1] (LMSEstimator is
used by oak-solr-core to estimate the no. of entries in the index without
issuing a query to Solr) until branch 1.2
Ah, same as Alex!
On 06.06.17, 18:06, "Alex Parvulescu" wrote:
[X] +1 Release this package as Apache Jackrabbit Oak 1.7.1
I had a transient error on
'ActiveDeletedBlobCollectorTest.multiThreadedCommits:230' but it went away
on the second
FYI I got a test failure in oak-lucene,
ActiveDeletedBlobCollectorTest.multiThreadedCommits, see comment in OAK-2808.
This is not a vote. I guess it happens only on my machine, no need to
block the release.
+1 (fixing tests is always good)
On 31.05.17, 12:01, "Julian Reschke" wrote:
https://issues.apache.org/jira/browse/OAK-5612
(test case improvement)
+1
On 27.04.17, 11:29, "Julian Reschke" wrote:
https://issues.apache.org/jira/browse/OAK-5652
(test dependency)
Severity:
* Affects generated queries (generated using a query builder tool).
* The workaround is to _not_ use path restrictions in the query, which slows
down the query.
* Only affects new queries.
* Failure is during the parse phase.
The risk of fixing:
* The change is limited to "union"
Hi,
> I would prefer to stay aligned with Maven boundaries as much as possible
> as this simplifies bug reporting for parties not deeply involved with
> Oak very much.
Actually, I don't think that's a problem. I wouldn't expect such a person to
specify any module (logical or maven).
> Most
Hi,
I think the main question is, what do we use the Jira component for. Right now,
I don't use it. Do we want to use it for statistics, or to be able to "monitor"
or "group" issues by group? Depending on that, we can use "Maven" module
boundaries, or "Logical" module boundaries. For example,
Hi,
Yes, it's safe to disable. Actually it's a good idea to disable, or at least
change so that only few nodetypes are indexed (for example
oak:QueryIndexDefinition nodes, or other config nodes).
Regards,
Thomas
On 29.03.17, 20:00, "Alex Benenson" wrote:
Hi all
Could you post the index definition please?
From: Ancona Francesco
Reply-To: "oak-dev@jackrabbit.apache.org"
Date: Thursday, 23 March 2017 at 15:19
To: "oak-dev@jackrabbit.apache.org"
Cc: Diquigiovanni
Hi,
> I think your help is mandatory, given the level of voodoo in the five lines
> you propose :-)
Sure, I can help.
> I did some preliminary tests with the "partial entropy" method … and it seems
> the algorithm works but it does not get as fast as the content type detection
> method.
<timothee.ma...@gmail.com>
Reply-To: "dev@jackrabbit.apache.org" <dev@jackrabbit.apache.org>
Date: Tuesday, 7 March 2017 at 14:28
To: "dev@jackrabbit.apache.org" <dev@jackrabbit.apache.org>
Subject: Re: [FileVault][discuss] performance improvement proposal
Hi Thomas,
2017-03-07 1
[
https://issues.apache.org/jira/browse/JCRVLT-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899314#comment-15899314
]
Thomas Mueller commented on JCRVLT-163:
---
I guess the vast majority of binary data is stored
Hi,
> As for configuration: What is the reason for having a configuration option ?
Detecting if data is compressible can be done with low overhead, without having
to look at the content type, and without having to use configuration options:
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Mueller updated JCR-4115:
Attachment: JCR-4115b.patch
JCR-4115b.patch: new patch, including changes to the test case
> Do
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882713#comment-15882713
]
Thomas Mueller commented on JCR-4115:
-
Patch for the test case, with generator function
{noformat
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882664#comment-15882664
]
Thomas Mueller commented on JCR-4115:
-
[~julian.resc...@gmx.de] just noticed
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882491#comment-15882491
]
Thomas Mueller commented on JCR-4115:
-
[~ajain] could you please review the patch?
> Don't use SH
Hi,
>So we can implement a "paginated tree traversal"
Yes, I think that's a first step, something for oak-core which can be
re-used in multiple places. It might make sense to also create a JCR
version, for other use cases.
Regards,
Thomas
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882257#comment-15882257
]
Thomas Mueller edited comment on JCR-4115 at 2/24/17 9:24 AM:
--
Not sure
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Mueller updated JCR-4115:
Attachment: JCR-4115.patch
JCR-4115.patch (patch for testing). Just jackrabbit-data.
> Don't use
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882288#comment-15882288
]
Thomas Mueller commented on JCR-4115:
-
We could also use the approach here to detect problematic
[
https://issues.apache.org/jira/browse/JCR-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882257#comment-15882257
]
Thomas Mueller commented on JCR-4115:
-
Not sure if AbstractDataStore.HmacSHA1 is also affected
Thomas Mueller created JCR-4115:
---
Summary: Don't use SHA-1 for new DataStore binaries (Jackrabbit)
Key: JCR-4115
URL: https://issues.apache.org/jira/browse/JCR-4115
Project: Jackrabbit Content
nyone to create a
pair of PDFs that hash to the same SHA-1 sum given two distinct images
with some pre-conditions."
Regards,
Thomas
On 24/02/17 08:12, "Thomas Mueller" <muel...@adobe.com> wrote:
>Hi,
>
>A SHA-1 collision has been published:
>https://www.
Hi,
My suggestion is to _not_ support "resumable" operations on a large tree,
but instead to avoid large operations. I wouldn't call my solution
"sharding", but rather "bit-by-bit reindexing". Some more details: For
indexing (especially synchronous property indexes) I suggest to do the
Hi,
A SHA-1 collision has been published:
https://www.schneier.com/blog/archives/2017/02/sha-1_collision.html
https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
Our FileDataStore and S3DataStore use SHA-1. For new binaries, we should use
(for example) SHA-256.
Right
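As a rough illustration of the suggestion to use SHA-256 for new binaries, the following sketch computes both digests with the standard `java.security.MessageDigest` API. The class `BinaryId` and the method `hexDigest` are hypothetical names, and this is not the FileDataStore or S3DataStore code. One visible difference is the identifier length: SHA-1 gives 40 hex characters, SHA-256 gives 64.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not part of Jackrabbit or Oak.
public class BinaryId {

    /** Returns the hex-encoded digest of the data for the given algorithm name. */
    public static String hexDigest(byte[] data, String algorithm) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            // Mask to an unsigned value and pad each byte to two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(hexDigest(data, "SHA-1"));   // 40 hex characters
        System.out.println(hexDigest(data, "SHA-256")); // 64 hex characters
    }
}
```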
Hi,
>I like Marcel's proposal for "enforcing" use of mixin on parent node to
>indicate that it can have a child node of 'oak:index'. So we can
>leverage mixin 'mix:indexable' (OAK-3725) to mark such parent nodes
>(like root) and IndexUpdate would only look for 'oak:index' node if
>current node has
Hi,
For "oak:index" of type oak:QueryIndexDefinition, what about a hidden
property ":hasOakIndex" = true. That would be
NodeBuilder.hasProperty(":hasOakIndex").
Regards,
Thomas
On 22/02/17 12:57, "Chetan Mehrotra" wrote:
>We have some CommitEditors in Oak which
Hi,
>No I actually meant getting individual time-out values (or a scaling
>factor for time-outs) from CIHelper. That class already provides the
>means to skip tests based on where they are running. So it should be
>relatively straight forward to have it supply scaling factors for
>time-outs in a
Hi,
I assume with (b) you mean: change tests to use loops, combined with very
high timeouts. Example:
Before:
save();
Thread.sleep(1000);
assertTrue(abc());
After:
save();
for (int i = 0; !abc() && i < 600; i++) {
    Thread.sleep(100);
}
assertTrue(abc());
The
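The loop in the "After" example can be factored into a small helper so that each test does not repeat it. This is a minimal sketch only: the class `WaitUtil`, the method `waitFor`, and its parameters are hypothetical and not part of Oak.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper; not part of Oak.
public class WaitUtil {

    /**
     * Polls the condition until it is true or the timeout elapses.
     * Returns the last observed value of the condition.
     */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                // Timeout reached: check one last time and report the result.
                return condition.getAsBoolean();
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
        return true;
    }
}
```

With such a helper, the "After" example would read `assertTrue(WaitUtil.waitFor(() -> abc(), 60_000, 100));`. The test returns as soon as the condition holds, and the high timeout only costs time when the test is going to fail anyway.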
Hi,
For re-indexing, there are two problems actually:
* Indexing can take multiple days, so resume would be nice
* For synchronous indexes, indexing creates a large commit, which is
problematic (especially for MongoDB)
To solve both problems ("kill two birds with one stone"), we could instead
try
[X] +1 Release this package as Apache Jackrabbit Oak 1.6.0
Regards,
Thomas
> it might be that the source dist didn't include all the files (thanks Tom for
> the hint.)
Ah sorry you saw my mail... I thought you might have missed it.
From: Thomas Mueller <muel...@adobe.com<mailto:muel...@adobe.com>>
Date: Thursday 19 January 2017 at 11:23
To: "de
Hi,
Also, I believe the .content.xml files are missing in the zip file, for example:
jackrabbit-filevault-3.1.34/vault-core/src/test/resources/org/apache/jackrabbit/vault/packaging/integration/testpackages/mode_ac_test_a.zip/jcr_root/.content.xml
Regards,
Thomas
From: Tobias Bocanegra
Hi,
I think within a major version of Oak (1.4.x, 1.6.x), there should be no
backward-incompatible data format changes.
If there are changes, then trying to start with an old version (1.2.x)
should fail. It might be possible to open the repository in read-only
mode; for that, then a "read" and a
Hi,
>And option #4 - donate some computing capacity to run some dedicated
>Jenkins slaves for Oak.
I don't think it's a hardware problem. The problem seems to be turnaround
times from the Apache infra *team*: they seem to be overloaded. It's not
just with Jenkins, see for example:
[X] +1 Release this package as Apache Jackrabbit Oak 1.2.21
[
https://issues.apache.org/jira/browse/JCR-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Mueller updated JCR-4042:
Description:
Currently GQL does not have a escape character. Adding the escape character
will help
Hi,
There are two "extreme" cases, and both are used and work fine (please
nobody say "it's a joke", and "monolithic" is worse):
* "Monolithic": Linux, Apache Lucene, and so on: one version for everything
* "Fine grained": Apache Sling: separate, independent versions for
everything
(actually
>
>and using a different release
>cycle for oak-segment-tar is not a problem.
Sorry, I wanted to write "and using a different release cycle for
oak-segment-tar *created new problems*"
nto
>their own independently released bundles. We should split oak-run in
>different CLI utility modules, so that every implementation can take
>better care of their own utilities. Oak is not a pet project and we
>have to admit that its current level of complexity doesn't allow us to
>use
Hi,
>The release process in Oak is a joke.
I don't think it's a joke.
> Releasing every two weeks by
>using version numbers as counters just for the sake of it is
>embarrassing.
Why? It's simple.
> I don't even know how many releases of our parent POM we
>have, every one of them equal to the
Hi,
> could adding an oak-core-api with independent lifecycle solve the
>situation?
"All problems in computer science can be solved by another level of
indirection"
I would prefer if we get oak-segment-tar in line with the rest of oak
(release it at the same time and so on). I understand, there
Hi,
I would prefer C-T-R (commit, then review), because it reduces
bureaucracy. Except for changes just before a major release (when there is
little time to undo or change things).
+1 to [REVIEW] emails. In my view, this should include new configuration
and new features. Basically all
eryIndexDefinition on propertyNames
>jcr:primaryType and declaringNodeTypes rep:ACL solved the issue.
>I presumed there already was an index for all the existing
>jcr:primaryTypes :), not that you have to specifically have them in the
>declaringNodeTypes
>
>Thanks!
>Roy
>> On 18 Oct 2016,
Hi,
> I really don't see the reason why this could be such a hard query
Who said it's a hard query? :-)
Is the problem performance, or is the problem that you get an exception?
If the problem is performance, then you need an index on the node type
rep:ACL.
If the problem is the exception:
Hi,
This is an old problem, but never solved. See OAK-1150.
Regards,
Thomas
On 17/10/16 16:08, "Chetan Mehrotra" wrote:
>Hi Team,
>
>While doing some benchmarks I realized that default setup is
>configured to index *all* nodetypes. In InitialContent the nodetype
Hi,
>
>Currently I am under the impression that we have no knowledge of what
>*might* break, with varying opinions on the matter. Maybe we should
>find out what *does* break.
I don't think it's possible to easily find out. Customer code might expect
the current behavior, and might silently
Hi,
I agree with Julian, I think making nt:resource unreferenceable
(hardcoding some "magic" in Oak) would lead to hard-to-find bugs and
problems.
> So whatever solution we pick, there is a risk that existing code fails.
Yes. But I think if we create a new nodetype, at least it would be
[X] +1 Release this package as Apache Jackrabbit Oak 1.2.20
Hi,
Sorry typo in "type", wanted to write "typo":
>I thought even in Jackrabbit 2.x, the "test" was assumed to be a type and
>automatically converted to "@test"...
Should read:
I thought even in Jackrabbit 2.x, the "test" was assumed to be a typo ...
Regards,
Thomas
[X] +1 Release this package as Apache Jackrabbit Oak 1.5.12
Hi,
I thought even in Jackrabbit 2.x, the "test" was assumed to be a type and
automatically converted to "@test"... Maybe I'm wrong.
What should work (for both Jackrabbit 2.x and Oak) is using
"test/@jcr:primaryType" instead of "test". So:
/jcr:root//*[test/@jcr:primaryType]
Hi,
Possibly the binary is downloaded from S3 in this case. We have seen
similar performance issues with datastore GC when using the S3 datastore.
It should be possible to verify this with full thread dumps. Plus we would
see where exactly the download occurs. Maybe it is checking the length or
Hi,
>I agree it conflicts conceptually with MVCC. However: is there an actual
>problem with the auto-refresh behaviour?
Yes. For example with queries. If changes are made while iterating over
the result of a query, the current behavior is problematic. Example code
(simplified):
RowIterator
Hi,
I'm sorry this feature is not available. You would need to set a property
"childCount" explicitly.
Could you explain the use case please?
Regards,
Thomas
On 20/07/16 15:48, "Milan Milanov" wrote:
>Hello there,
>
>I'm trying to order some nodes by how many child nodes
Hi,
> I still don't believe that Oak is the right place to implement these
>solutions.
What would be the right place then? The Oak user can store the path of the
file as a string, but he would lose some features (garbage collection for
example).
>Every use case you outlined requires Oak to
Hi,
With Oak, moving nodes is now much slower unfortunately, at least with
some storage engines (for example the document store / MongoDB). It is
basically the same as adding new nodes and deleting old nodes. I'm afraid
there is no easy solution for this problem.
Regards,
Thomas
On 20/05/16
Hi,
I would keep the "oak-segment-*" name, so that it's clear what it is based
on. So:
-1 oak-local-store
-1 oak-embedded-store
+1 oak-segment-*
Within the oak-segment-* options, I don't have a preference.
Regards,
Thomas
On 25/04/16 16:46, "Michael Dürig" wrote:
>
+1 Release this package as Apache Jackrabbit Oak 1.4.0
Hi,
Sure, there is a performance advantage (for both the persistent cache and
the Lucene index cache). But how much exactly depends on the use case.
You forgot the "persistent cache" by the way.
When restoring, you need to ensure that the local cache is not newer than
the remote (MongoDB), and
Hi,
I'm not in favour of this, as it breaks links, and I don't see a clear
improvement. I'm more in favour of incremental, small changes.
> an easier way to add/update documentation about oak specific features.
Sorry I don't understand, what is the problem with the current approach?
Is the menu
Hi,
I also always deploy the whole site with maven.
Regards,
Thomas
On 20/01/16 10:16, "Davide Giannella" <dav...@apache.org> wrote:
>On 20/01/2016 08:14, Thomas Mueller wrote:
>> ...
>>> This means that if we add a feature over there we have to update
Hi,
Could we get rid of unused stuff? Like Hadoop (7 MB!). Do we need Solr
(2.3 MB), Tika, Zookeeper, Jetty, H2 (the SQL part)? Do we need the
Jackrabbit remoting stuff? I guess we need Groovy (4 MB) and Lucene (4 MB).
Of those 50MB, just 8% is Oak, and the rest is dependencies.
Regards,