Re: On setting component boundaries in Oak

2012-03-09 Thread Thomas Mueller
Hi, therefore i would strongly suggest to separate jcr-transient space from an SPI layer from the very beginning. Yes, I think we all agree about the separation. What we don't seem to agree on is whether separate packages are a good enough separation for now, or whether it needs to be separate projects right

Re: On setting component boundaries in Oak

2012-03-09 Thread Thomas Mueller
Hi, Why do we need an SPI? My understanding is: so that non-Java clients such as PHP can access Oak/Jackrabbit. Plus, in case of Java, for remoting. I don't think non-Java clients will want to use JNI, so the remoting aspect is very important in my view (not necessarily urgent, but important).

Re: Interface for modifying trees

2012-03-09 Thread Thomas Mueller
Hi, I wonder whether it would be useful if the transient state can be accessed before creating a new NodeState, that is: interface NodeBuilder { String getProperty(String name); NodeState getChildNode(String name); } Regards, Thomas On 3/9/12 4:56 PM, Jukka Zitting
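
The proposal above could be sketched roughly as follows. This is only an illustration under assumptions: NodeState here is a simplified local stand-in (not Oak's actual interface), and MemoryNodeBuilder, setProperty, setChildNode and getNodeState are hypothetical names that do not appear in the thread. The point is that transient changes are readable through the builder before a new NodeState is created.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for an immutable node snapshot.
final class NodeState {
    final Map<String, String> properties;
    final Map<String, NodeState> children;

    NodeState(Map<String, String> properties, Map<String, NodeState> children) {
        // Defensive copies keep the snapshot immutable.
        this.properties = Collections.unmodifiableMap(new HashMap<String, String>(properties));
        this.children = Collections.unmodifiableMap(new HashMap<String, NodeState>(children));
    }
}

// The two read methods suggested in the message.
interface NodeBuilder {
    String getProperty(String name);
    NodeState getChildNode(String name);
}

// Illustrative in-memory builder (an assumption, not Oak code).
final class MemoryNodeBuilder implements NodeBuilder {
    private final Map<String, String> properties;
    private final Map<String, NodeState> children;

    MemoryNodeBuilder(NodeState base) {
        this.properties = new HashMap<String, String>(base.properties);
        this.children = new HashMap<String, NodeState>(base.children);
    }

    void setProperty(String name, String value) {
        properties.put(name, value);
    }

    void setChildNode(String name, NodeState child) {
        children.put(name, child);
    }

    // Reads see the transient (not yet built) state.
    @Override
    public String getProperty(String name) {
        return properties.get(name);
    }

    @Override
    public NodeState getChildNode(String name) {
        return children.get(name);
    }

    // Freezes the transient state into a new immutable snapshot.
    NodeState getNodeState() {
        return new NodeState(properties, children);
    }
}
```

With such a builder, a caller can set a property and read it back immediately, while the base NodeState stays unchanged until getNodeState() is called.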

Re: On setting component boundaries in Oak

2012-03-12 Thread Thomas Mueller
Hi, Alternatively, if we intend to have JCR as the only API through which Oak repositories are accessed, then I completely agree with Thomas that there's not much point in putting the JCR binding to a separate component. There is probably some misunderstanding... I didn't mean that the JCR

Re: Semantics of MicroKernel.getNodes()

2012-03-14 Thread Thomas Mueller
Hi, Another problem is the String-fits-all return type; it would make it impossible to implement streaming of the result to the client; which will make the behavior for large collections non-optimal (the caller needs to wait for the complete JSON string to be ready before it can start forwarding

Re: MicroKernel API vs. protocol

2012-03-20 Thread Thomas Mueller
Hi, Serializing objects to strings just for the purpose of parsing them again makes little sense to me. ... doesn't need to be the only way to do it. +1 about flat or not. Forcing getNodes to return things as a hierarchy, when it also could be a list of objects, decorated with a path, will make

Re: Re (OAK-36) Implement a query parser - what about indexing?

2012-03-22 Thread Thomas Mueller
Hi, OAK-36 covers the Query implementation effort, but I'm wondering if now would be a good time to mention indexing as well. We want to have dedicated indexes, I think that would be accomplished via observation. Any ideas about the availability of this feature? Sure. One such mechanism is

Re: Exceptions used in oak-core

2012-03-22 Thread Thomas Mueller
Hi, *If* we use unchecked exceptions, by all means let them all extend a common base exception :-) I think we should use unchecked exceptions except possibly in the JCR API itself. As for 'possibly' see http://java.net/jira/browse/JSR_333-14 - I still hope this will be accepted for JSR 333 :-)

Re: Multiple JCR Repositories in a single MikroKernel instance (?)

2012-03-22 Thread Thomas Mueller
Hi, in other words: the URL used to create the MK instance must not only identify the store and the workspace but also the repository? I thought we want to keep the data of all workspaces within the same physical storage, so use the same MK instance for all workspaces within a repository.

Re: Values in oak-core

2012-03-24 Thread Thomas Mueller
Hi, - Does it really make sense to allow for multiple Scalar implementations? Shouldn't we just have one final class for that? I would prefer one final class. Regards, Thomas

Commit messages

2012-03-26 Thread Thomas Mueller
Hi, I think we should try to use meaningful commit messages. In this case I would have preferred "remove the file system abstraction" or "replace the file system abstraction with java.io.*" over "cleanup". Regards, Thomas On 3/23/12 6:00 PM, ste...@apache.org ste...@apache.org wrote: Author:

Re: Re (OAK-36) Implement a query parser - what about indexing?

2012-03-26 Thread Thomas Mueller
Hi, I'm not sure if you already saw that, but one of the goals for Oak is to Simple/Fast queries (i.e. through specialized indexes) - see also http://wiki.apache.org/jackrabbit/Goals%20and%20non%20goals%20for%20Jackrabbit%203 - what I understand under specialized indexes is user defined indexes.

Re: Re (OAK-36) Implement a query parser - what about indexing?

2012-03-26 Thread Thomas Mueller
Hi, Currently, for Hippo, I am doing something similar for the query api, that can seamlessly delegate to Solr or jackrabbit, both returning a jcr node iterator (although the solr index through solrj can also return plain pojo's). I really like the first option (pre-commit example) and third

Re: full text search improvements

2012-03-26 Thread Thomas Mueller
Hi, I haven't looked at / tested JCR joins : I just can't imagine that it scales enough, but perhaps this is more related to my 'Lucene 1.4 experience' :) Lucene 1.4? For Oak, joins should perform well (I guess with 'scale' you mean 'perform'). Currently only nested loop joins are implemented

Re: full text search improvements

2012-03-26 Thread Thomas Mueller
Hi, I'd like to prototype Solr integration, however, I cannot commit to this as I am dependent on the time I will be given by my manager: April is completely booked already. I hope I can get a time slot in May to work on a prototype That would be really nice! May is better than April anyway,

Re: svn commit: r1306222 - in /jackrabbit/oak/trunk/oak-jcr/src: main/java/org/apache/jackrabbit/oak/jcr/ main/java/org/apache/jackrabbit/oak/jcr/query/ test/java/org/apache/jackrabbit/oak/jcr/ test/j

2012-03-28 Thread Thomas Mueller
Hi, Could you come up with an implementation which does not directly depend on the Microkernel but rather uses a (to be defined) API on oak-core? Sure, let's define such an API. I'm in the process of removing all dependencies from oak-jcr to oak-mk (OAK-20) so this will ultimately break. Sure.

Re: oak api : initial draft

2012-03-29 Thread Thomas Mueller
Hi, Commit failure is an expected error condition in the sense that I can write client code that I can expect to always produce a failed commit. Just for completeness, could you give an example? For this reason I would prefer to extend from RuntimeException only. Me too. Regards, Thomas

Re: oak-api and move operations

2012-03-30 Thread Thomas Mueller
Hi, reconstruct move and copy operations from looking at the raw trees Aha there it is! My original phrasing was: (emphasis added) there is no way to *reliably* recover move and copy operations. Git gives up on reliability. Please note this only applies if you try to recover the operations from

Re: Only one workspace (Was: Handling namespace mapppings)

2012-03-30 Thread Thomas Mueller
Hi, Let's drop trying to support more than one workspace for now! Would that mean, don't support multiple workspaces in the whole oak-core *API* for now? Or just don't support multiple workspaces in the *implementation* for now (and don't test with multiple workspaces)? Regards, Thomas

Re: oak-api and move operations

2012-04-03 Thread Thomas Mueller
Hi, I think there is some misunderstanding here. There is not more merging to be done by the Microkernel than it already does. Currently transient changes are kept in memory and passed to the Microkernel in a single commit call passing along a possibly prohibitively large JSON diff. With the

Re: oak-api and move operations

2012-04-03 Thread Thomas Mueller
Hi, This is exactly the same operation as two MicroKernel cluster nodes will need to perform when syncing concurrent commits. Yes, only that synching concurrent commits can be implemented on a higher layer (above indexing). So it's not quite the same. AFAICT it should be possible to do this

Re: oak-api and move operations

2012-04-03 Thread Thomas Mueller
Hi, Tom, I think you are still misreading things. This has nothing to do with transactions, 2 phase commit, isolation, etc. It is just a way how transient changes can be stored (written ahead) instead of kept in memory. As far as I know, we want to support MicroKernel implementations that

Re: test case failures

2012-04-10 Thread Thomas Mueller
Hi, testAbsoluteRelative(org.apache.jackrabbit.mk.fs.FileSystemTest) I guess you use Windows? I think this should be fixed in revision #1311705 - could you try this please (I don't have Windows currently, but from the Javadocs I think I know what was wrong)? If it's still failing, could you

Re: MicroKernel.getInstance

2012-04-11 Thread Thomas Mueller
Hi, So how would this work? Do I do something like this: MicroKernelManagerFactory factory = getFactoryFromSomewhere(); MicroKernelManager manager = factory.getManager(something); try { MicroKernel mk = manager.getMK(); // use the mk } finally {

Re: Exception handling in oak-core

2012-04-27 Thread Thomas Mueller
Hi, my preference was to just throw the jcr-exceptions wherever this was appropriate and unambiguous. for example namespaceexception, versionexception, constraintviolation... I can't comment on NamespaceException, VersionException, and so on. What I find problematic is, if almost all methods

Re: svn commit: r1331368 - in /jackrabbit/oak/trunk: oak-core/src/main/java/org/apache/jackrabbit/oak/core/ oak-core/src/main/java/org/apache/jackrabbit/oak/osgi/ oak-it/osgi/ oak-it/osgi/src/test/jav

2012-04-30 Thread Thomas Mueller
Hi, My idea here is that any pluggable components passed to a ContentRepositoryImpl or other core class should be already initialized or be able to lazily initialize itself when needed. I understand the concern about the lifecycle, but I would also like to avoid reading from the repository in

Re: [VOTE] Release Apache Jackrabbit Oak 0.2.1

2012-05-03 Thread Thomas Mueller
Hi, +1 I also used the check-release.sh script, there were a few minor issues: - gpg: BAD (but md5 and sha were GOOD) - the script created two directories: -v and -p - svn: URL 'http://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-0.2.1' doesn't exist (there are 0.2.1/ and

Re: Exception handling in oak-core

2012-05-03 Thread Thomas Mueller
Hi, let's assume CoreValue.getString() could throw a RepositoryException (when there is an error converting the value to a string). If we do that, then we would have to add exception handling to a That example seems a bit academic right now, as CoreValue.getString() indeed never throws (it

Re: Exception handling in oak-core

2012-05-03 Thread Thomas Mueller
Hi, I wouldn't want to catch RuntimeException. I'd prefer if oak-core would only throw OakException (extends RuntimeException), as suggested by Michael Dürig. +1 I also think it would be good if these Oak exceptions could carry sufficient information to generate a JCR exception. +1 The

Re: Exception handling in oak-core

2012-05-03 Thread Thomas Mueller
Hi, If the OakException wraps a RepositoryException, we extract that one and rethrow it, it will come with an incorrect stack trace, no? You mean, if we use throw ((OakException) e).getOriginalException() or something similar when unwrapping? So it seems we need to re-wrap. Not sure what you
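
One way to read the re-wrapping idea discussed above, sketched with hypothetical classes: OakException extends RuntimeException as suggested in the thread, RepositoryFailure is a local stand-in for javax.jcr.RepositoryException (so the example needs no JCR jar), and ExceptionBoundary.rewrap is an illustrative name. Instead of rethrowing the original checked exception, a new one is created at the API boundary, so its stack trace points at the place where it is actually thrown and the full chain is kept as the cause.

```java
// Local stand-in for javax.jcr.RepositoryException (an assumption for this sketch).
class RepositoryFailure extends Exception {
    RepositoryFailure(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical unchecked exception used inside the core layer.
class OakException extends RuntimeException {
    OakException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class ExceptionBoundary {
    private ExceptionBoundary() {
    }

    // Re-wrap instead of rethrowing the wrapped exception: the returned
    // exception gets a fresh stack trace and keeps the OakException
    // (and whatever it wraps) reachable through getCause().
    static RepositoryFailure rewrap(OakException e) {
        Throwable original = e.getCause() != null ? e.getCause() : e;
        return new RepositoryFailure(original.getMessage(), e);
    }
}
```

A caller at the JCR layer would then write something like `catch (OakException e) { throw ExceptionBoundary.rewrap(e); }`.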

Re: [jira] [Created] (OAK-87) Declarative services and OSGi configuration

2012-05-07 Thread Thomas Mueller
Hi Jukka, I guess you wanted to write non-OSGi instead of plain Java? Regards, Thomas On 5/6/12 9:51 PM, Jukka Zitting (JIRA) j...@apache.org wrote: Jukka Zitting created OAK-87: Summary: Declarative services and OSGi configuration

Re: Plain Java as deployment/configuration platform (Was: [jira] [Created] (OAK-87) Declarative services and OSGi configuration)

2012-05-07 Thread Thomas Mueller
...@gmail.com wrote: Hi, On Mon, May 7, 2012 at 9:02 AM, Thomas Mueller muel...@adobe.com wrote: I guess you wanted to write non-OSGi instead of plain Java? Essentially yes, but there's a subtle point here. Let me explain. A specific non-OSGi setup could be something like repository.xml -based

Re: value conversions, and relative paths in Oak

2012-05-10 Thread Thomas Mueller
Hi, My thinking was to make this opaque to oak-core. That is, set the value of the path property to the jcr path and let oak-jcr handle it. Unless there is a case (which might well be) where we need to dereference such path properties from within oak-core this should work. I don't think this

Re: Query

2012-05-15 Thread Thomas Mueller
Hi, 1) Would it make sense to start tracking more closely which Query tests are passing? Right now all of them are marked as failing in the pom... (and yes, I can do that, but I wanted to wait until it makes sense) I ran the TCK test to find out what areas are missing or not working correctly.

Re: Query

2012-05-15 Thread Thomas Mueller
Hi, The reason why I'm asking is that if we leave things as they are right now, we won't notice regressions, because all of the tests are marked as known failures. Thus, once the majority of functionality is there, we should make the exclusions more fine-grained. I see. Yes, if you have time,

Re: findbugs

2012-05-23 Thread Thomas Mueller
Hi, +1 to adding it to the parent POM. Did you already figure out a way for us to integrate FindBugs checks to the test phase of the build so we could fail the build if explicitly enabled FindBugs checks fail? Would it be possible (until we have more experience) to only *log* warnings where

Path and name normalization in the query engine

2012-05-24 Thread Thomas Mueller
Hi, The query engine in oak-core needs to normalize paths and names (convert ./name to name), for example for the TCK test org.apache.jackrabbit.test.api.query.qom.NodeNameTest.testURILiteral. I don't see a way to do the normalization within oak-jcr. The component that seems to be able to do

Re: Identifier (UUID) handling

2012-05-29 Thread Thomas Mueller
Hi, I'm fine with getNodeByUUID() relying on the query system. The query system on the other hand should be able to work even when no indexes are available, falling back to tree traversal when all else fails. Yes, that's the current plan. This is not an urgent feature, but if no index is

Re: Observation design (Was: svn commit: r1351414 - in /jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak: api/ChangeSet.java api/ContentSession.java core/ContentSessionImpl.java)

2012-06-19 Thread Thomas Mueller
Hi, - We can implement the polling approach (using a 0 timeout) but also have the option to do blocking. Since this can be directly delegated to the Microkernel (waitForCommit) the added complexity for this is minimal. The complexity is still there, it's just one level below. Personally I'd

Re: Observation design (Was: svn commit: r1351414 - in /jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak: api/ChangeSet.java api/ContentSession.java core/ContentSessionImpl.java)

2012-06-19 Thread Thomas Mueller
it) and whether this needs to be in the MicroKernel API is a different question. Regards, Thomas On 6/19/12 9:27 AM, Thomas Mueller muel...@adobe.com wrote: Hi, - We can implement the polling approach (using a 0 timeout) but also have the option to do blocking. Since this can be directly

Re: Native HTTP bindings for Oak

2012-06-27 Thread Thomas Mueller
Hi, I understand the point Felix is making. As of now, I would propose to drop separate URI spaces. I would also propose to drop the related MicroKernel branch/merge feature, or at least not rely on the feature to be available. In my view, the MicroKernel branch/merge feature which was

Re: Updated Oak roadmap

2012-06-28 Thread Thomas Mueller
Hi, I would prefer if the Scalability and performance review would start earlier. Regards, Thomas On 6/28/12 1:43 PM, Jukka Zitting jukka.zitt...@gmail.com wrote: Hi, We missed pushing the 0.3 release out a month ago, but let's fix that by cutting the release instead of 0.4 now at the turn

Re: Updated Oak roadmap

2012-06-28 Thread Thomas Mueller
Hi, Yes, we can do such a benchmark (the oak-bench component already has the basics in place), though personally I don't see the single-node case as a particularly interesting one to benchmark. Unless we want to also target embedded systems, any truly performance-critical deployments will

Re: Notes from the Oakathon

2012-07-05 Thread Thomas Mueller
Hi, I agree with Felix. I don't currently see a need for Sling to bypass the JCR API. If that would be needed, there is something wrong with the JCR API or the JCR implementation. About indexing: Another potential Sling extension would be a custom Oak index provider for optimizing the kinds

Re: Notes from the Oakathon

2012-07-05 Thread Thomas Mueller
Hi, I'm not sure what you mean here. I don't see Sling having its own index *mechanism*. Probably Sling will need specific indexes, but that's just configuration, and not implementation (code)... Being able to integrate totally different indexing mechanisms can be useful if Oak can provide

Re: Internal content in Oak

2012-07-19 Thread Thomas Mueller
Hi, I believe we have quite different views about what the architecture should look like and what the goals of the separation between the layers are. I understand the view people have changes as we implement things, but could we discuss the architecture and problems we found in the next meeting?

Re: Internal content in Oak

2012-07-19 Thread Thomas Mueller
Hi, i raised the same issue: OAK-162 I read it, good to know it was already discussed. It basically matches my concern. Regards, Thomas

Re: Internal content in Oak

2012-07-19 Thread Thomas Mueller
Hi, ... oak-http are the only complex direct Java clients of the Oak API I thought Oak HTTP basically _is_ (very closely matches) the Oak API. That's why I qualified it with to a lesser degree, which you didn't include in your quote above. I guess this is an example where a direct

Re: Design idea for a production-scale in-memory microkernel

2012-08-13 Thread Thomas Mueller
Hi, I think the problem you want to solve is having a high performance storage subsystem. There are multiple ways to solve this. One is to write a new in-memory microkernel. Another is to use a very big cache, and write in a background thread. This would also work if the data doesn't fit in

Re: Design idea for a production-scale in-memory microkernel

2012-08-13 Thread Thomas Mueller
Hi, I know this isn't related directly to in-memory microkernel, but it seems to me the reason to propose an in-memory microkernel is to improve performance. Unless I misunderstood the mail? As for Jackrabbit 2.x read and write performance, I found that JCR-2857 helps, especially for larger

Re: svn commit: r1368425 - in /jackrabbit/oak/trunk: oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/ oak-jcr/src/main/java/org/apache/jackrabbit/oak/jcr/observation/ oak-jcr/src/main/java/org/apa

2012-08-16 Thread Thomas Mueller
Hi, In general the repository shutdown should IMO *never* block for any background tasks. A kill -9 on a JVM process should IMHO be considered a valid way to shutdown a repository running inside the JVM, so there's nothing that the repository should expect a background task to finish in time for

Re: equalsIgnoreCase

2012-08-20 Thread Thomas Mueller
Hi, Actually, equalsIgnoreCase isn't a problem. The problem is just String.toLowerCase() and String.toUpperCase(). See also: http://mattryall.net/blog/2009/02/the-infamous-turkish-locale-bug So this returns true, as required: Locale.setDefault(new Locale("tr"));
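
A small sketch of the difference described above. The class and method names (TurkishLocaleDemo, lower, upper, sameIgnoringCase) are illustrative only; Locale.forLanguageTag("tr") is used here in place of the Locale constructor from the message.

```java
import java.util.Locale;

final class TurkishLocaleDemo {
    // The Turkish locale, in which the case mapping of 'i' and 'I' differs.
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    private TurkishLocaleDemo() {
    }

    // Locale-sensitive: in Turkish, 'I' lower-cases to dotless i (U+0131).
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Locale-sensitive: in Turkish, 'i' upper-cases to dotted I (U+0130).
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    // Compares character by character and does not consult the default locale.
    static boolean sameIgnoringCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }
}
```

The practical consequence: code that normalizes identifiers with the no-argument toLowerCase() or toUpperCase() depends on the JVM's default locale, so passing an explicit locale such as Locale.ROOT (or using equalsIgnoreCase for comparisons) avoids the problem.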

Re: Issue while converting from Xpath to Sql2

2012-08-20 Thread Thomas Mueller
Hi, Thanks a lot! I'm currently working on trying to support such queries in the XPathToSQL2Converter. I'm not sure yet how much work this will be, let's see. Is the query really using '**' and '%%'? If yes are there any nodes that contain '**', or is this just test data (just wondering)?

Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())

2012-09-12 Thread Thomas Mueller
Hi, 2. The client does need to know the size, so it calls getSize() and I currently can't come up with a convincing use case - what is your use case? Display the actual number of search results to the user? Do you want to risk that the method getSize() takes 1.5 hours just to display the

Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())

2012-09-13 Thread Thomas Mueller
Hi, There's no need for the Oak API to reflect JCR in all its details. Sure. First we need to define how the JCR API implementation is supposed to behave. Based on that we can then still decide what the Oak API should look like. The Oak API is (more or less) an implementation detail. Of course

Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())

2012-09-13 Thread Thomas Mueller
Hi, The main question is still what the JCR API method getSize() should return. A new method getSize(int max) is nice, and of course we can do that. But I guess people will not use it in the near future because it's not part of the JCR API. Regards, Thomas On 9/12/12 8:06 PM, Michael Marth

Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())

2012-09-13 Thread Thomas Mueller
Hi, return the correct size if the result set has fewer than something like 1000 entries. That should cover most practical cases Yes. Let's discuss the value now! 1000 sounds OK in general, however there is a potential performance problem. For Jackrabbit 2.x, if there are more than a few million
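
The "exact size up to a limit" idea can be sketched as below. This is an assumption-laden illustration, not the Oak implementation: BoundedSize and sizeUpTo are hypothetical names, and the sketch simply consumes the iterator, whereas a real query result would have to buffer the rows it has already read. The value -1 follows the JCR convention for an unknown size.

```java
import java.util.Iterator;

final class BoundedSize {
    private BoundedSize() {
    }

    // Returns the exact number of entries if there are at most 'max' of them,
    // and -1 (size unknown) as soon as more than 'max' entries are seen.
    // At most max + 1 entries are read, so the cost is bounded.
    static long sizeUpTo(Iterator<?> entries, long max) {
        long count = 0;
        while (entries.hasNext()) {
            entries.next();
            count++;
            if (count > max) {
                return -1;
            }
        }
        return count;
    }
}
```

The limit (1000 in the message above) then directly bounds how much work a size request can trigger.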

ContentSession.getCurrentRoot() is slow

2012-09-13 Thread Thomas Mueller
Hi, To read a node, the query engine currently uses: session.getCurrentRoot().getTree(path); The query engine calls this whenever it has to evaluate a property. It turns out internally the getCurrentRoot() method always calls MicroKernel.getHeadRevision(). I wonder if this is required, and

Re: ContentSession.getCurrentRoot() is slow

2012-09-13 Thread Thomas Mueller
Hi, It would be better if the query engine used the same tree snapshot as the rest of the session. There shouldn't be any need to call getCurrentRoot(). This is actually what I expected that ContentSession.getCurrentRoot() would provide me: I expected getCurrentRoot() would always return the

Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())

2012-09-13 Thread Thomas Mueller
Hi, The idea with the timeout sounds good, but what should we recommend an application to do if getSize() takes too long and returns -1? Imagine while paging search results, the first page query is fast enough (getSize() returns something), but the second is too long and now returns -1: should

Re: ContentSession.getCurrentRoot() is slow

2012-09-14 Thread Thomas Mueller
and ContentSession. But for now I guess it's enough to add a comment in the Javadoc. Regards, Thomas On 9/14/12 7:32 AM, Thomas Mueller muel...@adobe.com wrote: Hi, It would be better if the query engine used the same tree snapshot as the rest of the session. There shouldn't be any need to call

Re: ContentSession.getCurrentRoot() is slow

2012-09-14 Thread Thomas Mueller
Hi, About workspaces - I know we said we don't support workspaces currently, but there still is ContentSession.getWorkspaceName(). Could we remove this method for now? Currently the workspace root and the repository root are the same thing, while we only support a single workspace. Once we get

Re: On custom index configuration

2012-09-18 Thread Thomas Mueller
Hi, First of all I think there shouldn't be just one single place in the repository where all index configuration should go. Hm, how would the query engine detect what indexes are available? I think keeping the index configuration at one place is the most simple solution, and I don't currently

Re: On custom index configuration

2012-09-19 Thread Thomas Mueller
Hi, At query time, when it knows the main path constraint used in the query, it can walk down that path to detect which indexes are available and useful for resolving the query. I guess we could make it work. It would make the query engine a bit more complex, and some of the queries would get a

Re: On custom index configuration

2012-09-19 Thread Thomas Mueller
wrote: Hi, On Tue, Sep 18, 2012 at 5:30 PM, Thomas Mueller muel...@adobe.com wrote: First of all I think there shouldn't be just one single place in the repository where all index configuration should go. Hm, how would the query engine detect what indexes are available? At query time, when

Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())

2012-09-19 Thread Thomas Mueller
Hi, How do I know it's for sure more than 20 Because the PrefetchIterator will try to prefetch 20 nodes. or whatever my page size happens to be? If you have a higher page size then you need getSize(max). Please note if you use offset and limit, getSize() will return the size of the result

Re: On custom index configuration

2012-09-19 Thread Thomas Mueller
Hi, +1 From a content modeling perspective, forcing all indexes in a central location is very restricting and not modular. Where the index configuration nodes are stored is normally internal to the implementation, with the exceptions of export and import. The index configuration is similar to

Re: [Broken] apache/jackrabbit-oak#114 (trunk - a107923)

2012-09-20 Thread Thomas Mueller
Hi, org.apache.jackrabbit.test.api.query.qom.UpperLowerCaseTest#testLength org.apache.jackrabbit.test.api.query.qom.UpperLowerCaseTest#testNodeName Strange, those tests didn't (and still don't) fail for me. I will investigate. Regards, Thomas On 9/20/12 12:21 PM, Michael Dürig

Re: The infamous getSize() == -1 (Was: [jira] [Created] (OAK-300) Query: QueryResult.getRows().getSize())

2012-09-20 Thread Thomas Mueller
Hi, 1) Next to getSize() iterator we also added getTotalSize(). I don't like the name because it is actually more something like: getTotalSizeWithoutCheckingACLs(). Hm, wouldn't that be a security problem? Couldn't it be better (from a security perspective) if you can only get this number if

Re: On custom index configuration

2012-09-20 Thread Thomas Mueller
Hi, It is problematic to fix the configuration if you don't know where to look, if the configuration can be basically anywhere in the repository. Let's say there are special Lucene indexes configured at: /long/path/deep/in/repository And if you want to switch to another query engine (let's

Re: On custom index configuration

2012-09-20 Thread Thomas Mueller
Hi, Yes, but the problem is if the repository-administrator has no way of knowing what indexes are configured. That's easy to find out: SELECT * FROM [oak:indexed] ;-) You are right. To do this efficiently it would require an index on [oak:indexed], which would effectively be the same

Re: [MongoMK] Reading blobs incrementally

2012-10-17 Thread Thomas Mueller
Hi, As a workaround, you could keep the last few streams open in the Mongo MK for some time (a cache) together with the current position. That way seek is not required in most cases, as usually binaries are read as a stream. However, keeping resources open is problematic (we do that in the

Re: twin issues on Jira OAK-13 and OAK-57

2012-10-24 Thread Thomas Mueller
Hi, Isn't the property index (o.a.j.oak.plugins.index.old) still used for testing? Regards, Thomas On 10/24/12 3:05 PM, Alex Parvulescu alex.parvule...@gmail.com wrote: Hi gang, I'm wondering about 2 issues that are talking about the same refactoring task [0] and [1]. There is still a bunch

Re: twin issues on Jira OAK-13 and OAK-57

2012-10-25 Thread Thomas Mueller
Hi, I'm not sure what you mean by used for testing. I mean used for manual testing. I'm planning to use it for some manual testing, and I thought others use it as well or plan to use it. I also want to run some benchmarks to find out if a Lucene index or the property index is faster. It has

Re: [MongoMK] BlobStore garbage collection

2012-11-05 Thread Thomas Mueller
Hi, Do we have tests somewhere where we can compare different BlobStore implementations? There are some tests in oak-mk-it; MicroKernelIT (testSmallBlob, testMediumBlob, testLargeBlob). I guess they could be quite easily converted to test performance. Regards, Thomas

Re: [MongoMK] BlobStore garbage collection

2012-11-05 Thread Thomas Mueller
:38 PM, Mete Atamel mata...@adobe.com wrote: Thanks. Yes, I also think it's worthwhile to try implementing MongoDB BlobStore based on AbstractBlobStore. Do we have tests somewhere where we can compare different BlobStore implementations? -Mete On 11/2/12 3:50 PM, Thomas Mueller muel...@adobe.com

Re: [MongoMK] BlobStore garbage collection

2012-11-06 Thread Thomas Mueller
Hi, 1- What's considered an old node or commit? Technically, anything other than the head revision is old but can we remove them right away or do we need to retain a number of revisions? If the latter, then how far back do we need to retain? we discussed this a while back, no good solution back

Re: [MongoMK] BlobStore garbage collection

2012-11-06 Thread Thomas Mueller
Hi, If we go down this path for node GC With this path of node GC, do you mean the ability to configure the lifetime of a revision? , doesn't MicroKernel interface have to change to account for this? Where would you change this default 10 minutes value as far as MicroKernel is concerned? I

Re: [MongoMK] BlobStore garbage collection

2012-11-07 Thread Thomas Mueller
operationally limit the repo sizes Oak can support. -- Michael Marth | Engineering Manager +41 61 226 55 22 | mma...@adobe.com Barfüsserplatz 6, CH-4001 Basel, Switzerland On Nov 6, 2012, at 9:24 AM, Thomas Mueller wrote: Hi, 1- What's considered an old node

Re: [MongoMK] BlobStore garbage collection

2012-11-07 Thread Thomas Mueller
Hi, the format of references to binaries is documented in the MicroKernel java doc, see Retention Policy for Binaries [0]. Thanks! I didn't know it's already documented. Regards, Thomas

Re: [MongoMK] BlobStore garbage collection

2012-11-07 Thread Thomas Mueller
Hi, Generational garbage collection should work pretty well for this case. If the blob store can keep track of all blobs added since revision X, it needs to only go through the diff from that revision to the latest ones to determine which of those blobs can be removed early. Since most extra

Re: TreeImpl.isRemove performance problem and possible workaround

2012-11-07 Thread Thomas Mueller
Hi, Yes, I see. My patch is really just a quick hack (my idea was to avoid creating many HashMaps), but I'm not sure if it will always work correctly. Regards, Thomas On 11/7/12 4:02 PM, Michael Dürig mdue...@apache.org wrote: On 7.11.12 14:54, Thomas Mueller wrote: Hi, I think

Re: svn commit: r1409134 - in /jackrabbit/oak/trunk: oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/old/ oak-core/src/main/java/org/apache/jackrabbit/oak/query/ast/ oak-core/src/main/j

2012-11-14 Thread Thomas Mueller
Hi, Yes I came to the same conclusion. However I would prefer if Cursor stays an interface, because it's an API. I could add an abstract class somewhere else (in the Cursors class): an AbstractCursor class where remove() throws an exception. Would that be OK as well? Regards, Thomas On
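
The arrangement proposed above might look roughly like this. It is a sketch under assumptions: the Cursor shown here is a simplified stand-in that just iterates over path strings (Oak's actual Cursor differs), and ListCursor is a hypothetical implementation added for illustration. Cursor stays an interface, while AbstractCursor supplies the remove() that throws.

```java
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for the query cursor API: stays an interface.
interface Cursor extends Iterator<String> {
}

// Shared base class: cursors are read-only, so remove() is rejected here once.
abstract class AbstractCursor implements Cursor {
    @Override
    public void remove() {
        throw new UnsupportedOperationException("remove");
    }
}

// Hypothetical cursor over a fixed list of paths.
final class ListCursor extends AbstractCursor {
    private final Iterator<String> paths;

    ListCursor(List<String> paths) {
        this.paths = paths.iterator();
    }

    @Override
    public boolean hasNext() {
        return paths.hasNext();
    }

    @Override
    public String next() {
        return paths.next();
    }
}
```

On Java 8 and later, Iterator.remove() already has a default implementation that throws UnsupportedOperationException, so the abstract class mainly matters for older Java versions such as those in use when this thread was written.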

Re: Support for long multivalued properties

2012-11-15 Thread Thomas Mueller
Hi, personally i am not aware of real life use cases requiring 'large' mv properties. since the ultimate goal of oak is to provide a JCR implementation and the JCR API doesn't provide any methods to manipulate/access single members of a mv property i don't think we need to support it under the

Re: Support for long multivalued properties

2012-11-15 Thread Thomas Mueller
Hi, before adding this i would rather want to see support for hash maps. Sounds interesting.. could you give more details please? Regards, Thomas

Re: [jira] [Moved] (OAK-465) PropertyIndex uses TraversingCursor but should not

2012-11-22 Thread Thomas Mueller
Components: query Reporter: Thomas Mueller Assignee: Thomas Mueller The org.apache.jackrabbit.oak.plugins.index.property.PropertyIndex uses the traversing cursor (that traverses over the whole repository) when there is no index. This is not how the index mechanism is supposed

Re: Handling copies and moves with tree diffs

2012-11-27 Thread Thomas Mueller
Hi, 1) For each tree node we keep track of its original location. However, I think recording the path is not sufficient. We need to record the parent node. I guess you don't mean remembering the original path of the parent? keeping track of the original location of the parent (let's say /test)

Re: OAK-343 considered harmful?

2012-11-27 Thread Thomas Mueller
Hi, Currently it is planned to use the commit hook feature to update the index. Is the commit hook called before saving? If it can't be easily supported, maybe a simple per-session map (uuid -> path) could be used for temporary nodes, within oak-jcr? Regards, Thomas On 11/27/12 3:08 PM,
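The per-session map suggested above might look like the following sketch. The class name TransientUuidMap and its methods are hypothetical and not part of oak-jcr; the sketch assumes the session records the uuid of each unsaved node and falls back to the index once the changes are saved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one instance per session, covering unsaved nodes only.
final class TransientUuidMap {

    private final Map<String, String> uuidToPath = new HashMap<String, String>();

    /** Called when a referenceable node is added in the transient space. */
    void nodeAdded(String uuid, String path) {
        uuidToPath.put(uuid, path);
    }

    /** Called when a transient node is removed again before saving. */
    void nodeRemoved(String uuid) {
        uuidToPath.remove(uuid);
    }

    /** Returns the path of an unsaved node, or null if the index must be consulted. */
    String getPath(String uuid) {
        return uuidToPath.get(uuid);
    }

    /** Called after save or refresh: the index is authoritative from then on. */
    void clear() {
        uuidToPath.clear();
    }
}
```

A lookup by uuid would first check this map and only then query the index maintained by the commit hook.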

Re: Conflict handling in Oak

2012-12-18 Thread Thomas Mueller
Hi, In addition it would be nice to annotate conflicts in some way. This is quite easy to do and would allow upper layers to resolve conflicts based on specific business logic. Currently we do something along these lines with the AnnotatingConflictHandler [1] in oak-core. Sure, that would make

Re: Conflict handling in Oak

2012-12-18 Thread Thomas Mueller
Hi, 1) Make the definition of conflicts sufficiently strong to exclude such cases. That's Tom's proposal from this Thread. Ah, OK, I thought you meant it could still be a problem even with my proposal. I guess failing on (node-level-) conflicts would be the most simple solution, as a start. It

Re: Conflict handling in Oak

2012-12-18 Thread Thomas Mueller
Hi, 2) Allow inconsistent journals. I guess we don't want that. But the question is how close the journal has to match the original commit, specially move and copy operations. If they need to be preserved (do they?), then it's complicated. There is no use for a journal which is not accurate.

Re: Conflict handling in Oak

2012-12-18 Thread Thomas Mueller
Hi, So, do move and copy operations need to be preserved, or can they be converted to add node / remove node? Now we are getting somewhere: This is exactly the original topic of OAK-464. If the Microkernel converts moves to add/remove, implementing rebase on top of that results in moves of big

Re: Conflict handling in Oak

2012-12-18 Thread Thomas Mueller
Hi, But the question is how close the journal has to match the original commit, specially move and copy operations. Yes. There are various degrees of how close the journal is to the commit. One option is: the commit is preserved 1:1. The other extreme is: moves are fully converted to add+remove.

Re: Conversion Mechanism for Non-Bundle PM Repositories?

2013-01-08 Thread Thomas Mueller
Hi, For migration options see http://wiki.apache.org/jackrabbit/BackupAndMigration There is no migration tool specifically for Oak yet. It would be great of course if you could write such a tool and contribute it! Regards, Thomas On 1/8/13 9:44 PM, Melanie Drake melanie.dr...@gmail.com wrote:

Re: MicroKernelIT#conflictingMove

2013-01-15 Thread Thomas Mueller
Hi, So I guess I need to understand why it is important to commit against base revision in CommitCommandNew. First, I had a really hard time understanding why we need a base revision in the commit method. We found a case where the base revision does make a difference, this case is documented in

Re: MicroKernelIT#conflictingMove

2013-01-15 Thread Thomas Mueller
Hi, Instead of the MicroKernel trying to merge changes, I would prefer if the MicroKernel would fail if a node was changed, moved or deleted after the base revision of a commit. That way, the MicroKernel API would still need a base revision in the commit call (the base revision would arguably
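The fail-instead-of-merge behaviour described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: FailOnConflictStore is not the MicroKernel API, revisions are modelled as plain long counters, and a move or delete is treated like any other change to the affected path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a commit fails if the node changed after the base revision.
final class FailOnConflictStore {

    // path -> revision in which the node was last changed, moved or deleted
    private final Map<String, Long> lastChanged = new HashMap<String, Long>();
    private long head = 0;

    /**
     * Commits a change to the given path against a base revision.
     *
     * @return the new head revision
     * @throws IllegalStateException if the node changed after baseRevision
     */
    synchronized long commit(String path, long baseRevision) {
        Long changed = lastChanged.get(path);
        if (changed != null && changed > baseRevision) {
            throw new IllegalStateException("Conflict: " + path
                    + " changed in revision " + changed
                    + ", after base revision " + baseRevision);
        }
        head++;
        lastChanged.put(path, head);
        return head;
    }

    synchronized long getHeadRevision() {
        return head;
    }
}
```

In this sketch concurrent commits to different nodes succeed against the same base revision, while a second commit to the same node has to be retried by the caller against a newer base revision.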

Re: Accessing ACLs (Was: [jira] [Created] (OAK-581) IndexDefinition for Access Control Content)

2013-01-24 Thread Thomas Mueller
Hi, Yes, I would also try to avoid using a query to read the ACLs from the content tree (for multiple reasons: to avoid maintaining indexes, to speed up access, to simplify caching). Regards, Thomas On 1/24/13 2:22 PM, Jukka Zitting jukka.zitt...@gmail.com wrote: Hi, On Thu, Jan 24, 2013 at

Re: Accessing ACLs (Was: [jira] [Created] (OAK-581) IndexDefinition for Access Control Content)

2013-01-24 Thread Thomas Mueller
Hi, Ah OK. Then the only solution to avoid those indexes is to not support the feature I guess. Regards, Thomas On 1/24/13 2:27 PM, Angela Schreiber anch...@adobe.com wrote: hi jukka it's not for the access control evaluation nor for access by path that the query is used... is for the

Re: MongoMK^2 design proposal

2013-01-29 Thread Thomas Mueller
Hi, It's not clear to me how to support scalable concurrent writes. This is also a problem with the current MongoMK design, but in your design I actually see more problems in this area (concurrent writes to nodes in the same segment for example). But maybe it's just that I don't understand this

Re: MongoMK^2 design proposal

2013-01-29 Thread Thomas Mueller
Hi, To better understand the design it would help me if you could list the MongoDb collections, and the key / properties / values. I guess the segment ids are MongoDB object ids? Or is it (part of) the path? Segments are immutable, so a commit would create a new segment So there are no MongoDB
