Re: HEADS UP: Jackrabbit restructuring ahead
Hi,

On 12/4/06, wendy Lee [EMAIL PROTECTED] wrote:
> I checked out jackrabbit-api, jackrabbit-jcr-commons, jackrabbit-core, and jackrabbit-jcr-tests separately and ran mvn install in each directory. The last one, mvn install in the jackrabbit-core directory, failed:
> [...]
> Reason: Unable to download the artifact from any repository
> org.apache.jackrabbit:jackrabbit:pom:1.2-SNAPSHOT

Ah, that's because the build doesn't find the Jackrabbit parent POM. I would suggest that you check out the entire jackrabbit/trunk directory; the component projects will then automatically find the parent POM in ../pom.xml.

Alternatively, you can copy the latest POM snapshot from http://people.apache.org/repo/m2-snapshot-repository/org/apache/jackrabbit/jackrabbit/ into your local Maven 2 repository, or configure Maven 2 to use http://people.apache.org/repo/m2-snapshot-repository/ directly.

Thanks for pointing this out, I'll see if I can make the instructions clearer on this.

BR,

Jukka Zitting
Re: Jackrabbit and Maven
Jukka Zitting wrote:
> Hi,
>
> The Jackrabbit restructuring I did yesterday left some new Jackrabbit components (jackrabbit-api, etc.) without Maven 1 builds and broke the Maven 1 builds of some other components (most notably jackrabbit-core).
>
> I could fix the broken Maven 1 builds and add the missing ones, but since we're upgrading to Maven 2 in any case and since Maven 1 has been quite troublesome recently (the repository issue I mailed about on Friday seems to have reappeared, and I'm getting no quick help from [EMAIL PROTECTED]), I'd like to propose simply dropping Maven 1 and using Maven 2 for the main builds in trunk.
>
> Note that there are still quite a few contrib projects with just Maven 1 builds. I think we can upgrade them incrementally as time goes by. To make a mix of Maven 1 and Maven 2 projects work better, I enabled the install-maven-one-repository goal of the maven-one plugin in the Jackrabbit parent POM for Maven 2. With this, every artifact installed in the local Maven 2 repository is automatically installed in the local Maven 1 repository as well.

Jukka,

thanks for all the hard work you did over the weekend (and the time to prepare for that before).

Could you please confirm: with the new layout, Jackrabbit builds with Maven 2, except for some contrib components?

Best regards,

Julian
Scalability concerns, Alfresco performance tests
Dear Jackrabbit devs,

we are considering Jackrabbit for a larger CMS project (about 3 million documents, up to 150 concurrent editing users, lots of queries, transactions), a Cocoon-based application. As I understand it, that would certainly require a scalable repository (this still has to be decided).

Now, a news message [1] on TheServerSide about benchmarks provided by Alfresco to prove the superiority of their JCR implementation raises some concerns. Since the benchmarks are (going to be) open source, is someone interested in running them on Jackrabbit?

A post in the thread claims that Jackrabbit isn't suited for large-scale scenarios and faces some problems in the transactional handling of some 100,000 nodes (Kev Smith, [2]):

> From what we've seen, Alfresco is comparable to JackRabbit for small case scenarios - but Alfresco is much more scalable [...]

Do you agree with this statement? If yes, are these problems related to the persistence manager abstraction? Is this a known issue, and will it be addressed?

Another paragraph from this post:

> We tried to load up JackRabbit with millions of nodes but always ran into blocker issues after about 2 million or so objects. Also when loading up JackRabbit, the load needed to be carefully performed in small chunks e.g. trying to load in 100,000 nodes at a time would cause PermGenSpace errors (even with a HUGE permgenspace!) and potentially place the repo into a non-recoverable state.

I'm not sure if this will really be an issue for our usage scenario (except maybe when restoring backups), but I'm very interested in your opinions.

Thanks a lot in advance!

[1] http://www.theserverside.com/news/thread.tss?thread_id=43282
[2] http://www.theserverside.com/news/thread.tss?thread_id=43282#223061

-- Andreas
Re: Jackrabbit and Maven
Hi,

On 12/4/06, Julian Reschke [EMAIL PROTECTED] wrote:
> Could you please confirm: with the new layout, Jackrabbit builds with Maven 2, except for some contrib components?

Yes. The easiest way to build and package all the components (excluding contrib) is:

    $ svn checkout https://svn.apache.org/repos/asf/jackrabbit/trunk jackrabbit
    $ cd jackrabbit
    $ mvn install
    $ (cd jackrabbit-jcr-rmi; mvn install)
    $ (cd jackrabbit-webapp; mvn install)

This is achieved by the multimodule settings in the parent POM. There is some issue with running rmic through the antrun plugin when a component is part of a multimodule build, so for now the jackrabbit-jcr-rmi and jackrabbit-webapp components need to be built individually.

After the initial build above you have all the SNAPSHOT dependencies in your local Maven repository, and you can easily use all the Maven 2 build commands within the individual component projects as well.

I'm looking at setting up nightly builds (once INFRA-1008 is resolved) so that recent snapshots are always available in the Apache snapshot repository. Then it will be possible to easily grab and build just a single component project.

BR,

Jukka Zitting
[jira] Created: (JCR-660) SQL Parser fails with SQL 92 timestamp format
SQL Parser fails with SQL 92 timestamp format
---------------------------------------------

Key: JCR-660
URL: http://issues.apache.org/jira/browse/JCR-660
Project: Jackrabbit
Issue Type: Improvement
Affects Versions: 0.9, 1.0, 1.0.1, 1.1, 1.1.1
Reporter: Marcel Reutegger
Assigned To: Marcel Reutegger
Priority: Minor
Fix For: 1.2

The SQL query parser fails with an exception if the SQL 92 timestamp format is used. E.g.:

    ... WHERE my:date TIMESTAMP '1976-01-01 00:00:00.000+01:00'

does not work, but the following will succeed using ISO8601:

    ... WHERE my:date TIMESTAMP '1976-01-01T00:00:00.000+01:00'

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
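Until the parser accepts both forms, a caller could normalize the literal before building the query string. The following is only a minimal sketch of that client-side workaround, not the fix tracked in JCR-660; the class and method names are hypothetical, and it uses the java.time API, which is much newer than the Jackrabbit versions discussed here.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Jackrabbit.
class Sql92Timestamp {
    // SQL 92 style: date and time separated by a space.
    private static final DateTimeFormatter SQL92 =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSXXX");

    // ISO8601 style: date and time separated by 'T'.
    private static final DateTimeFormatter ISO8601 =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSXXX");

    // Parses a SQL 92 timestamp literal and re-renders it in ISO8601 form.
    static String toIso8601(String sql92Literal) {
        return OffsetDateTime.parse(sql92Literal, SQL92).format(ISO8601);
    }

    public static void main(String[] args) {
        System.out.println(toIso8601("1976-01-01 00:00:00.000+01:00"));
    }
}
```

Parsing and re-formatting, rather than replacing the space with a 'T', also rejects literals that are not valid timestamps in the first place.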
[jira] Created: (JCR-661) RMIC not working in subprojects when compiling parent using maven2
RMIC not working in subprojects when compiling parent using maven2
-------------------------------------------------------------------

Key: JCR-661
URL: http://issues.apache.org/jira/browse/JCR-661
Project: Jackrabbit
Issue Type: Bug
Components: config
Affects Versions: 1.2
Reporter: Jan Kuzniak

This happens because of a bug: a child build that uses the ant plugin inherits the plugin dependencies from the first place where the plugin is declared. The workaround is to declare the antrun plugin in the top-level POM and add the java jar to its plugin dependencies.

(reference: http://mail-archives.apache.org/mod_mbox/maven-users/200602.mbox/[EMAIL PROTECTED])
[jira] Resolved: (JCR-661) RMIC not working in subprojects when compiling parent using maven2
[ http://issues.apache.org/jira/browse/JCR-661?page=all ]

Jukka Zitting resolved JCR-661.
-------------------------------

Resolution: Fixed

Excellent, thanks! That works fine, committed in revision 482149.

> RMIC not working in subprojects when compiling parent using maven2
> -------------------------------------------------------------------
>
> Key: JCR-661
> URL: http://issues.apache.org/jira/browse/JCR-661
> Project: Jackrabbit
> Issue Type: Bug
> Components: maven
> Reporter: Jan Kuzniak
> Assigned To: Jukka Zitting
> Priority: Minor
> Fix For: 1.2
> Attachments: pom-rmi-patch.patch
Re: Jackrabbit and Maven
Hi,

On 12/4/06, Jukka Zitting [EMAIL PROTECTED] wrote:
> There is some issue with running rmic through the antrun plugin when a component is a part of a multimodule build, so for now the jackrabbit-jcr-rmi and jackrabbit-webapp components need to be individually built.

See JCR-661, where Jan Kuzniak already solved this issue! Now the sequence to check out and build *all* the Jackrabbit release components is simply:

    $ svn checkout http://svn.apache.org/repos/asf/jackrabbit/trunk jackrabbit
    $ cd jackrabbit
    $ mvn install

This builds and packages all the components in the correct order, installs them in the local Maven 2 and Maven 1 repositories, and even outputs a nice summary at the end.

There are still some things (like checkstyle integration) missing, but overall things work even nicer than I had hoped.

BR,

Jukka Zitting
Re: Jackrabbit and Maven
Hi,

On 12/4/06, Jan Kuźniak [EMAIL PROTECTED] wrote:
> On 12/4/06, Jukka Zitting [EMAIL PROTECTED] wrote:
>> There are still some things (like checkstyle integration) missing, but overall things work even nicer than I had hoped.
>
> You say checkstyle - you have checkstyle.

Thanks!

> But first I have a question about the internals of checkstyle.xml. I would love to establish an Eclipse code formatter profile and start cleaning up the code, because it looks awful and inconsistent here and there.

You are right, the current codebase does break a number of syntax guidelines, even the ones encoded in the checkstyle.xml profile. There's a meta-issue, JCR-97, for improving this, but there hasn't been much coordinated effort to improve things other than for new code that gets written.

> I don't quite understand why the max line length is set to 132 instead of 80. That is more than half again as long and makes the code harder to read, especially on smaller screens. Also, when indentation makes it hard to fit a line in 80 characters, that is a good reason to extract a method or variable instead of relaxing the line length constraint.

I don't know the rationale. I generally try to keep lines below 80 chars in any case, so at least I wouldn't mind making the guideline more strict.

BR,

Jukka Zitting
[jira] Resolved: (JCR-619) CacheManager (Memory Management in Jackrabbit)
[ http://issues.apache.org/jira/browse/JCR-619?page=all ]

Stefan Guggisberg resolved JCR-619.
-----------------------------------

Resolution: Fixed

Applied patch cacheManager7.txt (svn r481196). xiaohua confirmed that it solved the latest deadlock issue.

> CacheManager (Memory Management in Jackrabbit)
> ----------------------------------------------
>
> Key: JCR-619
> URL: http://issues.apache.org/jira/browse/JCR-619
> Project: Jackrabbit
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.1
> Reporter: Thomas Mueller
> Assigned To: Stefan Guggisberg
> Fix For: 1.2
> Attachments: cacheManager.txt, cacheManager2.txt, cacheManager5.txt, cacheManager6.txt, cacheManager7.txt, stack.txt
>
> Jackrabbit can run out of memory because the combined size of the various caches is not managed. The biggest problem (for me) is the combined size of the o.a.j.core.state.MLRUItemStateCache caches. Each session seems to create a few (?) of those caches, and each one is limited to 4 MB by default.
> I have implemented a dynamic (cache-) memory management service that distributes a fixed amount of memory dynamically to all those caches.
> Here is the patch
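The idea of sharing one fixed memory budget among many caches can be illustrated with a small sketch. This is not the code from cacheManager7.txt: the class name, the method names, and the equal-share policy are all assumptions made for illustration, and the actual patch may well weight caches differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the Jackrabbit CacheManager.
class CacheBudget {
    private final long totalBytes;
    // Current size limit per registered cache, keyed by a cache name.
    private final Map<String, Long> limits = new LinkedHashMap<>();

    CacheBudget(long totalBytes) {
        this.totalBytes = totalBytes;
    }

    // A new cache joins: all limits shrink so the total stays fixed.
    void register(String cacheName) {
        limits.put(cacheName, 0L);
        rebalance();
    }

    // A cache goes away: the remaining caches may grow again.
    void unregister(String cacheName) {
        limits.remove(cacheName);
        rebalance();
    }

    // Returns the current limit, or 0 for an unknown cache.
    long limitOf(String cacheName) {
        return limits.getOrDefault(cacheName, 0L);
    }

    // Assumed policy: every registered cache gets an equal share.
    private void rebalance() {
        if (limits.isEmpty()) {
            return;
        }
        long share = totalBytes / limits.size();
        limits.replaceAll((name, oldLimit) -> share);
    }
}
```

With a per-cache default limit, memory use grows with the number of sessions; with a shared budget like this, the sum of all limits never exceeds the configured total.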
Re: Scalability concerns, Alfresco performance tests
Hi Andreas,

> Now, a news message [1] on TheServerSide about benchmarks provided by Alfresco to prove the superiority

ermhh, let's say "state", not "prove" ;)

> ...of their JCR implementation raises some concerns.

I guess that this may exactly have been the intention ;) Also, the term "JCR implementation" may not be technically accurate; maybe someone could point me to an updated version of this: http://wiki.alfresco.com/w/index.php?title=JSR-170_Compliance

> A post in the thread claims that Jackrabbit isn't suited for large-scale scenarios and faces some problems in the transactional handling of some 100,000 nodes (Kev Smith, [2]):

While Kev possibly has reasons to believe that, I don't. (Unless he talks about some 100k nodes in a single transaction and a given memory size.)

> > From what we've seen, Alfresco is comparable to JackRabbit for small case scenarios - but Alfresco is much more scalable [...]
>
> Do you agree with this statement? If yes, are these problems related to the persistence manager abstraction? Is this a known issue, and will it be addressed?

I do not even remotely agree with this statement. Jackrabbit has been built to scale freely in size. I have a hard time understanding this argument, since both Jackrabbit and Alfresco can use the same RDBMS as the persistence layer, so at least on the persistence layer there should not be a substantial difference. Thoughts?

> > We tried to load up JackRabbit with millions of nodes but always ran into blocker issues after about 2 million or so objects. Also when loading up JackRabbit, the load needed to be carefully performed in small chunks e.g. trying to load in 100,000 nodes at a time would cause PermGenSpace errors (even with a HUGE permgenspace!) and potentially place the repo into a non-recoverable state.
>
> I'm not sure if this will really be an issue for our usage scenario (except maybe when restoring backups), but I'm very interested in your opinions.

That's true, the size of the non-binary portions of a commit is currently memory constrained.

Backup/Restore operations in my experience usually happen on the persistence layer, which means that a restore operation (obviously) does not go through the normal user API. I actually would go as far as stating that it would be close to abuse of the API to go through the transient layer to restore an entire content repository.

We are currently working on a solution for that, but since nobody had a pressing need, it had a relatively low priority. If this is a pressing issue for your project, feel free to file a JIRA issue.

regards,
david
Re: Scalability concerns, Alfresco performance tests
David, thanks for the clarification!

David Nuescheler wrote:
> Hi Andreas,
>
>> Now, a news message [1] on TheServerSide about benchmarks provided by Alfresco to prove the superiority
>
> ermhh, let's say "state", not "prove" ;)

Agreed, my wording was quite provocative; this was not intended :)

[...]

>>> From what we've seen, Alfresco is comparable to JackRabbit for small case scenarios - but Alfresco is much more scalable [...]
>>
>> Do you agree with this statement? If yes, are these problems related to the persistence manager abstraction? Is this a known issue, and will it be addressed?
>
> I do not even remotely agree with this statement. Jackrabbit has been built to scale freely in size.

That's good to know. In your answer on TheServerSide, you said that scalability is mainly a matter of choosing and configuring the persistence layer correctly. Are there any scenario recommendations / best practices available? I'll check out the website again, but insider knowledge is as always greatly appreciated.

[...]

> Backup/Restore operations in my experience usually happen on the persistence layer, which means that restore operation (obviously) does not go through the normal user API.

How would a transactional replication be implemented (e.g. from an authoring system to a live system in a DMZ)? If a lot of documents are involved, for instance after a URL change that affects a lot of links, this could probably lead to such a massive transaction. Should this be implemented by accessing the persistence layer directly? IIUC this would have the drawback that the JCR implementation couldn't be replaced without changing the replication code ...

> I actually would go as far as stating that it would be close to abuse of the API to go through the transient layer to restore an entire content repository. We are currently working on a solution for that, but since nobody had a pressing need, it had a relatively low priority. If this is a pressing issue for your project

I hope it won't be :)

Thanks a lot,

-- Andreas
[jira] Created: (JCR-662) RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene indexing
RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene indexing
-----------------------------------------------------------------------------------

Key: JCR-662
URL: http://issues.apache.org/jira/browse/JCR-662
Project: Jackrabbit
Issue Type: Bug
Components: indexing
Affects Versions: 1.0.1
Reporter: Anthony Ogier

There is a problem in POI that makes the Lucene indexer (which calls the Jackrabbit MsExcelTextFilter when it is defined in the correct XML file) crash. In line 85 of MsExcelTextFilter.java, the statement

    HSSFWorkbook workbook = new HSSFWorkbook(fs);

can sometimes throw a RecordFormatException, which extends *RuntimeException*!

So I think it would be good to wrap this line in a try/catch for that exception and throw an IOException instead, so that the calling classes can react correctly.

See the POI bug: http://issues.apache.org/bugzilla/show_bug.cgi?id=29982
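The try/catch proposed in the report can be sketched as follows. To keep the example self-contained, it does not use POI: the nested RecordFormatException and the openWorkbook method are hypothetical stand-ins for POI's exception and for the `new HSSFWorkbook(fs)` call, and the method body is an assumption about the shape of the fix, not the change that was actually committed.

```java
import java.io.IOException;

// Hypothetical illustration; the real filter lives in Jackrabbit and uses POI.
class RecordFormatWrapping {
    // Stand-in for POI's unchecked RecordFormatException.
    static class RecordFormatException extends RuntimeException {
        RecordFormatException(String message) {
            super(message);
        }
    }

    // Stand-in for "new HSSFWorkbook(fs)": fails on a corrupt document.
    static String openWorkbook(boolean corrupt) {
        if (corrupt) {
            throw new RecordFormatException("Unable to construct record instance");
        }
        return "workbook";
    }

    // Converts the unchecked parser failure into the checked IOException
    // that callers of the filter are already prepared to handle.
    static String initializeReader(boolean corrupt) throws IOException {
        try {
            return openWorkbook(corrupt);
        } catch (RecordFormatException e) {
            // Keep the original exception as the cause for diagnostics.
            throw new IOException("Unable to parse Excel document: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            initializeReader(true);
        } catch (IOException e) {
            System.out.println("Indexing can skip this document: " + e.getMessage());
        }
    }
}
```

Catching only the specific parser exception, rather than every RuntimeException, leaves genuine programming errors visible to the caller.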
[jira] Resolved: (JCR-662) RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene indexing
[ http://issues.apache.org/jira/browse/JCR-662?page=all ]

Jukka Zitting resolved JCR-662.
-------------------------------

Resolution: Duplicate
Assignee: Jukka Zitting

This seems to be a duplicate of JCR-574. The fix is included in the 1.1.1 release that I'm going to announce tonight. You can already access the official release packages at http://www.apache.org/dyn/closer.cgi/jackrabbit/.

> RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene indexing
> -----------------------------------------------------------------------------------
>
> Key: JCR-662
> URL: http://issues.apache.org/jira/browse/JCR-662
> Project: Jackrabbit
> Issue Type: Bug
> Components: indexing
> Affects Versions: 1.0.1
> Reporter: Anthony Ogier
> Assigned To: Jukka Zitting
[ANNOUNCE] Apache Jackrabbit 1.1.1 released
The Apache Jackrabbit community is pleased to announce the release of Apache Jackrabbit version 1.1.1. The release is available for download at:

    http://jackrabbit.apache.org/downloads.cgi

Release Notes -- Apache Jackrabbit -- Version 1.1.1

Introduction
------------

The Apache Jackrabbit project is an effort to build and maintain an open source implementation of the Content Repository for Java Technology API (JCR) specified in the Java Specification Request 170 (JSR-170). The project also produces various tools and components related to the JCR API.

Apache Jackrabbit 1.1.1 is a patch release that fixes a number of issues; see the included change history for details. No new features or configuration changes have been introduced since the 1.1 release.

See the Apache Jackrabbit website at http://jackrabbit.apache.org/ for more information.

Release Contents
----------------

The main contents of this release are the Apache Jackrabbit core content repository implementation and the related general-purpose JCR utilities:

    jackrabbit-core-1.1.1-src.jar
    jackrabbit-core-1.1.1.jar
    jackrabbit-jcr-commons-1.1.1.jar

This release also contains additional components that offer extra functionality for use with either Apache Jackrabbit core or any JCR compliant content repository. These modules should be considered beta quality:

* RMI network layer for the JCR API.

    jackrabbit-jcr-rmi-1.1.1-src.jar
    jackrabbit-jcr-rmi-1.1.1.jar

* Deployable Jackrabbit installation with WebDAV support for JCR.

    jackrabbit-jcr-server-1.1.1-src.jar
    jackrabbit-jcr-webdav-1.1.1.jar
    jackrabbit-jcr-client-1.1.1.jar
    jackrabbit-jcr-server-1.1.1.jar
    jackrabbit-server-1.1.1.war

* J2EE Connector Architecture (JCA) resource adapter for Jackrabbit.

    jackrabbit-jca-1.1.1-src.jar
    jackrabbit-jca-1.1.1.rar

* Text indexing filters for Jackrabbit. Includes example filters for Adobe PDF and MS Excel, PowerPoint, and Word.

    jackrabbit-index-filters-1.1.1-src.jar
    jackrabbit-index-filters-1.1.1.jar

All components are released as a source jar file and one or more compiled binary files. All files contain a README.txt file with more information. Note that external runtime dependencies are only included for the war and rar archives. Other dependencies can be downloaded either manually or automatically using the Maven build system.

Each release file is accompanied by SHA1 and MD5 checksums and a PGP signature. The public key used for the signatures can be found in the KEYS file located in the parent directory.

Upgrading from 1.0
------------------

Apache Jackrabbit 1.1.1 is fully compatible with the 1.0 release. An Apache Jackrabbit 1.0 installation can be upgraded by replacing the relevant jar files with the new versions. No changes to repository contents are needed.

Change History
--------------

Changes since 1.1:

* [JCR-67] - Node.canAddMixin(String)
* [JCR-550] - OutOfMemoryError when re-indexing the repository
* [JCR-562] - 'OR' in XPath query badly interpreted
* [JCR-563] - encode/decode
* [JCR-574] - MsExcelTextFilter throws Exception. Repository is not
* [JCR-586] - Removing a mixin that adds a same-name-sibling child node
* [JCR-587] - XMLTextFilter does not extract text elements
* [JCR-594] - It's not possible to register event listeners that filters
* [JCR-598] - DateValue.equals() relies on Calendar.equals()
* [JCR-600] - Repository does not release all resources on shutdown
* [JCR-602] - importXML still depends on Xerces
* [JCR-603] - OracleFileSystem can't handle empty files
* [JCR-605] - Error when registering node types on virgin repository
* [JCR-606] - RMI-DateValue does not support full ISO8601 format
* [JCR-620] - Workspace.getImportHandler() doesn't handle namespace
* [JCR-624] - OutOfMemoryError When repeat login and the logout many times
* [JCR-628] - OutOfMemory problem: HandleMonitor does not release closed
* [JCR-629] - CompactNodeTypeDefWriter does not escaped names properly
* [JCR-636] - Local AuthContext authenticates if LoginModule should be
* [JCR-637] - Multiple namespace definitions in CND prevent definition of
* [JCR-646] - Misleading exception message for jcr:deref()
* [JCR-649] - Like expression does not match line terminator in String

See the issue tracker at http://issues.apache.org/jira/browse/JCR for issue details and the full change histories of all Apache Jackrabbit versions.

Known Issues
------------

The known issues in this release are listed below:

* [JCR-18] - Multithreading issue with versioning
* [JCR-43] - Restore on node creates same-name-sibling of OPV-Version
* [JCR-320] - BinaryValue equals fails for two objects with two different
* [JCR-385] - ClassCastExeption when executing union queries
* [JCR-392] - Accessing element by number does not work
* [JCR-406] - If header evaluation compliance provlems
* [JCR-435] - Node.update() does not work