[jira] Created: (JCR-550) ObservationManagerFactory) - OutOfMemoryError when re-indexing the repository

ObservationManagerFactory) - OutOfMemoryError when re-indexing the repository
------------------------------------------------------------------------------

Key: JCR-550
URL: http://issues.apache.org/jira/browse/JCR-550
Project: Jackrabbit
Issue Type: Bug
Components: indexing
Affects Versions: 1.0.1
Environment:
  tomcat 5.0 [256 up to 512 mb of ram]
  jackrabbit 1.0.1
  jdk 1.4.2_12
  Intel Xeon 3.2GHz with 2Gb of memory
  poi-3.0-alpha2-20060616.jar
  poi-contrib-3.0-alpha2-20060616.jar
  poi-scratchpad-3.0-alpha2-20060616.jar
  jackrabbit-core-1.0.1.jar
  jackrabbit-index-filters-1.0.1.jar
  jackrabbit-jcr-commons-1.0.1.jar
  jcr-1.0.jar
  tm-extractors-0.4.jar
  lucene-1.4.3.jar
Reporter: Christian Zanata
Attachments: log_files.zip

[ERROR] 20060825 17:06:40 (org.apache.jackrabbit.core.observation.ObservationManagerFactory) - Synchronous EventConsumer threw exception.
java.lang.OutOfMemoryError

We get this error when we try to re-index a repository. The repository is quite big (more than 4 GB of disk usage) and sometimes stores documents of 40 MB. I have attached the most recent logs we recorded, with the full stack traces.

Related to this, we also have errors with Lucene:

[DEBUG] 20060803 08:24:01 (org.apache.jackrabbit.core.query.LazyReader) - Dump:
java.io.IOException: Invalid header signature; read 8656037701166316554, expected -2226271756974174256
    at org.apache.jackrabbit.core.query.MsWordTextFilter

and then these:

[DEBUG] 20060803 08:37:17 (org.apache.jackrabbit.core.ItemManager) - removing item 8637bf5f-4689-4e75-888f-b7b89bef40c8 from cache
[ WARN] 20060803 08:40:13 (org.apache.jackrabbit.core.RepositoryImpl) - Existing lock file at C:\Wave\Repository\.lock deteteced. Repository was not shut down properly.
[ERROR] 20060803 09:33:14 (org.apache.jackrabbit.core.observation.ObservationManagerFactory) - Synchronous EventConsumer threw exception.
java.lang.NullPointerException: null values not allowed

This is our repository.xml configuration for indexing:

  <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="textFilterClasses"
           value="org.apache.jackrabbit.core.query.lucene.TextPlainTextFilter,
                  org.apache.jackrabbit.core.query.MsExcelTextFilter,
                  org.apache.jackrabbit.core.query.MsPowerPointTextFilter,
                  org.apache.jackrabbit.core.query.MsWordTextFilter,
                  org.apache.jackrabbit.core.query.PdfTextFilter,
                  org.apache.jackrabbit.core.query.HTMLTextFilter,
                  org.apache.jackrabbit.core.query.XMLTextFilter,
                  org.apache.jackrabbit.core.query.RTFTextFilter,
                  org.apache.jackrabbit.core.query.OpenOfficeTextFilter"/>
    <param name="useCompoundFile" value="true"/>
    <param name="minMergeDocs" value="100"/>
    <param name="volatileIdleTime" value="3"/>
    <param name="maxMergeDocs" value="10"/>
    <param name="mergeFactor" value="10"/>
    <param name="bufferSize" value="10"/>
    <param name="cacheSize" value="1000"/>
    <param name="forceConsistencyCheck" value="false"/>
    <param name="autoRepair" value="true"/>
    <param name="respectDocumentOrder" value="false"/>
    <param name="analyzer" value="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
  </SearchIndex>

--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see: http://www.atlassian.com/software/jira
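The two numbers in the "Invalid header signature" message can be decoded to see what the rejected file actually starts with. The sketch below assumes the signature is an 8-byte value read in little-endian order; that is an assumption about the parser behind MsWordTextFilter, not something the log states, and the class and method names are made up for illustration. Under that assumption the expected value corresponds to the OLE2 magic bytes D0 CF 11 E0 A1 B1 1A E1, and the value that was read can be printed as text to get a hint about the real file type.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jackrabbit or POI.
final class HeaderSignature {

    // Turns the logged long back into the eight bytes it was built from,
    // assuming little-endian byte order.
    static byte[] toLittleEndianBytes(long value) {
        return ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(value).array();
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Interprets each byte as one ISO-8859-1 character.
    static String ascii(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        long expected = -2226271756974174256L; // "expected" value from the log
        long read = 8656037701166316554L;      // "read" value from the log
        System.out.println("expected bytes: " + hex(toLittleEndianBytes(expected)));
        System.out.println("read bytes:     " + hex(toLittleEndianBytes(read)));
        System.out.println("read as text:   " + ascii(toLittleEndianBytes(read)));
    }
}
```

For the value in the log above, the decoded text is a line break followed by `<html x`. If the little-endian assumption holds, the document is probably HTML stored under a Word file name, which would explain why the filter rejects it.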
Re: problem retrieving nodes from different workspaces
Were you able to reproduce our problem?

Jukka Zitting-3 wrote:
> Hi,
>
> On 8/28/06, J Kuijpers [EMAIL PROTECTED] wrote:
>> Supplied repository.xml and runnable MultipleWorkspaceTest.java
>> http://www.nabble.com/user-files/235783/repository.xml repository.xml
>> http://www.nabble.com/user-files/235784/MultipleWorkspaceTest.java MultipleWorkspaceTest.java
>
> The MultipleWorkspaceTest.java file appears to be empty. Could you
> resend it, inline if necessary?
>
> BR,
> Jukka Zitting
>
> --
> Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
> Software craftsmanship, JCR consulting, and Java development

--
View this message in context: http://www.nabble.com/problem-retrieving-nodes-from-different-workspaces-tf2177041.html#a6037018
Sent from the Jackrabbit - Dev forum at Nabble.com.
Re: problem retrieving nodes from different workspaces
Your repository.xml file is broken. You have:

  <PersistenceManager class="org.apache.jackrabbit.core.state.db.DerbyPersistenceManager">
    <param name="url" value="jdbc:derby:${rep.home}/version/db;create=true"/>
    <param name="schemaObjectPrefix" value="version_"/>
  </PersistenceManager>

A fixed value for the parameter 'schemaObjectPrefix' will cause Jackrabbit to write content of multiple workspaces into the same table, thus possibly overwriting content. You must use a value that includes the workspace name as a variable. E.g. the sample configuration uses this:

  <param name="schemaObjectPrefix" value="${wsp.name}_"/>

See also: https://svn.apache.org/repos/asf/jackrabbit/trunk/jackrabbit/src/main/config/repository.xml

Using the sample repository.xml the test works fine even with a shutdown in between.

regards
 marcel

--
Marcel Reutegger
Day Management AG
Barfuesserplatz 6, 4001 Basel
Switzerland
[EMAIL PROTECTED]
www.day.com
T 41 61 226 98 98
F 41 61 226 98 97
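The effect of a fixed versus a variable 'schemaObjectPrefix' can be sketched as follows. This is a simplified illustration, not Jackrabbit's actual variable substitution: the class name, the `tableName` helper, the base table name `NODE`, and the workspace names `default` and `second` are all assumptions chosen for the example.

```java
// Hypothetical demo class, not part of Jackrabbit.
final class SchemaPrefixDemo {

    // Expands the ${wsp.name} variable in the prefix and appends a base table name.
    // String.replace performs a literal (non-regex) substitution.
    static String tableName(String prefixTemplate, String workspaceName, String baseTable) {
        return prefixTemplate.replace("${wsp.name}", workspaceName) + baseTable;
    }

    public static void main(String[] args) {
        String[] workspaces = {"default", "second"}; // assumed workspace names
        for (String ws : workspaces) {
            System.out.println(ws
                    + ": fixed prefix -> " + tableName("version_", ws, "NODE")
                    + ", variable prefix -> " + tableName("${wsp.name}_", ws, "NODE"));
        }
    }
}
```

With the fixed prefix both workspaces resolve to the same table name, which is the collision described above; with the variable prefix each workspace gets its own.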
Re: ItemNotFoundException while switching between workspaces
quipere wrote:
> See http://www.nabble.com/problem-retrieving-nodes-from-different-workspaces-tf2177041.html
> It is about the same problem; it doesn't throw ItemNotFoundException but returns unexpected nodes. I assume this is because the example code lacks an orderable node type definition.

Can you please check your repository.xml and see if there is the same configuration issue as with the other 'workspace test'?

regards
 marcel
[jira] Assigned: (JCR-550) ObservationManagerFactory) - OutOfMemoryError when re-indexing the repository

[ http://issues.apache.org/jira/browse/JCR-550?page=all ]

Marcel Reutegger reassigned JCR-550:

Assignee: Marcel Reutegger
[jira] Commented: (JCR-550) ObservationManagerFactory) - OutOfMemoryError when re-indexing the repository

[ http://issues.apache.org/jira/browse/JCR-550?page=comments#action_12431236 ]

Marcel Reutegger commented on JCR-550:

Your log files seem to indicate that some of your content is corrupt:

Caused by: java.lang.IllegalArgumentException: invalid QName literal
    at org.apache.jackrabbit.name.QName.valueOf(QName.java:618)
    at org.apache.jackrabbit.core.state.util.Serializer.deserialize(Serializer.java:124)
    at org.apache.jackrabbit.core.state.obj.ObjectPersistenceManager.load(ObjectPersistenceManager.java:206)
    ... 61 more

Please note that using the ObjectPersistenceManager on a production system is not recommended because it is not transactional. You should consider using DerbyPersistenceManager as your version storage.
[jira] Commented: (JCR-482) DocViewSaxEventGenerator may generate non-NS-wellformed XML
[ http://issues.apache.org/jira/browse/JCR-482?page=comments#action_12431254 ]

Julian Reschke commented on JCR-482:

Related to this, ExportDocViewTest.compareNamespaces() makes the assumption that *all* registered namespaces need to be serialized in the root element (and refers to 6.4.2.1 as justification). However, 6.4.2.1 only talks about the relevant declarations.

In any case, both the requirement in the spec and the test case should be relaxed to permit any serialization that produces a valid XML document: it should be left to the implementation when and where to include namespace declarations, as long as the resulting document is namespace-well-formed.

DocViewSaxEventGenerator may generate non-NS-wellformed XML
-----------------------------------------------------------

Key: JCR-482
URL: http://issues.apache.org/jira/browse/JCR-482
Project: Jackrabbit
Issue Type: Bug
Components: xml
Affects Versions: 0.9, 1.0, 1.0.1
Environment: n/a
Reporter: Julian Reschke
Assigned To: Jukka Zitting
Priority: Minor
Fix For: 1.1
Attachments: JIRA-482.diff.txt

The XML serialization code relies on the fact that all required prefix-to-uri mappings are known beforehand (actually, when serializing the root node). So there's an assumption that the permanent namespace registry will never change during serialization, which may be incorrect when another client adds namespace registrations while the XML export is in progress.

To fix this, addNamespacePrefixes should ensure that namespace declarations have been written for all prefixes used on the current node (node name + properties), potentially going back to the namespace resolver when needed.

(Should there be consensus for that change I'm happy to give it a try)
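The fix proposed for addNamespacePrefixes (declare, per node, whatever prefixes are used but not yet declared) can be sketched roughly as below. This is not Jackrabbit's code: the class `NamespaceDeclarations`, the method `missingDeclarations`, and the use of a plain `Map` as a stand-in for the namespace resolver are all hypothetical, and the sketch ignores details such as prefix remapping.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Jackrabbit.
final class NamespaceDeclarations {

    // Returns the prefix-to-URI declarations that still have to be written on the
    // current element: every prefix used by the node name or a property name that
    // has not been declared on an ancestor element.
    static Map<String, String> missingDeclarations(Set<String> alreadyDeclared,
                                                   List<String> namesOnNode,
                                                   Map<String, String> resolver) {
        Map<String, String> missing = new LinkedHashMap<String, String>();
        for (String name : namesOnNode) {
            int colon = name.indexOf(':');
            if (colon <= 0) {
                continue; // unprefixed name, nothing to declare
            }
            String prefix = name.substring(0, colon);
            if (alreadyDeclared.contains(prefix) || missing.containsKey(prefix)) {
                continue; // already in scope or already collected
            }
            // Go back to the resolver at the time the node is serialized, so that
            // namespaces registered after the export started are still found.
            String uri = resolver.get(prefix);
            if (uri == null) {
                throw new IllegalStateException("No namespace registered for prefix: " + prefix);
            }
            missing.put(prefix, uri);
        }
        return missing;
    }
}
```

Looking prefixes up per node, rather than once at the root, is what removes the dependency on the registry staying unchanged during the export.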
Backup Tool Refactored
Hi,

The GSOC is over and the backup tool needs a little bit of refactoring before being committed (see past threads). Here are the changes I plan to implement:

- Add methods to import/export the node version histories in VersionManager and implement them in its classes.
- Subclass PropInfo to avoid writing a custom method in the original class.
- Fix trailing spaces, comments, and Checkstyle issues in all new classes.
- Check everything and send the patches.

The changes in the core would be limited to a new class NodeVersionHistoriesUpdatableStateManager in org.apache.jackrabbit.core.state and to the VersionManager and its various implementing classes. Those changes would be filed in a new JIRA issue as discussed previously (for administrative reasons with Google).

BR,
Nico
my blog! http://www.deviant-abstraction.net !!