Re: HEADS UP: Jackrabbit restructuring ahead

2006-12-04 Thread Jukka Zitting

Hi,

On 12/4/06, wendy Lee [EMAIL PROTECTED] wrote:

I checked out jackrabbit-api, jackrabbit-jcr-commons, jackrabbit-core and
jackrabbit-jcr-tests separately and ran mvn install in each directory.
Finally, running mvn install in the jackrabbit-core directory fails:
[...]
Reason: Unable to download the artifact from any repository

  org.apache.jackrabbit:jackrabbit:pom:1.2-SNAPSHOT


Ah, that's because the build doesn't find the Jackrabbit parent POM.

I suggest checking out the entire jackrabbit/trunk directory; the
component projects will then automatically find the parent POM in
../pom.xml.

Alternatively, you can copy the latest POM snapshot from
http://people.apache.org/repo/m2-snapshot-repository/org/apache/jackrabbit/jackrabbit/
into your local Maven 2 repository, or configure Maven 2 to use
http://people.apache.org/repo/m2-snapshot-repository/ directly.

Thanks for pointing this out, I'll see if I can make the instructions
clearer on this.

BR,

Jukka Zitting


Re: Jackrabbit and Maven

2006-12-04 Thread Julian Reschke

Jukka Zitting schrieb:

Hi,

The Jackrabbit restructuring I did yesterday rendered some new
Jackrabbit components (jackrabbit-api, etc.) without Maven 1 builds
and broke the Maven 1 builds of some other components (most notably
jackrabbit-core).

I could fix the broken and add the missing Maven 1 builds, but since
we're upgrading to Maven 2 in any case and since Maven 1 has been
quite troublesome recently (the repository issue I mailed about on
Friday seems to have reappeared, and I'm getting no quick help from
[EMAIL PROTECTED]), I'd like to propose simply dropping Maven 1 and
using Maven 2 for the main builds in trunk.

Note that there are still quite a few contrib projects with just Maven
1 builds. I think we can upgrade them incrementally as time goes by.
To make a mix of Maven 1 and Maven 2 projects work better, I enabled
the install-maven-one-repository goal of the maven-one plugin in the
Jackrabbit parent POM for Maven 2. With this, every artifact installed
in the local Maven 2 repository is automatically installed in the
local Maven 1 repository as well.


Jukka,

thanks for all the hard work you did over the weekend (and the time to 
prepare for that before).


Could you please confirm: with the new layout, Jackrabbit builds with 
Maven 2, except for some contrib components?


Best regards, Julian


Scalability concerns, Alfresco performance tests

2006-12-04 Thread Andreas Hartmann
Dear Jackrabbit devs,

we are considering Jackrabbit for a bigger CMS project (about
3 million documents, up to 150 concurrent editing users,
lots of queries, transactions), Cocoon-based application.
As I understand it, that would certainly require a scalable
repository (has to be decided).

Now, a news message [1] on TheServerSide about benchmarks provided
by Alfresco to prove the superiority of their JCR implementation
raises some concerns.

Since the benchmarks are (going to be) open source, is someone
interested in running them on Jackrabbit?

A post in the thread claims that Jackrabbit isn't suited for
large-scale scenarios and faces some problems in the transactional
handling of some 100.000 nodes (Kev Smith, [2]):

From what we've seen, Alfresco is comparable to JackRabbit for small
case scenarios - but Alfresco is much more scalable [...]

Do you agree to this statement? If yes - are these problems related
to the persistence manager abstraction? Is this a known issue, and
will it be addressed?

Another paragraph from this post:

We tried to load up JackRabbit with millions of nodes but always ran
into blocker issues after about 2 million or so objects. Also when
loading up JackRabbit, the load needed to be carefully performed in
small chunks e.g. trying to load in 100,000 nodes at a time would cause
PermGenSpace errors (even with a HUGE permgenspace!) and potentially
place the repo into a non-recoverable state.

I'm not sure if this will really be an issue for our usage
scenario (except maybe from restoring backups), but I'm very
interested in your opinions.

Thanks a lot in advance!

[1] http://www.theserverside.com/news/thread.tss?thread_id=43282
[2] http://www.theserverside.com/news/thread.tss?thread_id=43282#223061



-- Andreas



Re: Jackrabbit and Maven

2006-12-04 Thread Jukka Zitting

Hi,

On 12/4/06, Julian Reschke [EMAIL PROTECTED] wrote:

Could you please confirm: with the new layout, Jackrabbit builds with
Maven 2, except for some contrib components?


Yes. The easiest way to build and package all the components
(excluding contrib) is:

   $ svn checkout https://svn.apache.org/repos/asf/jackrabbit/trunk jackrabbit
   $ cd jackrabbit
   $ mvn install
   $ (cd jackrabbit-jcr-rmi; mvn install)
   $ (cd jackrabbit-webapp; mvn install)

This is achieved by the multimodule settings in the parent POM. There
is some issue with running rmic through the antrun plugin when a
component is a part of a multimodule build, so for now the
jackrabbit-jcr-rmi and jackrabbit-webapp components need to be
individually built.

After the initial build above you have all the SNAPSHOT dependencies
in your local Maven repository and you can use all the Maven 2 build
commands easily also within the individual component projects. I'm
looking at setting up nightly builds (once INFRA-1008 is
resolved) so that recent snapshots are always available in the Apache
snapshot repository. Then it will be possible to easily grab and build
just a single component project.

BR,

Jukka Zitting


[jira] Created: (JCR-660) SQL Parser fails with SQL 92 timestamp format

2006-12-04 Thread Marcel Reutegger (JIRA)
SQL Parser fails with SQL 92 timestamp format
-

 Key: JCR-660
 URL: http://issues.apache.org/jira/browse/JCR-660
 Project: Jackrabbit
  Issue Type: Improvement
Affects Versions: 0.9, 1.0, 1.0.1, 1.1, 1.1.1
Reporter: Marcel Reutegger
 Assigned To: Marcel Reutegger
Priority: Minor
 Fix For: 1.2


The SQL query parser fails with an exception if the SQL 92 timestamp format is 
used.

E.g:
... WHERE my:date  TIMESTAMP '1976-01-01 00:00:00.000+01:00'

does not work, but the following will succeed using ISO8601:

... WHERE my:date  TIMESTAMP '1976-01-01T00:00:00.000+01:00'
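
Until the parser accepts both forms, a client could normalize the literal before building the query. The following is only a minimal sketch of that idea: the class TimestampLiterals and its method are hypothetical and not part of Jackrabbit, and it assumes the only difference between the two formats is the separator between date and time, as in the example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical client-side helper; not part of Jackrabbit. */
final class TimestampLiterals {

    // A date, one or more spaces, then a time: the SQL 92 layout from the report.
    private static final Pattern SQL92 =
            Pattern.compile("^(\\d{4}-\\d{2}-\\d{2}) +(\\d{2}:\\d{2}:\\d{2}.*)$");

    /**
     * Replaces the space between date and time with 'T'.
     * Input that does not look like an SQL 92 timestamp is returned unchanged.
     */
    static String toIso8601(String literal) {
        Matcher m = SQL92.matcher(literal);
        return m.matches() ? m.group(1) + "T" + m.group(2) : literal;
    }

    public static void main(String[] args) {
        System.out.println(toIso8601("1976-01-01 00:00:00.000+01:00"));
    }
}
```

A literal that already uses the ISO8601 separator passes through untouched, so the helper can be applied unconditionally.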

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Created: (JCR-661) RMIC not working in subprojects when compiling parent using maven2

2006-12-04 Thread Jan Kuzniak (JIRA)
RMIC not working in subprojects when compiling parent using maven2
--

 Key: JCR-661
 URL: http://issues.apache.org/jira/browse/JCR-661
 Project: Jackrabbit
  Issue Type: Bug
  Components: config
Affects Versions: 1.2
Reporter: Jan Kuzniak


This happens because of a bug: a child build that uses the ant plugin
inherits the plugin dependencies from the place where the plugin was first
declared.

The workaround is to put the antrun plugin in the toplevel, and add the java 
jar to its plugin dependencies.

(reference:
http://mail-archives.apache.org/mod_mbox/maven-users/200602.mbox/[EMAIL PROTECTED])





[jira] Resolved: (JCR-661) RMIC not working in subprojects when compiling parent using maven2

2006-12-04 Thread Jukka Zitting (JIRA)
 [ http://issues.apache.org/jira/browse/JCR-661?page=all ]

Jukka Zitting resolved JCR-661.
---

Resolution: Fixed

Excellent, thanks! That works fine, committed in revision 482149.

 RMIC not working in subprojects when compiling parent using maven2
 --

 Key: JCR-661
 URL: http://issues.apache.org/jira/browse/JCR-661
 Project: Jackrabbit
  Issue Type: Bug
  Components: maven
Reporter: Jan Kuzniak
 Assigned To: Jukka Zitting
Priority: Minor
 Fix For: 1.2

 Attachments: pom-rmi-patch.patch







Re: Jackrabbit and Maven

2006-12-04 Thread Jukka Zitting

Hi,

On 12/4/06, Jukka Zitting [EMAIL PROTECTED] wrote:

There is some issue with running rmic through the antrun plugin
when a component is a part of a multimodule build, so for now the
jackrabbit-jcr-rmi and jackrabbit-webapp components need to be
individually built.


See JCR-661, where Jan Kuzniak already solved this issue!

Now the sequence to check out and build *all* the Jackrabbit release
components is simply:

   $ svn checkout http://svn.apache.org/repos/asf/jackrabbit/trunk jackrabbit
   $ cd jackrabbit
   $ mvn install

This builds and packages all the components in correct order, installs
them in the local Maven 2 and Maven 1 repositories, and even outputs a
nice summary at the end.

There are still some things (like checkstyle integration) missing, but
overall things work even nicer than I had hoped.

BR,

Jukka Zitting


Re: Jackrabbit and Maven

2006-12-04 Thread Jukka Zitting

Hi,

On 12/4/06, Jan Kuźniak [EMAIL PROTECTED] wrote:

  On 12/4/06, Jukka Zitting [EMAIL PROTECTED] wrote:
 There are still some things (like checkstyle integration) missing, but
 overall things work even nicer than I had hoped.

You say checkstyle - you have checkstyle.


Thanks!


But first I have a question about internals of checkstyle.xml. I would love to
establish an eclipse code formatter profile and start cleaning up the code
because it looks awful and inconsistent here and there.


You are right, the current codebase does break a number of syntax
guidelines, even the ones encoded in the checkstyle.xml profile.
There's a meta-issue JCR-97 for improving this, but there hasn't been
much coordinated effort to improve things other than for new code that
gets written.


I don't quite understand why the max line length is set to 132 instead of 80.
That is more than half as long again and makes the code harder to read,
especially on smaller screens. Also, when indentation makes it hard to fit a
line in 80 characters, that is a good reason to extract a method or variable
instead of relaxing the line constraints.


I don't know the rationale. I generally try to keep lines below 80
chars in any case, so at least I wouldn't mind making the guideline
more strict.

BR,

Jukka Zitting


[jira] Resolved: (JCR-619) CacheManager (Memory Management in Jackrabbit)

2006-12-04 Thread Stefan Guggisberg (JIRA)
 [ http://issues.apache.org/jira/browse/JCR-619?page=all ]

Stefan Guggisberg resolved JCR-619.
---

Resolution: Fixed

applied patch cacheManager7.txt (svn r481196).

xiaohua confirmed that it solved the latest deadlock issue.

 CacheManager (Memory Management in Jackrabbit)
 --

 Key: JCR-619
 URL: http://issues.apache.org/jira/browse/JCR-619
 Project: Jackrabbit
  Issue Type: New Feature
  Components: core
Affects Versions: 1.1
Reporter: Thomas Mueller
 Assigned To: Stefan Guggisberg
 Fix For: 1.2

 Attachments: cacheManager.txt, cacheManager2.txt, cacheManager5.txt, 
 cacheManager6.txt, cacheManager7.txt, stack.txt


 Jackrabbit can run out of memory because the combined size of the various
 caches is not managed. The biggest problem (for me) is the combined size of 
 the o.a.j.core.state.MLRUItemStateCache caches. Each session seems to create 
 a few (?) of those caches, and each one is limited to 4 MB by default.
 I have implemented a dynamic (cache-) memory management service that 
 distributes a fixed amount of memory dynamically to all those caches.
 Here is the patch





Re: Scalability concerns, Alfresco performance tests

2006-12-04 Thread David Nuescheler

Hi Andreas,



Now, a news message [1] on TheServerSide about benchmarks provided
by Alfresco to prove the superiority

ermhh, let's say "state", not "prove" ;)


...of their JCR implementation raises some concerns.

I guess that this may exactly have been the intention ;)

Also, the term JCR implementation may not be technically
accurate, maybe someone could point me to an updated
version of this:
http://wiki.alfresco.com/w/index.php?title=JSR-170_Compliance


A post in the thread claims that Jackrabbit isn't suited for
large-scale scenarios and faces some problems in the transactional
handling of some 100.000 nodes (Kev Smith, [2]):

While Kev possibly has reasons to believe that, I don't.
(Unless he is talking about some 100k nodes in a single transaction
with a given memory size.)


From what we've seen, Alfresco is comparable to JackRabbit for small
case scenarios - but Alfresco is much more scalable [...]
Do you agree to this statement? If yes - are these problems related
to the persistence manager abstraction? Is this a known issue, and
will it be addressed?

I do not even remotely agree with this statement.
Jackrabbit has been built to scale freely in size.

I have a hard time understanding this argument since both Jackrabbit
and Alfresco can use the same RDBMS as the persistence layer, so
at least on the persistence layer there should not be a substantial
difference. Thoughts?


We tried to load up JackRabbit with millions of nodes but always ran
into blocker issues after about 2 million or so objects. Also when
loading up JackRabbit, the load needed to be carefully performed in
small chunks e.g. trying to load in 100,000 nodes at a time would cause
PermGenSpace errors (even with a HUGE permgenspace!) and potentially
place the repo into a non-recoverable state.
I'm not sure if this will really be an issue for our usage
scenario (except maybe from restoring backups), but I'm very
interested in your opinions.

That's true, the size of the non-binary portion of a commit is
currently memory constrained.
Backup/restore operations in my experience usually happen on the
persistence layer, which means that a restore operation (obviously) does
not go through the normal user API. I would actually go as far as stating
that it would be close to abuse of the API to go through the transient layer
to restore an entire content repository.
We are currently working on a solution for that, but since nobody had
a pressing need, it has had a relatively low priority. If this is a pressing
issue for your project, feel free to file a JIRA issue.
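
Since the memory needed for a commit grows with the amount of unsaved content, a bulk load through the API has to be committed in chunks. The sketch below only illustrates that pattern and makes several assumptions: BatchLoader is a hypothetical helper, not Jackrabbit code, and the add and save callbacks stand in for whatever the application does, for example creating a node and calling Session.save() (wrapped by the caller, since that method throws a checked exception).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper that commits a bulk load in fixed-size chunks. */
final class BatchLoader {

    /**
     * Passes each item to add and calls save after every batchSize items,
     * plus once more for a trailing partial batch.
     * Returns the number of times save was called.
     */
    static <T> int loadInBatches(Iterable<T> items, int batchSize,
                                 Consumer<T> add, Runnable save) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        int saves = 0;
        for (T item : items) {
            add.accept(item);
            if (++pending == batchSize) {
                save.run();   // keeps the unsaved state bounded by batchSize
                saves++;
                pending = 0;
            }
        }
        if (pending > 0) {
            save.run();       // commit the remainder
            saves++;
        }
        return saves;
    }

    public static void main(String[] args) {
        List<String> added = new ArrayList<>();
        int saves = loadInBatches(Arrays.asList("a", "b", "c", "d", "e"), 2,
                added::add, () -> { });
        System.out.println(added.size() + " items, " + saves + " saves");
    }
}
```

The price of this pattern is that the load is no longer one atomic operation, so a failure midway leaves the earlier batches committed.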

regards,
david


Re: Scalability concerns, Alfresco performance tests

2006-12-04 Thread Andreas Hartmann
David,

thanks for the clarification!

David Nuescheler schrieb:
 Hi Andreas,
 
 
 Now, a news message [1] on TheServerSide about benchmarks provided
 by Alfresco to prove the superiority
 ermhh let's say state not prove ;)

Agreed, my wording was quite provocative; this was not intended :)

[...]

 From what we've seen, Alfresco is comparable to JackRabbit for small
 case scenarios - but Alfresco is much more scalable [...]
 Do you agree to this statement? If yes - are these problems related
 to the persistence manager abstraction? Is this a known issue, and
 will it be addressed?
 I do not even remotely agree with this statement.
 Jackrabbit has been built to scale freely in size.

That's good to know.

In your answer on TheServerSide, you said that Scalability is mainly
a matter of choosing and configuring the persistence layer correctly.
Are there any scenario recommendations / best practices available?
I'll check out the website again, but insider knowledge is as always
greatly appreciated.

[...]

 Backup/Restore operations in my experience usually happen on the
 persistence layer, which means that restore operation (obviously) does
 not go through the normal user API.

How would a transactional replication be implemented (e.g. from an
authoring system to a live system in a DMZ)? If a lot of documents
are involved, for instance after a URL change that affects a lot
of links, this could well lead to such a massive transaction.
Should this be implemented by accessing the persistence layer directly?
IIUC this would have the drawback that the JCR implementation couldn't
be replaced without changing the replication code ...

 I actually would go as far as stating
 that it would be close to abuse of the API to go through the transient
 layer
 to restore an entire content repository.
 We are currently working on a solution for that, but since nobody had
 a pressing need, it had a relatively low priority. If this is a pressing
 issue for your project 

I hope it won't be :)

Thanks a lot,

-- Andreas



[jira] Created: (JCR-662) RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene indexing

2006-12-04 Thread Anthony Ogier (JIRA)
RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene 
indexing
--

 Key: JCR-662
 URL: http://issues.apache.org/jira/browse/JCR-662
 Project: Jackrabbit
  Issue Type: Bug
  Components: indexing
Affects Versions: 1.0.1
Reporter: Anthony Ogier


There is a problem in POI that makes the Lucene indexer (which calls the
Jackrabbit MsExcelTextFilter when it is defined in the corresponding XML
configuration) crash.
Specifically, line 85 of MsExcelTextFilter.java:
HSSFWorkbook workbook = new HSSFWorkbook(fs);

can sometimes throw a RecordFormatException, which extends *RuntimeException*!
So I think it would be good to wrap this line in a try/catch for that
exception and throw an IOException instead (so that the calling classes can
react correctly).

See the POI bug : http://issues.apache.org/bugzilla/show_bug.cgi?id=29982
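
The proposed change could look roughly like the sketch below. It is not the actual Jackrabbit fix: FilterExceptions and its Parser interface are hypothetical stand-ins so that the example compiles without POI, with parser.parse() playing the role of the HSSFWorkbook constructor call.

```java
import java.io.IOException;

/** Hypothetical illustration of turning a parser's RuntimeException into an IOException. */
final class FilterExceptions {

    /** Stand-in for a parsing step such as constructing the workbook. */
    interface Parser<T> {
        T parse() throws IOException;
    }

    /**
     * Runs the parser and rethrows any RuntimeException as an IOException,
     * keeping the original exception as the cause.
     */
    static <T> T parseOrIOException(Parser<T> parser) throws IOException {
        try {
            return parser.parse();
        } catch (RuntimeException e) {
            throw new IOException("Unparseable document: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            parseOrIOException(() -> {
                throw new IllegalStateException("bad record");
            });
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Callers that already handle IOException then treat a corrupt spreadsheet like any other unreadable file instead of aborting.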





[jira] Resolved: (JCR-662) RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene indexing

2006-12-04 Thread Jukka Zitting (JIRA)
 [ http://issues.apache.org/jira/browse/JCR-662?page=all ]

Jukka Zitting resolved JCR-662.
---

Resolution: Duplicate
  Assignee: Jukka Zitting

This seems to be a duplicate of JCR-574.

The fix is included in the 1.1.1 release that I'm going to announce tonight.  
You can already access the official release packages at 
http://www.apache.org/dyn/closer.cgi/jackrabbit/.

 RecordFormatException in MsExcelTextFilter.initializeReader breaks lucene 
 indexing
 --

 Key: JCR-662
 URL: http://issues.apache.org/jira/browse/JCR-662
 Project: Jackrabbit
  Issue Type: Bug
  Components: indexing
Affects Versions: 1.0.1
Reporter: Anthony Ogier
 Assigned To: Jukka Zitting






[ANNOUNCE] Apache Jackrabbit 1.1.1 released

2006-12-04 Thread Jukka Zitting

The Apache Jackrabbit community is pleased to announce the release of
Apache Jackrabbit version 1.1.1. The release is available for download
at:

 http://jackrabbit.apache.org/downloads.cgi


Release Notes -- Apache Jackrabbit -- Version 1.1.1

Introduction
------------

The Apache Jackrabbit project is an effort to build and maintain an
open source implementation of the Content Repository for Java
Technology API (JCR) specified in the Java Specification Request 170
(JSR-170). The project also produces various tools and components
related to the JCR API.

Apache Jackrabbit 1.1.1 is a patch release that fixes a number of
issues; see the included change history for details. No new features or
configuration changes have been introduced since the 1.1 release.

See the Apache Jackrabbit website at http://jackrabbit.apache.org/ for
more information.

Release Contents
----------------

The main contents of this release are the Apache Jackrabbit core
content repository implementation and the related general-purpose JCR
utilities:

jackrabbit-core-1.1.1-src.jar

jackrabbit-core-1.1.1.jar
jackrabbit-jcr-commons-1.1.1.jar

This release also contains additional components that offer extra
functionality for use with either Apache Jackrabbit core or any JCR
compliant content repository. These modules should be considered beta
quality:

* RMI network layer for the JCR API.

jackrabbit-jcr-rmi-1.1.1-src.jar
jackrabbit-jcr-rmi-1.1.1.jar

* Deployable Jackrabbit installation with WebDAV support for JCR.

jackrabbit-jcr-server-1.1.1-src.jar

jackrabbit-jcr-webdav-1.1.1.jar
jackrabbit-jcr-client-1.1.1.jar
jackrabbit-jcr-server-1.1.1.jar
jackrabbit-server-1.1.1.war

* J2EE Connector Architecture (JCA) resource adapter for Jackrabbit.

jackrabbit-jca-1.1.1-src.jar
jackrabbit-jca-1.1.1.rar

* Text indexing filters for Jackrabbit. Includes example filters
  for Adobe PDF and MS Excel, PowerPoint, and Word.

jackrabbit-index-filters-1.1.1-src.jar
jackrabbit-index-filters-1.1.1.jar

All components are released as a source jar file and one or more
compiled binary files. All files contain a README.txt file with more
information. Note that external runtime dependencies are only included
for the war and rar archives. Other dependencies can be downloaded
either manually or automatically using the Maven build system.

Each release file is accompanied by SHA1 and MD5 checksums and a PGP
signature. The public key used for the signatures can be found in the
KEYS file located in the parent directory.

Upgrading from 1.0
------------------

Apache Jackrabbit 1.1.1 is fully compatible with the 1.0 release. An
Apache Jackrabbit 1.0 installation can be upgraded by replacing the
relevant jar files with the new versions. No changes to repository
contents are needed.

Change History
--------------

Changes since 1.1:

   * [JCR-67] - Node.canAddMixin(String)
   * [JCR-550] - OutOfMemoryError when re-indexing the repository
   * [JCR-562] - 'OR' in XPath query badly interpreted
   * [JCR-563] - encode/decode
   * [JCR-574] - MsExcelTextFilter throws Exception. Repository is not
   * [JCR-586] - Removing a mixin that adds a same-name-sibling child node
   * [JCR-587] - XMLTextFilter does not extract text elements
   * [JCR-594] - It's not possible to register event listeners that filters
   * [JCR-598] - DateValue.equals() relies on Calendar.equals()
   * [JCR-600] - Repository does not release all resources on shutdown
   * [JCR-602] - importXML still depends on Xerces
   * [JCR-603] - OracleFileSystem can't handle empty files
   * [JCR-605] - Error when registering node types on virgin repository
   * [JCR-606] - RMI-DateValue does not support full ISO8601 format
   * [JCR-620] - Workspace.getImportHandler() doesn't handle namespace
   * [JCR-624] - OutOfMemoryError When repeat login and the logout many times
   * [JCR-628] - OutOfMemory problem: HandleMonitor does not release closed
   * [JCR-629] - CompactNodeTypeDefWriter does not escaped names properly
   * [JCR-636] - Local AuthContext authenticates if LoginModule should be
   * [JCR-637] - Multiple namespace definitions in CND prevent definition of
   * [JCR-646] - Misleading exception message for jcr:deref()
   * [JCR-649] - Like expression does not match line terminator in String

See the issue tracker at http://issues.apache.org/jira/browse/JCR for
issue details and the full change histories of all Apache Jackrabbit
versions.

Known Issues
------------

The known issues in this release are listed below:

   * [JCR-18] - Multithreading issue with versioning
   * [JCR-43] - Restore on node creates same-name-sibling of OPV-Version
   * [JCR-320] - BinaryValue equals fails for two objects with two different
   * [JCR-385] - ClassCastExeption when executing union queries
   * [JCR-392] - Accessing element by number does not work
   * [JCR-406] - If header evaluation compliance problems
   * [JCR-435] - Node.update() does not work