Re: Tight coupling to XML configurations

2007-08-28 Thread Thomas Mueller
Hi,

Yes, in my view, repository.xml and workspace.xml should go away or at
least be less visible for a user. Or do you mean something else with
XML configuration?

 interceptor stacks

Could you provide an example?

Thanks,
Thomas


[jira] Commented: (JCR-926) Global data store for binaries

2007-08-28 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523171
 ] 

Thomas Mueller commented on JCR-926:


Revision 570336: BLOBFileValue and InternalValue refactoring, improved 
GarbageCollector

 Global data store for binaries
 --

 Key: JCR-926
 URL: https://issues.apache.org/jira/browse/JCR-926
 Project: Jackrabbit
  Issue Type: New Feature
  Components: core
Reporter: Jukka Zitting
 Attachments: dataStore.patch, DataStore.patch, DataStore2.patch, 
 dataStore3.patch, dataStore4.zip, dataStore5-garbageCollector.patch, 
 internalValue.patch, ReadWhileSaveTest.patch


 There are three main problems with the way Jackrabbit currently handles large 
 binary values:
 1) Persisting a large binary value blocks access to the persistence layer for 
 extended amounts of time (see JCR-314)
 2) At least two copies of binary streams are made when saving them through 
 the JCR API: one in the transient space, and one when persisting the value
 3) Versioning and copy operations on nodes or subtrees that contain large 
 binary values can quickly end up consuming excessive amounts of storage space.
 To solve these issues (and to get other nice benefits), I propose that we 
 implement a global data store concept in the repository. A data store is an 
 append-only set of binary values that uses short identifiers to identify and 
 access the stored binary values. The data store would trivially fit the 
 requirements of transient space and transaction handling due to the 
 append-only nature. An explicit mark-and-sweep garbage collection process 
 could be added to avoid concerns about storing garbage values.
 See the recent NGP value record discussion, especially [1], for more 
 background on this idea.
 [1] 
 http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/[EMAIL 
 PROTECTED]
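
 As a rough illustration of the append-only idea described above, here is a 
 minimal in-memory sketch. It is not the Jackrabbit implementation: the class 
 name, the method names, and the choice of a SHA-1 content hash as the "short 
 identifier" are all assumptions made for this example. With a content hash, 
 identical binaries are stored only once, and a sweep removes every record 
 that a preceding mark phase did not reach.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory sketch of an append-only data store (not thread-safe). */
class AppendOnlyStoreSketch {
    private final Map<String, byte[]> records = new HashMap<>();

    /** Adds a value if it is not stored yet and returns its identifier. */
    String add(byte[] value) {
        String id = sha1Hex(value); // assumption: identifier = content hash
        records.putIfAbsent(id, value.clone());
        return id;
    }

    /** Returns a copy of the stored value, or null if the identifier is unknown. */
    byte[] get(String id) {
        byte[] value = records.get(id);
        return value == null ? null : value.clone();
    }

    int size() {
        return records.size();
    }

    /** Sweep phase: drops every record whose identifier was not marked as in use. */
    int sweep(Set<String> markedIds) {
        int before = records.size();
        records.keySet().retainAll(markedIds);
        return before - records.size();
    }

    private static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        AppendOnlyStoreSketch store = new AppendOnlyStoreSketch();
        String id = store.add("some binary".getBytes(StandardCharsets.UTF_8));
        store.add("some binary".getBytes(StandardCharsets.UTF_8)); // same content, no new record
        System.out.println(id + " -> records: " + store.size());
    }
}
```

 Because records are never modified after they are added, a copy or a new 
 version of a node only needs to reference the existing identifier.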

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (JCR-1093) Separate initial index creation from MultiIndex construction

2007-08-28 Thread Marcel Reutegger (JIRA)
Separate initial index creation from MultiIndex construction


 Key: JCR-1093
 URL: https://issues.apache.org/jira/browse/JCR-1093
 Project: Jackrabbit
  Issue Type: Improvement
  Components: query
Reporter: Marcel Reutegger
Priority: Minor


If there is no index present the MultiIndex constructor will create an initial 
index by traversing the workspace item states. This makes it difficult for an 
outside class to detect the situation where no index is present.
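
The separation proposed here could look roughly like the following sketch. It 
is only an illustration: IndexSketch is a hypothetical stand-in for the real 
MultiIndex, and modelling documents as plain strings is an assumption. The 
point is that the constructor no longer indexes anything, so a caller can see 
that the index is empty and decide what to do before indexing starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an index; not the real MultiIndex class. */
class IndexSketch {
    private final List<String> documents = new ArrayList<>();

    /** Only opens whatever already exists; never traverses the workspace. */
    IndexSketch(List<String> existingDocuments) {
        documents.addAll(existingDocuments);
    }

    /** Lets an outside class detect that no index is present yet. */
    boolean isEmpty() {
        return documents.isEmpty();
    }

    /** Explicit initial indexing step, called by the owner of the index. */
    void createInitialIndex(List<String> workspaceItems) {
        if (!documents.isEmpty()) {
            throw new IllegalStateException("index already exists");
        }
        documents.addAll(workspaceItems);
    }

    int size() {
        return documents.size();
    }

    public static void main(String[] args) {
        IndexSketch index = new IndexSketch(new ArrayList<String>());
        if (index.isEmpty()) {
            // The caller can configure the index here, before any indexing happens.
            index.createInitialIndex(Arrays.asList("/", "/content"));
        }
        System.out.println("indexed items: " + index.size());
    }
}
```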




[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

2007-08-28 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523173
 ] 

Marcel Reutegger commented on JCR-1064:
---

 I am doing the tests, with the parent index in old format, and the workspace 
 index in new format,
 and this is no problem

well, that just means that there is no appropriate test

WRT the bootstrapping issue with the index and its format, I will create a 
separate issue and extract the initial index creation from the MultiIndex 
constructor.  See JCR-1093. Once this is solved, you can set the index format 
version before indexing the workspace.

 will never be called since indexFormatVersion == null. 

that's actually another point that should be changed. There should be a default 
value. I suggest we set it to V1.

 Optimize queries that check for the existence of a property
 ---

 Key: JCR-1064
 URL: https://issues.apache.org/jira/browse/JCR-1064
 Project: Jackrabbit
  Issue Type: Improvement
  Components: indexing
Affects Versions: 1.3.1
Reporter: Ard Schrijvers
Priority: Minor
 Fix For: 1.4

 Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, 
 JCR-1064-DEPR.patch


 //[EMAIL PROTECTED] is transformed into the 
 org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the 
 MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in 
 MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
 Solution: lucene documents will get a new Field:
 public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
 that holds the available properties of this document. 
 NOTE: Lucene indices built without this performance improvement should still 
 work and fall back to the original implementation
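
 To make the idea concrete, here is a toy inverted index, written without 
 Lucene. Everything in it apart from the PROPERTIES_SET field name is an 
 assumption for illustration. Each document additionally indexes the names of 
 its properties under the PROPERTIES_SET field, so a "property exists" query 
 becomes a single term lookup instead of a pass over all documents.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy inverted index; a sketch of the idea, not Jackrabbit or Lucene code. */
class PropertiesSetSketch {
    static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    // Maps "field NUL value" to the ids of the documents containing that term.
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Indexes one document given as property name/value pairs; returns its id. */
    int addDocument(Map<String, String> properties) {
        int docId = nextDocId++;
        for (Map.Entry<String, String> e : properties.entrySet()) {
            index(e.getKey(), e.getValue(), docId);
            // Extra term: record that this document has the property at all.
            index(PROPERTIES_SET, e.getKey(), docId);
        }
        return docId;
    }

    /** Existence check as a single term lookup. */
    SortedSet<Integer> docsWithProperty(String propertyName) {
        SortedSet<Integer> hits = postings.get(key(PROPERTIES_SET, propertyName));
        return hits == null ? new TreeSet<Integer>() : new TreeSet<Integer>(hits);
    }

    private void index(String field, String value, int docId) {
        postings.computeIfAbsent(key(field, value), k -> new TreeSet<Integer>()).add(docId);
    }

    private static String key(String field, String value) {
        return field + '\u0000' + value;
    }

    public static void main(String[] args) {
        PropertiesSetSketch index = new PropertiesSetSketch();
        Map<String, String> doc = new HashMap<>();
        doc.put("mytext", "hello");
        index.addDocument(doc);
        System.out.println(index.docsWithProperty("mytext"));
    }
}
```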




[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

2007-08-28 Thread Ard Schrijvers (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523176
 ] 

Ard Schrijvers commented on JCR-1064:
-

 well, that just means that there is no appropriate test  

You mean that the tests just happen to work with the new and old formats by 
coincidence? I was really under the assumption that a query is done on one 
index at a time, and that whether this index is in the old or the new format 
does not matter. If the system index is in the old format and the query runs 
on a workspace index in the new format, that shouldn't give problems AFAICS. 
IMO, it is possible to port the jr impl to the new version while keeping all 
the indices, so that when a new workspace is added, only this workspace will 
run in the new format. But I do not have the overview that you do, so I am 
probably just missing something :-). I'll stop worrying about it and go for 
your solution.

 WRT the bootstrapping issue with the index and its format, I will create a 
 separate issue and extract the initial index creation from the MultiIndex 
 constructor. See JCR-1093. Once this is solved, you can set the index format 
 version before indexing the workspace. 

That would be very nice. When you have finished, I'll create a new patch

 that's actually another point that should be changed. There should be a 
 default value. I suggest we set it to V1.

Agreed. 

I'll wait for JCR-1093 and then create a new patch. 




Re: [VOTE] Approve the Sling project for incubation

2007-08-28 Thread Christoph Kiehl

Jukka Zitting wrote:


Please cast your votes:

[x] +1 Approve the Sling project for incubation
[ ] -1 Don't approve the project, because...


Looking very much forward to it. Sounds like a _very_ interesting project.

Cheers,
Christoph




Re: [VOTE] Approve the Sling project for incubation

2007-08-28 Thread Marcel Reutegger

Jukka Zitting wrote:

I'd like to call the Jackrabbit PMC to vote on
sponsoring the Sling project and approving it for incubation.


+1 Approve the Sling project for incubation

regards
 marcel



[jira] Commented: (JCR-926) Global data store for binaries

2007-08-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523195
 ] 

Claus Köll commented on JCR-926:


thanks for the quick answer thomas.

ok, and where will the datastore be stored on the filesystem?
can i configure it? i think for the data backup and recovery process it
will be very interesting to save the files on a huge, fast filesystem (SAN).




[jira] Commented: (JCR-926) Global data store for binaries

2007-08-28 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523190
 ] 

Thomas Mueller commented on JCR-926:


 Does the GlobalDataStore also prevent the BundleDBPersistenceManager 
 to load the binary property automatically when you get a node ?

Yes. If 'Global Data Store' is enabled, larger binary properties are stored 
there and loaded from there. Only the DataIdentifier (a String) and small 
binaries (up to 1 KB or so, needs to be tested) will be stored in the 
persistence manager.
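
A small sketch of that split, under stated assumptions: the class name, the 
exact 1024-byte limit, and the use of a plain function as the data store are 
all made up for illustration (the limit above is explicitly still to be 
tested). Small values stay inline; larger ones are handed to the data store 
and only the returned identifier is kept.

```java
import java.util.function.Function;

/** Hypothetical model of what the persistence manager would keep for one binary. */
class PersistedBinarySketch {
    // Assumption: "up to 1 KB or so" taken as exactly 1024 bytes.
    static final int INLINE_LIMIT_BYTES = 1024;

    final byte[] inlineData;     // non-null only for small values
    final String dataIdentifier; // non-null only for values kept in the data store

    private PersistedBinarySketch(byte[] inlineData, String dataIdentifier) {
        this.inlineData = inlineData;
        this.dataIdentifier = dataIdentifier;
    }

    /** Keeps small values inline; sends larger ones to the data store. */
    static PersistedBinarySketch of(byte[] value, Function<byte[], String> dataStoreAdd) {
        if (value.length <= INLINE_LIMIT_BYTES) {
            return new PersistedBinarySketch(value.clone(), null);
        }
        return new PersistedBinarySketch(null, dataStoreAdd.apply(value));
    }

    boolean isInline() {
        return inlineData != null;
    }

    public static void main(String[] args) {
        // The lambda stands in for a real data store and returns a made-up identifier.
        PersistedBinarySketch large = of(new byte[4096], bytes -> "example-id");
        System.out.println("inline: " + large.isInline() + ", id: " + large.dataIdentifier);
    }
}
```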





Re: SystemSession question

2007-08-28 Thread Marcel Reutegger

Esteban Franqueiro wrote:

Hi all.
We want our component to be notified of every property event, and we
were thinking about using the system sessions to connect the listeners.
The idea is to install the listeners on repository startup, with the
system sessions so that we don't miss any possible event.
Is there any significant performance issue related to using them in such
a way?


no, there is no performance impact. event listeners are informed using a 
background thread.



Is it advisable? Is there a better way?


I would rather use a regular session, which has read access to the whole 
repository. The system session is jackrabbit internal and should not be used 
unless there is a very good reason.



regards
 marcel


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

2007-08-28 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523202
 ] 

Marcel Reutegger commented on JCR-1064:
---

 You mean that the tests just happen to work with new and old format by 
 coincidence?

yes, that's what I mean.

 I just really am in the assumption, that a query is done on one index at the 
 time

Ah, I see. That's where the misunderstanding is. Unless otherwise indicated (by 
static analysis of the query tree, see JCR-1066) a query is executed on both 
indexes using a MultiReader. This means the query is only executed once and 
across both indexes.

Btw. JCR-1093 is now fixed.




[jira] Resolved: (JCR-1093) Separate initial index creation from MultiIndex construction

2007-08-28 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved JCR-1093.
---

   Resolution: Fixed
Fix Version/s: 1.4

Fixed in revision: 570350

 Separate initial index creation from MultiIndex construction
 

 Key: JCR-1093
 URL: https://issues.apache.org/jira/browse/JCR-1093
 Project: Jackrabbit
  Issue Type: Improvement
  Components: query
Reporter: Marcel Reutegger
Priority: Minor
 Fix For: 1.4





[jira] Commented: (JCR-926) Global data store for binaries

2007-08-28 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523201
 ] 

Thomas Mueller commented on JCR-926:


 where will the datastore be stored on the filesystem. 
This will be a configuration option for the FileDataStore





[jira] Commented: (JCR-1079) Extend the IndexingConfiguration to allow configuration of reuseable analyzers

2007-08-28 Thread Ard Schrijvers (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523206
 ] 

Ard Schrijvers commented on JCR-1079:
-

 That's OK with me, but I think being able to configure an analyzer in an 
 index rule also seems useful to me.

That is fine with me, but we do have to realize that I cannot make a 
distinction between setting it for a property in a single index-rule and 
setting it globally as I described. This is because, when analyzing or when 
parsing a query for a field, all the analyzer knows is the string 
representation (JCR-style name) of the given property. 

If that is fine with you I will add this configuration option and write 
documentation about it.

 Extend the IndexingConfiguration to allow configuration of reuseable analyzers
 --

 Key: JCR-1079
 URL: https://issues.apache.org/jira/browse/JCR-1079
 Project: Jackrabbit
  Issue Type: New Feature
Affects Versions: 1.3.1
Reporter: Ard Schrijvers
Priority: Minor
 Fix For: 1.4


 It should be possible to configure an XML block of analyzers in 
 indexing_configuration.xml. In each index-rule, an analyzer can be assigned 
 to a property, which means that the property will be analyzed with that 
 specific analyzer. First and foremost, this enables multilingual indexing. 
 Documentation needs to be added explaining the difference between searching 
 in the node scope [jcr:contains(.,'foo')] and in some property 
 [jcr:contains(@myprop,'foo')]. The node scope will always be searched and 
 indexed with the default analyzer, which can be configured in the 
 SearchIndex element in workspace.xml.
 Below, a possible indexing_configuration.xml snippet is shown. Also note the 
 possible enhancement (not sure whether this implementation will have it, 
 because it requires a lot of filter factories and is probably out of scope). 
 Adding custom filters which do not need a factory might be easier.
 <analyzers>
   <analyzer name="fr" 
       class="org.apache.lucene.analysis.fr.FrenchAnalyzer"/>
   <analyzer name="de" 
       class="org.apache.lucene.analysis.de.GermanAnalyzer"/>
   <analyzer name="compound" 
       class="org.apache.lucene.analysis.SimpleAnalyzer">
     <filter class="jr.StopFilterFactory" words="stopwords.txt"/>
     <filter class="jr.EdgeNGramTokenizerFactory" side="front" 
         minGram="1" maxGram="2"/>
   </analyzer>
 </analyzers>
 <index-rule nodeType="nt:unstructured">
   <property analyzer="fr">bode_fr</property>
   <property analyzer="de">bode_de</property>
 </index-rule>
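
 The lookup implied by this configuration can be sketched as follows. This is 
 not Lucene or Jackrabbit code: an "analyzer" is modelled as a plain function 
 from text to tokens, and all class and method names are assumptions. The 
 registry is keyed by property name only, which matches the constraint 
 mentioned in the comment above that the analyzer only knows the JCR-style 
 name of the property. Anything without an assigned analyzer, including the 
 node scope, falls back to the default analyzer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-property analyzer lookup with a default fallback. */
class AnalyzerRegistrySketch {
    private final Map<String, Function<String, List<String>>> byProperty = new HashMap<>();
    private final Function<String, List<String>> defaultAnalyzer;

    AnalyzerRegistrySketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    /** Corresponds to one property entry with an analyzer attribute. */
    void assign(String propertyName, Function<String, List<String>> analyzer) {
        byProperty.put(propertyName, analyzer);
    }

    /** Property scope: use the assigned analyzer, else the default one. */
    List<String> analyze(String propertyName, String text) {
        return byProperty.getOrDefault(propertyName, defaultAnalyzer).apply(text);
    }

    /** Node scope: always the default analyzer. */
    List<String> analyzeNodeScope(String text) {
        return defaultAnalyzer.apply(text);
    }

    public static void main(String[] args) {
        // Toy analyzers standing in for real language-specific ones.
        Function<String, List<String>> keepWhole = Collections::singletonList;
        Function<String, List<String>> lowerCaseSplit =
                text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));

        AnalyzerRegistrySketch registry = new AnalyzerRegistrySketch(keepWhole);
        registry.assign("bode_fr", lowerCaseSplit);
        System.out.println(registry.analyze("bode_fr", "Le Chat"));
        System.out.println(registry.analyzeNodeScope("Le Chat"));
    }
}
```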




[jira] Commented: (JCR-1079) Extend the IndexingConfiguration to allow configuration of reuseable analyzers

2007-08-28 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523203
 ] 

Marcel Reutegger commented on JCR-1079:
---

That's OK with me, but I think being able to configure an analyzer in an index 
rule also seems useful to me.




[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

2007-08-28 Thread Ard Schrijvers (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523205
 ] 

Ard Schrijvers commented on JCR-1064:
-

 Ah, I see. That's where the misunderstanding is. Unless otherwise indicated 
 (by static analysis of the query tree, see JCR-1066) a query is 
 executed on both indexes using a MultiReader. This means the query is only 
 executed once and across both indexes. 

Now I am convinced and understand your concerns! :-)

I'll create a new patch with JCR-1093 taken into account, and a default value 
for indexFormatVersion 






[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

2007-08-28 Thread Ard Schrijvers (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523228
 ] 

Ard Schrijvers commented on JCR-1064:
-

Implemented the new indexing format again. There is a subtle difficulty though:

Suppose there is one sysIndex and two workspace indices, all in the old format:

sysIndex = old
ws1Index = old
ws2Index = old

Deleting only the sysIndex will generate a sysIndex in the new format in 
index.createInitialIndex(). 

Since ws1Index and ws2Index are old, the parentQueryHandler should be set back 
to the old index format. This is implemented. 

Now start again from 

sysIndex = old
ws1Index = old
ws2Index = old

and remove sysIndex *and* ws1Index. Then at doInit() we would get 

sysIndex = new --> old  (changed back to old when ws2Index is initialized)
ws1Index = new
ws2Index = old

Querying ws1Index might then give problems, because sysIndex was reverted to 
old when ws2Index was initialized. To solve this, getIndexFormatVersion() 
always checks whether the parent handler and the current index have the same 
format. If not, it defaults back to the old format.

This implies that after updating the Jackrabbit version, you will get the new 
index format *only* if you re-index all of your existing indices. 
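
The check described for getIndexFormatVersion() boils down to a small rule, 
sketched below. This is not the code from the patch: the class, the method 
name effective(), and the V2 constant are assumptions. V1 is taken as the old 
format and as the default for an unset value, following the suggestion earlier 
in this thread.

```java
/** Hypothetical sketch of the format check; not the code from the patch. */
class IndexFormatSketch {
    /** V1 = old format (also the default), V2 = new format (name assumed). */
    enum Version { V1, V2 }

    /**
     * Returns the format to use for queries: the new format only if the parent
     * index and the workspace index agree on it, the old format otherwise.
     */
    static Version effective(Version parentFormat, Version ownFormat) {
        Version parent = parentFormat == null ? Version.V1 : parentFormat;
        Version own = ownFormat == null ? Version.V1 : ownFormat;
        return parent == own ? own : Version.V1;
    }

    public static void main(String[] args) {
        // Second scenario above: sysIndex reverted to old, ws1Index re-created as new.
        System.out.println(effective(Version.V1, Version.V2));
        // All indices re-indexed: both are new.
        System.out.println(effective(Version.V2, Version.V2));
    }
}
```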

Hope my explanation is clear! I'll prepare the patch






[jira] Commented: (JCR-926) Global data store for binaries

2007-08-28 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523229
 ] 

Thomas Mueller commented on JCR-926:


Revision 570407: add DataStore to constructors




[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

2007-08-28 Thread Ard Schrijvers (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ard Schrijvers updated JCR-1064:


Attachment: JCR-1064-2.patch

Patch that should implement all previous comments. 



 Optimize queries that check for the existence of a property
 ---

 Key: JCR-1064
 URL: https://issues.apache.org/jira/browse/JCR-1064
 Project: Jackrabbit
  Issue Type: Improvement
  Components: indexing
Affects Versions: 1.3.1
Reporter: Ard Schrijvers
Priority: Minor
 Fix For: 1.4

 Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, 
 JCR-1064-2.patch, JCR-1064-DEPR.patch





[jira] Assigned: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation

2007-08-28 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke reassigned JCR-1094:
---

Assignee: (was: Julian Reschke)

 TCK assumes that repository does not automatically add mixins on node creation
 --

 Key: JCR-1094
 URL: https://issues.apache.org/jira/browse/JCR-1094
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke

 In several places, the TCK assumes that repository does not automatically add 
 mixins on node creation. However, this is explicitly allowed per JSR-170, 
 Section 7.4.4.




[jira] Assigned: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation

2007-08-28 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke reassigned JCR-1094:
---

Assignee: Julian Reschke

 TCK assumes that repository does not automatically add mixins on node creation
 --

 Key: JCR-1094
 URL: https://issues.apache.org/jira/browse/JCR-1094
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 In several places, the TCK assumes that repository does not automatically add 
 mixins on node creation. However, this is explicitly allowed per JSR-170, 
 Section 7.4.4.




[jira] Created: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation

2007-08-28 Thread Julian Reschke (JIRA)
TCK assumes that repository does not automatically add mixins on node creation
--

 Key: JCR-1094
 URL: https://issues.apache.org/jira/browse/JCR-1094
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke


In several places, the TCK assumes that repository does not automatically add 
mixins on node creation. However, this is explicitly allowed per JSR-170, 
Section 7.4.4.




[jira] Commented: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation

2007-08-28 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523248
 ] 

Julian Reschke commented on JCR-1094:
-

Fixed in one place with revision 570442, but the same issue applies to other 
test cases as well.



 TCK assumes that repository does not automatically add mixins on node creation
 --

 Key: JCR-1094
 URL: https://issues.apache.org/jira/browse/JCR-1094
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 In several places, the TCK assumes that repository does not automatically add 
 mixins on node creation. However, this is explicitly allowed per JSR-170, 
 Section 7.4.4.




Re: Database connections queries

2007-08-28 Thread Jukka Zitting
Hi,

On 8/24/07, Jukka Zitting [EMAIL PROTECTED] wrote:
 On 8/22/07, Martijn Hendriks [EMAIL PROTECTED] wrote:
  The current implementation of the LazyScoreNodeIterator ignores all
  RepositoryExceptions that might be thrown while loading the nodes in
  the result set, as required by the javax.jcr.NodeIterator spec.

 I think the correct behaviour would be for the node iterator to throw
 an exception (unfortunately I guess only a RuntimeException would work
 here) instead of ignoring such internal errors.

A thought just crossed my mind... As mentioned in another thread a few
days ago, we could make NodeImpl load the underlying node state only on
demand, as long as we have the node ID, parent ID, and name of the node
available. This way we could delay the exception to a method like
getProperty() that is allowed to throw appropriate exceptions.

Besides, such a solution would most likely give a nice speedup to
clients that just want to get the names or paths of nodes that match a
query...
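
A rough sketch of the on-demand loading idea, under stated assumptions:
LazyNode, NodeStateLoader, and RepositoryFailure are hypothetical names
(RepositoryFailure stands in for javax.jcr.RepositoryException), and this
is not how NodeImpl is actually written.

```java
import java.util.Map;

// Checked exception standing in for javax.jcr.RepositoryException (assumption).
class RepositoryFailure extends Exception {
    RepositoryFailure(String message) {
        super(message);
    }
}

// Hypothetical loader for the expensive part of a node, its state.
interface NodeStateLoader {
    Map<String, String> load(String nodeId) throws RepositoryFailure;
}

// Hypothetical node that knows its id, parent id, and name up front.
class LazyNode {
    private final String id;
    private final String parentId;
    private final String name;
    private final NodeStateLoader loader;
    private Map<String, String> state; // stays null until first needed

    LazyNode(String id, String parentId, String name, NodeStateLoader loader) {
        this.id = id;
        this.parentId = parentId;
        this.name = name;
        this.loader = loader;
    }

    // Cheap accessors: never touch the persistence layer, never throw.
    String getName() {
        return name;
    }

    String getParentId() {
        return parentId;
    }

    boolean isLoaded() {
        return state != null;
    }

    // The first call loads the state; a loading failure surfaces here,
    // in a method that is allowed to declare the exception.
    String getProperty(String propertyName) throws RepositoryFailure {
        if (state == null) {
            state = loader.load(id);
        }
        return state.get(propertyName);
    }
}
```

With this shape an iterator can hand out nodes without loading anything,
and a client that only reads names never pays for, or fails on, the state
lookup.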

BR,

Jukka Zitting


Re: Tight coupling to XML configurations

2007-08-28 Thread Alan D. Cabrera


On Aug 27, 2007, at 11:16 PM, Thomas Mueller wrote:


Hi,

Yes, in my view, repository.xml and workspace.xml should go away or at
least be less visible for a user. Or do you mean something else with
XML configuration?


I don't see why we would want to make configuration files less  
visible to the users but that's for a different thread.


Currently, the way the JCR server is booted up is tightly integrated  
w/ XML.  For example, the repository configuration object holds an  
XML snippet that it uses as a template to generate new workspaces.   
This is what I mean by tight coupling.


Ideally, we would have factories.  This gives me more control.


interceptor stacks


Could you provide an example?


The current architecture of Jackrabbit seems to be tightly coupled,
with extensions implemented via inheritance and by overriding
certain methods.  ATM, when I want to provide virtual properties to a
node, I have to inherit from an existing persistence manager (PM) and
override methods such as load(PropertyId).


I was thinking that a JCR is really like a CMP container.  Having
worked on OpenEJB, the use of interceptors immediately springs to
mind.  We can provide all sorts of cross-cutting behavior, e.g.
security, remoting, tx, by just inserting new interceptors.
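
To make the interceptor idea concrete, here is a minimal sketch. All names
in it (PropertyLoader, Interceptor, InterceptorStack,
VirtualPropertyInterceptor, the "virtual:now" id) are hypothetical and do
not exist in Jackrabbit or OpenEJB; it only shows the general shape of a
stack wrapped around a load call.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical operation passed down the stack, e.g. loading a property.
interface PropertyLoader {
    String load(String propertyId);
}

// Hypothetical interceptor: may act before or after the call, or short-circuit it.
interface Interceptor {
    String intercept(String propertyId, PropertyLoader next);
}

// Runs the interceptors in order and ends at the wrapped target.
class InterceptorStack implements PropertyLoader {
    private final List<Interceptor> interceptors;
    private final PropertyLoader target;

    InterceptorStack(PropertyLoader target, Interceptor... interceptors) {
        this.target = target;
        this.interceptors = Arrays.asList(interceptors);
    }

    public String load(String propertyId) {
        return invoke(0, propertyId);
    }

    private String invoke(int index, String propertyId) {
        if (index == interceptors.size()) {
            return target.load(propertyId); // end of the stack: the real loader
        }
        // Hand each interceptor a "next" that continues with the rest of the stack.
        return interceptors.get(index).intercept(propertyId, id -> invoke(index + 1, id));
    }
}

// Example: serves a computed ("virtual") property without subclassing the loader.
class VirtualPropertyInterceptor implements Interceptor {
    public String intercept(String propertyId, PropertyLoader next) {
        if (propertyId.equals("virtual:now")) {
            return "computed";
        }
        return next.load(propertyId);
    }
}
```

New behavior is added by putting another Interceptor into the constructor
call instead of inheriting from the persistence manager.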


Take my comments with a grain of salt; I don't fully grok the  
architecture.



Regards,
Alan



Sling: next steps

2007-08-28 Thread Juan José Vázquez Delgado
Dear Felix,

I sent my CLA to the ASF about four weeks ago. What are the next steps
to get an Apache account?

I completed the CLA with this information:

Full name: Juan José Vázquez Delgado
E-Mail: [EMAIL PROTECTED]

Regards,

Juanjo.


[jira] Created: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties

2007-08-28 Thread Julian Reschke (JIRA)
ReferencesPropertyTest can't deal with multivalued reference properties
---

 Key: JCR-1095
 URL: https://issues.apache.org/jira/browse/JCR-1095
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke


The setUp() method uses prop.getNode(), thus assuming that the reference 
property is not multivalued.





[jira] Resolved: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties

2007-08-28 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved JCR-1095.
-

Resolution: Fixed

Fixed with revision 570520.


 ReferencesPropertyTest can't deal with multivalued reference properties
 ---

 Key: JCR-1095
 URL: https://issues.apache.org/jira/browse/JCR-1095
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 The setUp() method uses prop.getNode(), thus assuming that the reference 
 property is not multivalued.




Re: [VOTE] Approve the Sling project for incubation

2007-08-28 Thread Tobias Bocanegra
 I'd like to call the Jackrabbit PMC to vote on
 sponsoring the Sling project and approving it for incubation.

+1 Approve the Sling project for incubation

cheers, toby
-- 
- [EMAIL PROTECTED] ---
Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97
--- http://www.day.com ---


[jira] Resolved: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties

2007-08-28 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/JCR-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved JCR-1095.
-

Resolution: Fixed

Fixed with revision 570564.


 ReferencesPropertyTest can't deal with multivalued reference properties
 ---

 Key: JCR-1095
 URL: https://issues.apache.org/jira/browse/JCR-1095
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 The setUp() method uses prop.getNode(), thus assuming that the reference 
 property is not multivalued.




[jira] Commented: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties

2007-08-28 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523343
 ] 

Julian Reschke commented on JCR-1095:
-

Fixed subsequent failure in spi/client with revision 570565.



 ReferencesPropertyTest can't deal with multivalued reference properties
 ---

 Key: JCR-1095
 URL: https://issues.apache.org/jira/browse/JCR-1095
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 The setUp() method uses prop.getNode(), thus assuming that the reference 
 property is not multivalued.




[jira] Created: (JCR-1096) Problems with custom nodes in journal

2007-08-28 Thread Raffaele Sena (JIRA)
Problems with custom nodes in journal
-

 Key: JCR-1096
 URL: https://issues.apache.org/jira/browse/JCR-1096
 Project: Jackrabbit
  Issue Type: Bug
  Components: clustering
Affects Versions: 1.3.1
Reporter: Raffaele Sena


I have an application that uses custom node types and I am having problems in a 
clustered configuration.

Issue 1: the following definition in a nodetype is incorrectly read from the 
journal:
  + * (nt:hierarchyNode) version

The * is stored in the journal as _x002a_ since it should be a QName and so 
gets escaped.
When it is read back, the code in 
...core.nodetype.compact.CompactNodeTypeDefReader.doChildNodeDefinition does 
the following test:

    if (currentTokenEquals('*')) {
        ndi.setName(ItemDef.ANY_NAME);
    } else {
        ndi.setName(toQName(currentToken));
    }

Since currentToken is _x002a_ and not *, toQName(currentToken) is called, but it 
fails.
I changed the test to:

    if (currentTokenEquals('*') || currentTokenEquals("_x002a_"))

and that fixes the problem.
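
A more general variant of the same check could decode the escape before
comparing, rather than hard-coding the escaped string. The sketch below is
an assumption-laden illustration: ResidualNameTokens, decode, and isAnyName
are hypothetical helpers, not part of CompactNodeTypeDefReader, and the
decoder handles only the simple _xHHHH_ form seen in this report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of CompactNodeTypeDefReader.
class ResidualNameTokens {
    // Matches escapes of the form _xHHHH_ (four hex digits).
    private static final Pattern ESCAPE = Pattern.compile("_x([0-9a-fA-F]{4})_");

    // Decodes _xHHHH_ escapes back to the characters they stand for.
    static String decode(String token) {
        Matcher m = ESCAPE.matcher(token);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps characters such as '$' or '\' literal.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // True for the literal wildcard and for its escaped form.
    static boolean isAnyName(String token) {
        return "*".equals(decode(token));
    }
}
```

With such a helper the reader would treat both * and _x002a_ as the
residual name; whether decoding belongs in the reader or escaping should be
avoided in the writer is a separate question.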

Issue 2: when storing a nodeType in the journal the superclass nt:base is not 
stored, but when reading I get an error saying the node should be a subclass of 
nt:base.

The code in ...core.nodetype.compact.CompactNodeTypeDefWriter.writeSupertypes 
skips nt:base when writing the node.

When reading the nodetype definition from the journal the following exception 
is thrown:

Unable to deliver node type operation: 
[{http://www.adobe.com/acorn/repository/1.0}resource] all primary node types 
except nt:base itself must be (directly or indirectly) derived from nt:base

probably because nt:base is not re-added to the nodetype definition

 




Jackrabbit name collision?

2007-08-28 Thread Jukka Zitting
Hi,

Just came across this:

http://jackrabbit.scalableinformatics.com/

Not sure yet what, if anything, to do about it.

BR,

Jukka Zitting


Re: Jackrabbit name collision?

2007-08-28 Thread Bertrand Delacretaz
On 8/29/07, Jukka Zitting [EMAIL PROTECTED] wrote:

 ...Just came across this:

 http://jackrabbit.scalableinformatics.com/

There's also Jackrabbit reporting,
http://www.eldocomp.com/index.php?option=com_content&task=view&id=25&Itemid=84

Remember that the official name of this project is Apache
Jackrabbit, so there's probably not much to do.

-Bertrand