date:20070828


[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523171
 ] 

Thomas Mueller commented on JCR-926:


Revision 570336: BLOBFileValue and InternalValue refactoring, improved 
GarbageCollector

 Global data store for binaries
 --

 Key: JCR-926
 URL: https://issues.apache.org/jira/browse/JCR-926
 Project: Jackrabbit
  Issue Type: New Feature
  Components: core
Reporter: Jukka Zitting
 Attachments: dataStore.patch, DataStore.patch, DataStore2.patch, 
 dataStore3.patch, dataStore4.zip, dataStore5-garbageCollector.patch, 
 internalValue.patch, ReadWhileSaveTest.patch


 There are three main problems with the way Jackrabbit currently handles large 
 binary values:
 1) Persisting a large binary value blocks access to the persistence layer for 
 extended amounts of time (see JCR-314)
 2) At least two copies of binary streams are made when saving them through 
 the JCR API: one in the transient space, and one when persisting the value
 3) Versioning and copy operations on nodes or subtrees that contain large 
 binary values can quickly end up consuming excessive amounts of storage space.
 To solve these issues (and to get other nice benefits), I propose that we 
 implement a global data store concept in the repository. A data store is an 
 append-only set of binary values that uses short identifiers to identify and 
 access the stored binary values. The data store would trivially fit the 
 requirements of transient space and transaction handling due to the 
 append-only nature. An explicit mark-and-sweep garbage collection process 
 could be added to avoid concerns about storing garbage values.
 See the recent NGP value record discussion, especially [1], for more 
 background on this idea.
 [1] 
 http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/[EMAIL 
 PROTECTED]
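
The proposal above can be illustrated with a minimal in-memory sketch. It is not the code from the attached patches. The class name AppendOnlyDataStore, the choice of a SHA-1 content hash as the "short identifier", and the sweep(Set) signature are all assumptions made for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of an append-only data store; not Jackrabbit's API. */
class AppendOnlyDataStore {
    private final Map<String, byte[]> records = new HashMap<>();

    /** Appends a value and returns its identifier (assumption: hex SHA-1 of the content). */
    String add(byte[] data) {
        String id = sha1Hex(data);
        records.putIfAbsent(id, data.clone()); // identical content is stored only once
        return id;
    }

    /** Returns a copy of the stored value, or null if the identifier is unknown. */
    byte[] get(String id) {
        byte[] data = records.get(id);
        return data == null ? null : data.clone();
    }

    int size() {
        return records.size();
    }

    /** Sweep phase: removes every record whose identifier was not marked; returns the number removed. */
    int sweep(Set<String> marked) {
        int before = records.size();
        records.keySet().retainAll(marked);
        return before - records.size();
    }

    private static String sha1Hex(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-1").digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

With identifiers derived from content, a versioning or copy operation only has to copy the identifier, which is how such a store could address problem 3. The mark phase (collecting all identifiers still referenced by the repository) is left to the caller in this sketch.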

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Created: (JCR-1093) Separate initial index creation from MultiIndex construction

Separate initial index creation from MultiIndex construction


 Key: JCR-1093
 URL: https://issues.apache.org/jira/browse/JCR-1093
 Project: Jackrabbit
  Issue Type: Improvement
  Components: query
Reporter: Marcel Reutegger
Priority: Minor


If there is no index present the MultiIndex constructor will create an initial 
index by traversing the workspace item states. This makes it difficult for an 
outside class to detect the situation where no index is present.
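
A minimal sketch of the separation this issue asks for, under the assumption that the index exposes an emptiness check and an explicit initial-indexing method. SketchIndex, isEmpty() and the list-based createInitialIndex() are hypothetical stand-ins, not the MultiIndex API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an index; names are illustrative only. */
class SketchIndex {
    private final List<String> documents = new ArrayList<>();

    /** The constructor only opens what already exists; it does not traverse the workspace. */
    SketchIndex(List<String> existingDocuments) {
        documents.addAll(existingDocuments);
    }

    /** Lets an outside class detect that no index is present yet. */
    boolean isEmpty() {
        return documents.isEmpty();
    }

    /** Explicit initial indexing step, called by the owner once it has finished its own setup. */
    void createInitialIndex(List<String> workspaceItems) {
        if (!documents.isEmpty()) {
            throw new IllegalStateException("index already exists");
        }
        documents.addAll(workspaceItems);
    }

    int numDocs() {
        return documents.size();
    }
}
```

The caller can then configure the index between construction and the initial indexing run, for example by choosing an index format version first.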


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property


[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523173
 ] 

Marcel Reutegger commented on JCR-1064:
---

 I am doing the tests, with the parent index in old format, and the workspace 
 index in new format,
 and this is no problem

well, that just means that there is no appropriate test

WRT the bootstrapping issue with the index and its format, I will create a 
separate issue and extract the initial index creation from the MultiIndex 
constructor.  See JCR-1093. Once this is solved, you can set the index format 
version before indexing the workspace.

 will never be called since indexFormatVersion == null. 

that's actually another point that should be changed. There should be a default 
value. I suggest we set it to V1.

 Optimize queries that check for the existence of a property
 ---

 Key: JCR-1064
 URL: https://issues.apache.org/jira/browse/JCR-1064
 Project: Jackrabbit
  Issue Type: Improvement
  Components: indexing
Affects Versions: 1.3.1
Reporter: Ard Schrijvers
Priority: Minor
 Fix For: 1.4

 Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, 
 JCR-1064-DEPR.patch


 //[EMAIL PROTECTED] is transformed into the 
 org.apache.jackrabbit.core.query.lucene.MatchAllQuery, which through the 
 MatchAllWeight uses the MatchAllScorer. The calculateDocFilter() in 
 MatchAllScorer does not scale and becomes slow as the number of nodes grows. 
 Solution: Lucene documents will get a new Field:
 public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
 that holds the available properties of this document. 
 NOTE: Lucene indices built without this performance improvement should still 
 work and fall back to the original implementation.
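
The idea behind the new field can be sketched without Lucene: if each document records the names of its properties at indexing time, an existence check becomes a single lookup instead of a pass over all documents. PropertyExistenceIndex and its methods are hypothetical names for this illustration, and the BitSet-per-name map only models what an inverted index on such a field would provide.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of a "properties set" field backed by an inverted index. */
class PropertyExistenceIndex {
    private final Map<String, BitSet> docsByProperty = new HashMap<>();
    private int numDocs = 0;

    /** Indexes one document given the names of its properties; returns the document number. */
    int addDocument(Set<String> propertyNames) {
        int doc = numDocs++;
        for (String name : propertyNames) {
            docsByProperty.computeIfAbsent(name, k -> new BitSet()).set(doc);
        }
        return doc;
    }

    /** Documents that have the property: one map lookup, no scan over all documents. */
    BitSet docsWithProperty(String name) {
        BitSet hits = docsByProperty.get(name);
        return hits == null ? new BitSet() : (BitSet) hits.clone();
    }
}
```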


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property


[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523176
 ] 

Ard Schrijvers commented on JCR-1064:
-

 well, that just means that there is no appropriate test  

You mean that the tests just happen to work with new and old format by 
coincidence? I was really under the assumption that a query is done on one 
index at a time, and that whether this index is in the old or new format does 
not matter. If the system index is in the old format and the query runs on a 
workspace index in the new format, that shouldn't give problems AFAICS. IMO, it 
is possible to port the jr impl to the new version while keeping all the 
indices, and when adding a new workspace, only this workspace will run in the 
new format. But I do not have the overview like you do, so I am probably just 
missing something :-). I'll stop worrying about it and go for your solution.

 WRT the bootstrapping issue with the index and its format, I will create a 
 separate issue and extract the initial index creation from the MultiIndex 
 constructor. See JCR-1093. Once this is solved, you can set the index format 
 version before indexing the workspace. 

That would be very nice. When you have finished, I'll create a new patch

 that's actually another point that should be changed. There should be a 
 default value. I suggest we set it to V1.

Agreed. 

I'll wait for JCR-1093 and then create a new patch. 


Re: [VOTE] Approve the Sling project for incubation

2007-08-28 Thread Christoph Kiehl


Jukka Zitting wrote:


Please cast your votes:

[x] +1 Approve the Sling project for incubation
[ ] -1 Don't approve the project, because...


Looking very much forward to it. Sounds like a _very_ interesting project.

Cheers,
Christoph

Re: [VOTE] Approve the Sling project for incubation

2007-08-28 Thread Marcel Reutegger


Jukka Zitting wrote:

I'd like to call the Jackrabbit PMC to vote on
sponsoring the Sling project and approving it for incubation.


+1 Approve the Sling project for incubation

regards
 marcel

[jira] Commented: (JCR-926) Global data store for binaries

2007-08-28 Thread JIRA

[
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523195
]

Claus Köll commented on JCR-926:

Thanks for the quick answer, Thomas.

OK, and where will the datastore be stored on the filesystem?
Can I configure it? I think for the data backup and recovery process it
will be very interesting to save the files on a huge, fast filesystem (SAN).


[jira] Commented: (JCR-926) Global data store for binaries


[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523190
 ] 

Thomas Mueller commented on JCR-926:


 Does the GlobalDataStore also prevent the BundleDBPersistenceManager 
 to load the binary property automatically when you get a node ?

Yes. If 'Global Data Store' is enabled, larger binary properties are stored 
there and loaded from there. Only the DataIdentifier (a String) and small 
binaries (up to 1 KB or so, needs to be tested) will be stored in the 
persistence manager.
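
The split described in this answer could look roughly like the following sketch. The 1024-byte limit mirrors the untested "1 KB or so" figure from the comment, and BinaryPlacementSketch, persist() and the "id-N" identifier scheme are placeholders invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of where a binary property value ends up. */
class BinaryPlacementSketch {
    /** Assumption: taken from "up to 1 KB or so, needs to be tested". */
    static final int INLINE_LIMIT = 1024;

    /** Stand-in for the global data store, keyed by identifier. */
    final Map<String, byte[]> dataStore = new HashMap<>();

    /**
     * Returns what the persistence manager would keep for the property:
     * the bytes themselves for small values, otherwise only an identifier.
     */
    Object persist(byte[] value) {
        if (value.length <= INLINE_LIMIT) {
            return value.clone();
        }
        String id = "id-" + dataStore.size(); // placeholder identifier scheme
        dataStore.put(id, value.clone());
        return id;
    }
}
```

Because only the identifier travels with the node in the large case, loading a node does not pull the binary content into memory.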



Re: SystemSession question

2007-08-28 Thread Marcel Reutegger


Esteban Franqueiro wrote:

Hi all.
We want our component to be notified of every property event, and we
were thinking about using the system sessions to connect the listeners.
The idea is to install the listeners on repository startup, with the
system sessions so that we don't miss any possible event.
Is there any significant performance issue related to using them in such
a way?


No, there is no performance impact. Event listeners are informed using a 
background thread.



Is it advisable? Is there a better way?


I would rather use a regular session, which has read access to the whole 
repository. The system session is jackrabbit internal and should not be used 
unless there is a very good reason.



regards
 marcel

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property


[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523202
 ] 

Marcel Reutegger commented on JCR-1064:
---

 You mean that the tests just happen to work with new and old format by 
 coincidence?

yes, that's what I mean.

 I was really under the assumption that a query is done on one index at a 
 time

Ah, I see. That's where the misunderstanding is. Unless otherwise indicated (by 
static analysis of the query tree, see JCR-1066) a query is executed on both 
indexes using a MultiReader. This means the query is only executed once and 
across both indexes.
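
Why this matters for mixed index formats can be shown with a small model that does not use Lucene. FormatIndex and CombinedQuery are hypothetical names, and "V1"/"V2" stand for indexes without and with the new properties-set field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical index model: "V1" lacks the properties-set field, "V2" has it. */
class FormatIndex {
    final String format;
    final List<Set<String>> docs = new ArrayList<>();                // property names per document
    final Map<String, Set<Integer>> propertiesSet = new HashMap<>(); // only filled for V2

    FormatIndex(String format) {
        this.format = format;
    }

    void add(Set<String> propertyNames) {
        int doc = docs.size();
        docs.add(propertyNames);
        if ("V2".equals(format)) {
            for (String p : propertyNames) {
                propertiesSet.computeIfAbsent(p, k -> new HashSet<>()).add(doc);
            }
        }
    }
}

class CombinedQuery {
    /** Runs one existence query across all indexes, the way a combined reader would. */
    static int countWithProperty(List<FormatIndex> indexes, String property, boolean useNewField) {
        int hits = 0;
        for (FormatIndex index : indexes) {
            if (useNewField) {
                // New-style lookup: finds nothing in an index that never wrote the field.
                hits += index.propertiesSet.getOrDefault(property, new HashSet<>()).size();
            } else {
                // Old-style scan: works on every index, but touches every document.
                for (Set<String> doc : index.docs) {
                    if (doc.contains(property)) {
                        hits++;
                    }
                }
            }
        }
        return hits;
    }
}
```

Since the query runs once over both indexes, it has to pick one strategy. Choosing the new field while one index is still in the old format silently drops that index's matches, so the old strategy is the only one that is safe for a mixed pair.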

Btw. JCR-1093 is now fixed.


[jira] Resolved: (JCR-1093) Separate initial index creation from MultiIndex construction


 [ 
https://issues.apache.org/jira/browse/JCR-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved JCR-1093.
---

   Resolution: Fixed
Fix Version/s: 1.4

Fixed in revision: 570350


[jira] Commented: (JCR-926) Global data store for binaries


[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523201
 ] 

Thomas Mueller commented on JCR-926:


 where will the datastore be stored on the filesystem. 
This will be a configuration option for the FileDataStore



[jira] Commented: (JCR-1079) Extend the IndexingConfiguration to allow configuration of reuseable analyzers

[
https://issues.apache.org/jira/browse/JCR-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523206
]

Ard Schrijvers commented on JCR-1079:
-

That's OK with me, but I think being able to configure an analyzer in an
index rule also seems useful to me.

That is fine with me, but we do have to realize that I cannot make a
distinction between setting it for a property in a single index-rule and
setting it globally as I described. This is because when analyzing or when
parsing some query for a field, all I know in the analyzer is the string
representation (JCR-style name) of the given property.

If that is fine with you I will add this configuration option and write
documentation about it.

Extend the IndexingConfiguration to allow configuration of reuseable analyzers
--

Key: JCR-1079
URL: https://issues.apache.org/jira/browse/JCR-1079
Project: Jackrabbit
Issue Type: New Feature
Affects Versions: 1.3.1
Reporter: Ard Schrijvers
Priority: Minor
Fix For: 1.4

An XML block of analyzers should be configurable in the
indexing_configuration.xml. In each index-rule, an analyzer can be assigned to
a property. This means that the property will be analyzed with that specific
analyzer. First of all, this enables multilingual indexing.
Documentation needs to be added explaining the difference between searching in
the node scope [jcr:contains(.,'foo')] and in some property
[jcr:contains(@myprop,'foo')]. The node scope will always be searched and
indexed with the default analyzer, which can be configured in the
workspace.xml in the SearchIndex element.
Below a possible indexing_configuration.xml snippet is shown. Also note the
possible enhancement (not sure whether this implementation will have it,
because it requires a lot of filter factories and is probably out of scope).
Adding custom filters which do not need a factory might be easier.
<analyzers>
  <analyzer name="fr"
            class="org.apache.lucene.analysis.fr.FrenchAnalyzer"/>
  <analyzer name="de"
            class="org.apache.lucene.analysis.de.GermanAnalyzer"/>
  <analyzer name="compound"
            class="org.apache.lucene.analysis.SimpleAnalyzer">
    <filter class="jr.StopFilterFactory" words="stopwords.txt"/>
    <filter class="jr.EdgeNGramTokenizerFactory" side="front"
            minGram="1" maxGram="2"/>
  </analyzer>
</analyzers>
<index-rule nodeType="nt:unstructured">
  <property analyzer="fr">bode_fr</property>
  <property analyzer="de">bode_de</property>
</index-rule>
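
The per-property analyzer selection described above can be sketched as a lookup keyed by the property name, with the default analyzer used for the node scope and for any property without its own entry. This is only an illustration: PerPropertyAnalyzerSketch is a hypothetical class, and a plain function from text to tokens stands in for a Lucene Analyzer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical dispatcher; a Function from text to tokens stands in for an analyzer. */
class PerPropertyAnalyzerSketch {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> byProperty = new HashMap<>();

    PerPropertyAnalyzerSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    /** Registers an analyzer for a property, identified only by its JCR-style name. */
    void configure(String propertyName, Function<String, List<String>> analyzer) {
        byProperty.put(propertyName, analyzer);
    }

    /** Uses the configured analyzer for the property, otherwise the default one. */
    List<String> analyze(String propertyName, String text) {
        return byProperty.getOrDefault(propertyName, defaultAnalyzer).apply(text);
    }
}
```

The lookup key is the property name alone, which matches the limitation raised in the comments on this issue: the same name resolves to the same analyzer regardless of which index-rule mentioned it.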


[jira] Commented: (JCR-1079) Extend the IndexingConfiguration to allow configuration of reuseable analyzers

[
https://issues.apache.org/jira/browse/JCR-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523203
]

Marcel Reutegger commented on JCR-1079:
---

That's OK with me, but I think being able to configure an analyzer in an index
rule also seems useful to me.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property


[ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523205
 ] 

Ard Schrijvers commented on JCR-1064:
-

 Ah, I see. That's where the misunderstanding is. Unless otherwise indicated 
 (by static analysis of the query tree, see JCR-1066) a query is 
 executed on both indexes using a MultiReader. This means the query is only 
 executed once and across both indexes. 

Now I am convinced and understand your concerns! :-)

I'll create a new patch with JCR-1093 taken into account, and a default value 
for indexFormatVersion 




[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

[
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523228
]

Ard Schrijvers commented on JCR-1064:
-

Implemented the new indexing format again. There is a subtle difficulty though:

When I have one sysIndex and 2 workspace indices in format style like:

sysIndex = old
ws1Index = old
ws2Index = old

then deleting only the sysIndex will generate a sysIndex in new format style
in index.createInitialIndex().

Since ws1Index and ws2Index are old, the parentQueryHandler should be set to
old index style again. This is implemented.

Now, when you again have

sysIndex = old
ws1Index = old
ws2Index = old

and remove sysIndex *and* ws1Index, then at doInit() we would get

sysIndex = new -> old (changed to old when ws2Index is initialised)
ws1Index = new
ws2Index = old

But querying ws1Index might then give problems, because sysIndex is
reverted to old when ws2Index is initialized. To solve this,
getIndexFormatVersion() always checks whether the parent handler and the
current index use the same format. If not, it defaults back to old style.

This implies that when updating the Jackrabbit version, you will get the
new indexing format style if and only if you re-index all the existing indices
you have so far.

Hope my explanation is clear! I'll prepare the patch.
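
The check described for getIndexFormatVersion() can be sketched as follows. This is not the code from the patch: FormatVersion, HandlerSketch and the immutable parent reference are assumptions, and the sketch ignores the point that the parent's format may change while other workspaces initialise.

```java
/** Hypothetical format versions: V1 is the old style, V2 the new one. */
enum FormatVersion { V1, V2 }

/** Hypothetical query handler with an optional parent (the system index handler). */
class HandlerSketch {
    private final HandlerSketch parent; // null for the system index handler itself
    private final FormatVersion own;    // format the handler's own index was written in

    HandlerSketch(HandlerSketch parent, FormatVersion own) {
        this.parent = parent;
        this.own = own;
    }

    /** The new format is reported only if parent and own index agree; otherwise old style. */
    FormatVersion getIndexFormatVersion() {
        if (parent != null && parent.own != own) {
            return FormatVersion.V1;
        }
        return own;
    }
}
```

In the second scenario above, ws1Index (new) with a sysIndex that has been reverted to old would report old style, which is the fallback the comment describes.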


[jira] Commented: (JCR-926) Global data store for binaries


[ 
https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523229
 ] 

Thomas Mueller commented on JCR-926:


Revision 570407: add DataStore to constructors


[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property


 [ 
https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ard Schrijvers updated JCR-1064:


Attachment: JCR-1064-2.patch

Patch that should implement all previous comments. 



 Optimize queries that check for the existence of a property
 ---

 Key: JCR-1064
 URL: https://issues.apache.org/jira/browse/JCR-1064
 Project: Jackrabbit
  Issue Type: Improvement
  Components: indexing
Affects Versions: 1.3.1
Reporter: Ard Schrijvers
Priority: Minor
 Fix For: 1.4

 Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, 
 JCR-1064-2.patch, JCR-1064-DEPR.patch



[jira] Assigned: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation


 [ 
https://issues.apache.org/jira/browse/JCR-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke reassigned JCR-1094:
---

Assignee: (was: Julian Reschke)

 TCK assumes that repository does not automatically add mixins on node creation
 --

 Key: JCR-1094
 URL: https://issues.apache.org/jira/browse/JCR-1094
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke

 In several places, the TCK assumes that repository does not automatically add 
 mixins on node creation. However, this is explicitly allowed per JSR-170, 
 Section 7.4.4.


[jira] Assigned: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation


 [ 
https://issues.apache.org/jira/browse/JCR-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke reassigned JCR-1094:
---

Assignee: Julian Reschke


[jira] Created: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation

TCK assumes that repository does not automatically add mixins on node creation
--

 Key: JCR-1094
 URL: https://issues.apache.org/jira/browse/JCR-1094
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke


In several places, the TCK assumes that repository does not automatically add 
mixins on node creation. However, this is explicitly allowed per JSR-170, 
Section 7.4.4.


[jira] Commented: (JCR-1094) TCK assumes that repository does not automatically add mixins on node creation


[ 
https://issues.apache.org/jira/browse/JCR-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523248
 ] 

Julian Reschke commented on JCR-1094:
-

Fixed in one place with revision 570442, but the same issue applies to other 
test cases as well.



 TCK assumes that repository does not automatically add mixins on node creation
 --

 Key: JCR-1094
 URL: https://issues.apache.org/jira/browse/JCR-1094
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 In several places, the TCK assumes that repository does not automatically add 
 mixins on node creation. However, this is explicitly allowed per JSR-170, 
 Section 7.4.4.


Re: Database connections queries

2007-08-28 Thread Jukka Zitting

Hi,

On 8/24/07, Jukka Zitting [EMAIL PROTECTED] wrote:
 On 8/22/07, Martijn Hendriks [EMAIL PROTECTED] wrote:
  The current implementation of the LazyScoreNodeIterator ignores all
  RepositoryExceptions that might be thrown while loading the
  nodes in the result set, as required by the javax.jcr.NodeIterator spec.

 I think the correct behaviour would be for the node iterator to throw
 an exception (unfortunately I guess only a RuntimeException would work
 here) instead of ignoring such internal errors.

A thought just crossed my mind... As mentioned in another thread a few
days ago, we could make NodeImpl load the underlying node states
only on demand as long as we have just the node ID, parent ID, and
name of the node available. This way we could delay the exception to
a method like getProperty() that is allowed to throw appropriate
exceptions.

Besides, such a solution would most likely give a nice speedup to
clients that just want to get the names or paths of nodes that match a
query...
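
A minimal sketch of the on-demand loading idea, assuming hypothetical
stand-in types (LazyNode, StateLoader, NodeState, RepoException) rather
than Jackrabbit's actual NodeImpl and item state classes. The ID, parent
ID, and name are available immediately, and a loading failure only
surfaces in getProperty(), which is allowed to throw:

```java
import java.util.Map;

/** Hypothetical stand-in for javax.jcr.RepositoryException (checked). */
class RepoException extends Exception {
    RepoException(String message) { super(message); }
}

/** Hypothetical stand-in for the persisted state of a node. */
class NodeState {
    private final Map<String, String> properties;
    NodeState(Map<String, String> properties) { this.properties = properties; }
    String property(String name) { return properties.get(name); }
}

/** Hypothetical loader; may fail, e.g. if a database connection is lost. */
interface StateLoader {
    NodeState load(String nodeId) throws RepoException;
}

/** Node that knows its ID, parent ID, and name, and loads the rest lazily. */
class LazyNode {
    private final String id;
    private final String parentId;
    private final String name;
    private final StateLoader loader;
    private NodeState state; // stays null until a method needs it

    LazyNode(String id, String parentId, String name, StateLoader loader) {
        this.id = id;
        this.parentId = parentId;
        this.name = name;
        this.loader = loader;
    }

    // These accessors never touch the persistence layer.
    String getId() { return id; }
    String getParentId() { return parentId; }
    String getName() { return name; }

    // The first call loads the state; a loading failure is reported here,
    // where the caller already expects a checked exception.
    String getProperty(String propertyName) throws RepoException {
        if (state == null) {
            state = loader.load(id);
        }
        return state.property(propertyName);
    }
}
```

With this shape, an iterator over query results could hand out LazyNode
instances without loading anything, so clients that only read names or
paths would never trigger a load.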

BR,

Jukka Zitting

Re: Tight coupling to XML configurations

2007-08-28 Thread Alan D. Cabrera



On Aug 27, 2007, at 11:16 PM, Thomas Mueller wrote:


Hi,

Yes, in my view, repository.xml and workspace.xml should go away or at
least be less visible for a user. Or do you mean something else with
XML configuration?


I don't see why we would want to make configuration files less  
visible to the users but that's for a different thread.


Currently, the way the JCR server is booted up is tightly integrated  
w/ XML.  For example, the repository configuration object holds an  
XML snippet that it uses as a template to generate new workspaces.   
This is what I mean by tight coupling.


Ideally, we would have factories.  This gives me more control.


interceptor stacks


Could you provide an example?


The current architecture of Jackrabbit seems to be tightly coupled  
with extensions being implemented via inheritance and overriding  
certain methods.  ATM, when I want to provide virtual properties to a  
node, I have to inherit from an existing persistent manager (PM) and  
override methods such as load(PropertyId).


I was thinking that a JCR is really like a CMP container.  Having  
worked on OpenEJB the use of interceptors immediately springs to  
mind.  We can provide all sorts of cross cutting behavior, e.g.  
security, remoting, tx, by just inserting new interceptors.
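
To make the interceptor idea concrete, here is a rough sketch. All names
in it (LoadRequest, Invocation, Interceptor, InterceptorStack,
VirtualPropertyInterceptor) are hypothetical and are not part of
Jackrabbit or OpenEJB; it only illustrates how virtual properties could
be added by inserting an interceptor rather than subclassing a PM:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical request to load a property from the persistence layer. */
class LoadRequest {
    final String propertyId;
    LoadRequest(String propertyId) { this.propertyId = propertyId; }
}

/** The rest of the chain, ending in the real persistence manager. */
interface Invocation {
    String proceed(LoadRequest request);
}

/** One cross-cutting concern; it may answer itself or delegate to next. */
interface Interceptor {
    String invoke(LoadRequest request, Invocation next);
}

/** Runs interceptors in the order they were added, then the target. */
class InterceptorStack {
    private final List<Interceptor> interceptors = new ArrayList<>();
    private final Invocation target;

    InterceptorStack(Invocation target) { this.target = target; }

    InterceptorStack add(Interceptor interceptor) {
        interceptors.add(interceptor);
        return this;
    }

    String invoke(LoadRequest request) {
        return chainFrom(0).proceed(request);
    }

    // Builds the chain recursively: interceptor i wraps everything after it.
    private Invocation chainFrom(int index) {
        if (index == interceptors.size()) {
            return target;
        }
        Interceptor current = interceptors.get(index);
        Invocation rest = chainFrom(index + 1);
        return req -> current.invoke(req, rest);
    }
}

/** Answers "virtual:" property IDs itself; passes everything else on. */
class VirtualPropertyInterceptor implements Interceptor {
    public String invoke(LoadRequest request, Invocation next) {
        if (request.propertyId.startsWith("virtual:")) {
            return "computed(" + request.propertyId + ")";
        }
        return next.proceed(request);
    }
}
```

Security, remoting, or tx handling would be further Interceptor
implementations added to the same stack, with the existing PM as the
target at the end of the chain.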


Take my comments with a grain of salt; I don't fully grok the  
architecture.



Regards,
Alan

Sling: next steps

2007-08-28 Thread Juan José Vázquez Delgado

Dear Felix,

I sent my CLA to the ASF about four weeks ago. What are the next steps
to get an Apache account?

I completed the CLA with this information:

Full name: Juan José Vázquez Delgado
E-Mail: [EMAIL PROTECTED]

Regards,

Juanjo.

[jira] Created: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties

ReferencesPropertyTest can't deal with multivalued reference properties
---

 Key: JCR-1095
 URL: https://issues.apache.org/jira/browse/JCR-1095
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke


The setUp() method uses prop.getNode(), thus assuming that the reference 
property is not multivalued.



[jira] Resolved: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties


 [ 
https://issues.apache.org/jira/browse/JCR-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved JCR-1095.
-

Resolution: Fixed

Fixed with revision 570520.


 ReferencesPropertyTest can't deal with multivalued reference properties
 ---

 Key: JCR-1095
 URL: https://issues.apache.org/jira/browse/JCR-1095
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 The setUp() method uses prop.getNode(), thus assuming that the reference 
 property is not multivalued.


Re: [VOTE] Approve the Sling project for incubation

2007-08-28 Thread Tobias Bocanegra

 I'd like to call the Jackrabbit PMC to vote on
 sponsoring the Sling project and approving it for incubation.

+1 Approve the Sling project for incubation

cheers, toby
-- 
- [EMAIL PROTECTED] ---
Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97
--- http://www.day.com ---

[jira] Resolved: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties


 [ 
https://issues.apache.org/jira/browse/JCR-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved JCR-1095.
-

Resolution: Fixed

Fixed with revision 570564.


 ReferencesPropertyTest can't deal with multivalued reference properties
 ---

 Key: JCR-1095
 URL: https://issues.apache.org/jira/browse/JCR-1095
 Project: Jackrabbit
  Issue Type: Bug
  Components: JCR TCK
Reporter: Julian Reschke
Assignee: Julian Reschke

 The setUp() method uses prop.getNode(), thus assuming that the reference 
 property is not multivalued.


[jira] Commented: (JCR-1095) ReferencesPropertyTest can't deal with multivalued reference properties