hi jukka, nice to hear about your plans for jackrabbit 3, and thanks for asking about requirements. please see my answers below.
best regards, ulf.

--
ulf schneider
+49 163 2505164
datenlabor gmbh
registered office: paderborn, hrb 8819
managing director: ulf schneider
www.datenlabor.net
ibm business partner

On 09.02.2010 at 16:55, Jukka Zitting wrote:

> Hi,
>
> Now that Jackrabbit 2.0 is out and the major JCR 2.0 feature work is
> done, it's time to start looking ahead at Jackrabbit 3. We've talked
> about this a bit already at Day and I'll be posting a summary of our
> ideas for further discussion, but before that I'd like to frame the
> discussion by getting a better picture of the range of requirements
> we'll be having for Jackrabbit 3.
>
> So, please let us know what you expect your repositories to look like
> within the next five or so years. I'm especially interested in answers
> to the following questions:
>
> Scalability:
> * How much content (number of documents/nodes, raw amount data in
>   GB/TB/PB) do you have in the repository?

-> up to 500,000 documents per year / up to a few TB / can grow to several hundred workspaces

the repository is intended to serve as an external memory for all project-related documentation, and each new project would get its own workspace. over the years this could produce several hundred workspaces.

> * How many (concurrent) users (readers/editors/administrators) does
>   your repository have?

-> up to a few thousand concurrent readers, a few hundred concurrent editors, fewer than 10 admins

> * Do you need Internet-scale (millions of users or exabytes of
>   content) features?

-> currently not

>
> Deployment:
> * Do you run the repository on a single server, on a cluster or in the cloud?

-> cluster

> * How many and how powerful servers do you use for the repository?

-> fewer than 10 servers

>
> Content model:
> * Do you need support for flat content hierarchies (>>10k sibling nodes)?

-> there are some cases where it would be useful

> * Do you need support for same-name siblings?

-> no

> * If you use versioning, how actively (commit on all saves / commit
>   only at major milestones) and for what purpose (revision history,
>   backup, etc.) do you use it?

-> there are cases where a commit is needed on every save to keep a revision history, but very often versioning is used for major milestones only.

> * How granular (hierarchies of small properties vs. big binary blobs)
>   is your content?

-> nodes with 10 to 50 properties, plus binary data for file attachments stored in nt:folder/nt:file nodes

> * How much of your content access is based on search / tree traversal
>   / following references?

-> full range: tree traversal, xpath and full-text search, and even references are used.

> * How much you rely on the repository to enforce your content model
>   (node type constraints, etc.)?

-> not much, the application itself drives the content model by using properties

> * How often you modify your content model (and/or related node types)?

-> node types have not been changed, and the model evolves in tiny steps that normally do not require specific upgrade scripts.

>
> Features:
> * Do you need full ACID semantics? Is an "eventually consistent"
>   system good enough for you?

-> currently not needed

> * Do you need more powerful search features than what we now have?

-> currently not

> * How important is observation to your application? Do you need
>   trigger-like capability that can modify or reject a save() operation?

-> it is currently not used, but its use is being considered

>
> Feel free to answer either based on your current usage patterns or to
> predict your needs for the next few years. The further ahead in the
> future you can reasonably predict, the better.
>
> Note that I intentionally restricted this set of questions to core
> repository features, I'll do a poll on favorite new features later on.
>
> BR,
>
> Jukka Zitting
