Hi,

On 4/27/07, Stefan Kurla <[EMAIL PROTECTED]> wrote:
I guess this is more suited for the dev list.
Yep.
How is the data actually stored in Jackrabbit, say when using MySQL with just the default workspace?
A good starting point in understanding the underlying storage model of Jackrabbit is to look at the PersistenceManager interface [1]. The actual physical storage model depends on the persistence manager implementation you are using, but the logical model is fixed by the interface. The PersistenceManager abstraction essentially treats all nodes and properties as individually addressable items that each have their own unique identifier. In addition to these items the interface also defines a mechanism to store and access all the references pointing to a node.
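To make the logical model concrete, here is a minimal sketch in Python. It is not the Jackrabbit API: the class name `ItemStore`, its method names, and the dictionary-based states are all made up for illustration. It only shows the three kinds of lookups described above: nodes by identifier, properties by identifier, and the incoming references of a node.

```python
from dataclasses import dataclass, field


@dataclass
class ItemStore:
    """Toy stand-in for the logical model; hypothetical, not the Jackrabbit API."""
    nodes: dict = field(default_factory=dict)       # node id -> node state
    properties: dict = field(default_factory=dict)  # property id -> property state
    references: dict = field(default_factory=dict)  # target node id -> referencing property ids

    def store_node(self, node_id, state):
        self.nodes[node_id] = state

    def store_property(self, prop_id, state):
        self.properties[prop_id] = state

    def store_references(self, target_node_id, referencing_prop_ids):
        self.references[target_node_id] = list(referencing_prop_ids)

    def load_node(self, node_id):
        return self.nodes[node_id]

    def load_property(self, prop_id):
        return self.properties[prop_id]

    def load_references(self, target_node_id):
        # A node with no incoming references simply has no entry.
        return self.references.get(target_node_id, [])


store = ItemStore()
store.store_node("node-1", {"children": ["node-2"]})
store.store_node("node-2", {"children": []})
# Property ids are modelled here as (parent node id, property name) pairs.
store.store_property(("node-1", "link"), {"type": "Reference", "values": ["node-2"]})
store.store_references("node-2", [("node-1", "link")])
```

Every item is reachable only through its own identifier; nothing in the store itself knows about paths or the tree shape beyond what each node state records.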
There is the default_binval table, which has binval_id and binval_data. ### Is this table used to store binary data, where binval_id is the uuid of the jcr:content that this is referring to and binval_data is the actual bytestream blob data?
Yes, the binval table stores binary properties when the externalBLOBs configuration option is set to "false". The binval_id column contains the property identifier plus value index (because of multivalued properties) used to identify the binary value, and the binval_data column contains the actual byte stream.
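As a rough illustration of why the value index is part of the key, the sketch below builds one row per value of a multivalued binary property. The key layout in `binval_id` is a guess made up for this example and may not match the actual column format; the function names and the `attachments` property name are hypothetical too.

```python
def binval_id(parent_node_id: str, prop_name: str, value_index: int) -> str:
    # Hypothetical key layout: property identifier plus value index.
    # The real format used in the binval_id column may differ.
    return f"{parent_node_id}.{prop_name}[{value_index}]"


def binval_rows(parent_node_id, prop_name, values):
    """Return one (binval_id, binval_data) row per binary value."""
    return [
        (binval_id(parent_node_id, prop_name, i), data)
        for i, data in enumerate(values)
    ]


# A multivalued binary property with two values yields two distinct rows.
rows = binval_rows("node-1", "attachments", [b"first", b"second"])
```

Without the index, both values of the property would map to the same key and overwrite each other.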
There is default_node which has node_id and node_data. ### How is this used?
The node_id column contains the unique node identifier and the node_data column contains the node state in a serialized format [2].
default_prop with prop_id and prop_data ### How is this used?
The prop_id column contains the property identifier, and the prop_data column contains the property state in a serialized format [2].
default_refs with node_id and refs_data ### How is this used?
The node_id contains the identifier of the reference target node, and the refs_data contains the list of referencing property identifiers in a serialized format [2].
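The node, property and refs tables can be mimicked with an in-memory SQLite database to see how the rows relate. This is only a sketch under several assumptions: the column types are guesses, JSON stands in for the real serialized format [2], the `uuid-parent/link` style of property identifier is invented, and the helper names `blob` and `referrers` are hypothetical. Binary values are left out.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names follow the ones discussed above; types are assumed.
conn.executescript("""
    CREATE TABLE default_node (node_id TEXT PRIMARY KEY, node_data BLOB);
    CREATE TABLE default_prop (prop_id TEXT PRIMARY KEY, prop_data BLOB);
    CREATE TABLE default_refs (node_id TEXT PRIMARY KEY, refs_data BLOB);
""")


def blob(value):
    # JSON is a stand-in for the real serialized format.
    return json.dumps(value).encode("utf-8")


# Two nodes: the parent's serialized state lists its child's identifier.
conn.execute("INSERT INTO default_node VALUES (?, ?)",
             ("uuid-parent", blob({"children": ["uuid-target"]})))
conn.execute("INSERT INTO default_node VALUES (?, ?)",
             ("uuid-target", blob({"children": []})))
# One reference property on the parent, pointing at the target node.
conn.execute("INSERT INTO default_prop VALUES (?, ?)",
             ("uuid-parent/link",
              blob({"type": "Reference", "values": ["uuid-target"]})))
# The refs row is keyed by the *target* node and lists the referrers.
conn.execute("INSERT INTO default_refs VALUES (?, ?)",
             ("uuid-target", blob(["uuid-parent/link"])))


def referrers(target_node_id):
    """Return the property identifiers that reference the given node."""
    row = conn.execute(
        "SELECT refs_data FROM default_refs WHERE node_id = ?",
        (target_node_id,),
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Calling `referrers("uuid-target")` answers "who points at this node?" with a single keyed lookup, which is what the separate refs table is for.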
Say the structure is

/
--folderA:nt:folder (propertyX:references fileB)
----fileA:nt:file
--fileB:nt:file

[...]

My question then is how the database would store the uuids or nodes of the structure defined above. It is a very simple structure, but understanding how it is actually translated for storage in the database would be helpful.
You'd have four node rows: the root node, folderA, fileA, and fileB. The serialized node_data part of the root and folderA nodes would contain the node identifiers of the child nodes (folderA and fileB for the root node, and fileA for folderA). All properties would be stored in the property table. Additionally the reference from propertyX to fileB would be stored as a separate refs row with the fileB UUID as the node_id value and a serialized property identifier list that contains just the propertyX identifier as the refs_data value.

I hope this description helps. Note that this only applies to the traditional database persistence managers. The new bundle persistence managers in Jackrabbit 1.3 work a bit differently, though the same identifier->data structure is still in use.

BR,

Jukka Zitting

[1] http://jackrabbit.apache.org/api/1.2.1/org/apache/jackrabbit/core/persistence/PersistenceManager.html
[2] http://jackrabbit.apache.org/api/1.2.1/org/apache/jackrabbit/core/persistence/util/Serializer.html
