OK, great, happy to help. Yes, a lot of people will want to do this, and I've tried unsuccessfully in the past to get the details surrounding backup techniques, so when I got back from vacation and saw your post, I didn't want to lose the info!
Would you please confirm that I got the information correct, as I haven't
tried this yet? After you confirm, I will update my config, test it, and
then provide my new config as a blueprint and put it in the wiki.

Best,

Mark

On 8/7/07, David Nuescheler <[EMAIL PROTECTED]> wrote:
>
> Hi Mark,
>
> I think this is an excellent idea, thanks a lot for putting in the effort.
>
> I think the case that someone would like to store all their content
> within the same RDBMS is common enough that we even should
> have a blueprint example config in the documentation.
>
> thanks again,
> david
>
>
> On 8/7/07, Mark Waschkowski <[EMAIL PROTECTED]> wrote:
> > Hi David,
> >
> > I would like to update the wiki with the below information, as I think
> > it's quite valuable and would help new users without having to scour
> > the mailing list. If you verify the following, I will update the wiki.
> >
> > -----For wiki:
> > Using DBFileSystem as specified in the repository.xml:
> > <Repository>
> > <FileSystem ...>
> >
> > and using the same database as any of the PersistenceManager entries,
> > the only things that need to be backed up are:
> > 1) repository.xml
> > 2) the database
> >
> > Then, to restore from a backup, all that would need to be done is to
> > use the backed-up repository.xml and restore the database using the
> > backup; the indexes will rebuild themselves when the system restarts.
> > This will properly handle versioning as well.
> >
> > Note: rebuilding of indexes may take a significant amount of time
> > ----end
> >
> > If all that looks correct, I'll fill in an example FileSystem and
> > update the wiki. As well, any suggestions for the 'significant amount
> > of time' part?
> >
> > Thanks,
> >
> > Mark
> >
> > On 7/30/07, David Nuescheler <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi Bruce,
> > >
> > > thanks for your comment.
> > >
> > > > I am not fired by index problems. :-)
> > > > I just want everybody to realize it is a very critical issue to
> > > > back up your repository.
> > > > Currently, the solution is:
> > > > 1) Back up the DB data.
> > > > 2) Back up your file system (you can delete all the indexes in it).
> > > > However, it is still a bug that JackRabbit v1.3 cannot rebuild
> > > > everything from the DB in case your hard drive dies along with
> > > > your whole repository file system.
> > > Shouldn't that be solved by the DBFileSystem?
> > > http://yukatan.fi/2007/1.4/org/apache/jackrabbit/core/fs/db/DbFileSystem.html
> > >
> > > This allows you to store everything that is necessary for a complete
> > > restore in the DB, which means your DB backup is the only thing
> > > (beyond the repository.xml) that you need to restore a complete JR
> > > instance.
> > >
> > > > My concerns are two:
> > > > 1) Performance of navigating nodes, which relates to cache manager
> > > > resizing
> > > I appreciate the performance issue. I am still not convinced that
> > > this is related to the cache manager resizing...
> > >
> > > > 2) Logical backup of the repository using the JCR export/import API.
> > > I agree that it would be desirable to have a built-in backup/restore
> > > mechanism on a higher level.
> > >
> > > The JCR export/import is probably not the right layer,
> > > since it only covers the content in a single workspace and has no
> > > means to address things like nodetypes, versions or the
> > > namespace registry.
> > > And I think your most pressing issue should be addressed
> > > by the DBFileSystem.
> > >
> > > regards,
> > > david
> > >
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED] [mailto: [EMAIL PROTECTED] On Behalf Of Bertrand Delacretaz
> > > > Sent: Friday, July 27, 2007 3:15 AM
> > > > To: [email protected]
> > > > Subject: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
> > > >
> > > > Hi,
> > > >
> > > > I hate to play grumpy old man once again, but the recent trend
> > > > towards Loud Subjects That Catch People's Attention does not really
> > > > help the discussion, so let's rename this thread ;-)
> > > >
> > > > Bruce, if I read your message correctly, it looks like you have
> > > > three problems with Jackrabbit:
> > > >
> > > > 1) Cache Manager resizes seem to slow your app down
> > > > 2) You're going to be fired because you lost your index (or
> > > > Jackrabbit did)
> > > > 3) You're not sure about which application pattern/content model
> > > > to use
> > > >
> > > > So let's please tackle these one at a time, ideally in separate
> > > > threads so that people can contribute efficiently to the discussion.
> > > >
> > > > Sorry if I'm being a bit harsh, but IMHO you started it with the
> > > > choice of your message's subject ;-)
> > > > -Bertrand
> > > >
> > > >
> > > > On 7/27/07, Bruce Li < [EMAIL PROTECTED]> wrote:
> > > > > I have been in this Jackrabbit community for a couple of months,
> > > > > since I joined a repository project two months ago.
> > > > >
> > > > > First, I respect and appreciate all the hard work contributed to
> > > > > the current JackRabbit project, and I am sure a lot of developers
> > > > > benefit from this project. Some people contribute their JackRabbit
> > > > > working experience, like David Nuescheler, who collects the "7 DR
> > > > > Rules", which are precious given the current lack of JackRabbit
> > > > > documentation, and they are "real" working experiences.
> > > > >
> > > > > However, I have also heard some negative voices from this
> > > > > community, like "JackRabbit is dead (for us)" from Frédéric
> > > > > Esnault. I have suffered some troubles with JackRabbit, and they
> > > > > seem to be foundational problems. I would like to share all my
> > > > > experience with you, and any feedback or good suggestion is
> > > > > definitely what I want.
> > > > >
> > > > > Since these troubles are "big" troubles for enterprise use of
> > > > > JackRabbit 1.3, let's discuss them from the beginning.
> > > > >
> > > > > Question 1:
> > > > >
> > > > > Why do you select JackRabbit rather than a database as your
> > > > > repository solution?
> > > > >
> > > > > There are a lot of answers to this question, and it seems that
> > > > > everybody who joins this community already knows the answers (it
> > > > > may be a formal document which was approved by your CTO). However,
> > > > > in my opinion, this is the basic question that really needs to be
> > > > > discussed here.
> > > > >
> > > > > To answer this question, some technical key words in support of
> > > > > Jackrabbit may be "JCR API", "Lucene Search Engine" and so on.
> > > > > However, as a user of JackRabbit, I would like to list the two
> > > > > key reasons why I selected JackRabbit as a repository solution
> > > > > from a product point of view:
> > > > >
> > > > > 1. Quick and effective data search/fetch from a high-volume
> > > > > content repository
> > > > > 2. Built-in content version/revision control without extra code
> > > > >
> > > > > Now let me describe the big troubles I met in my use:
> > > > >
> > > > > 1. Quick and effective data search or fetch from a high-volume
> > > > > content repository
> > > > >
> > > > > Experience: There is not much data in my repository, which
> > > > > contains hundreds of nodes of two major object types. Each node
> > > > > (object) contains fewer than 20 properties (fields), including 5
> > > > > child nodes (nested small objects), and one of the two major
> > > > > nodes (objects) has one binary data item (up to 1 megabyte).
> > > > > Unfortunately, the performance is not acceptable when I navigate
> > > > > the nodes of the major nodes.
> > > > > The main problem is that the built-in Cache Manager of JackRabbit
> > > > > resizes, which costs an unpredictable amount of time and
> > > > > sometimes makes the operation very slow. It is not easy to read
> > > > > that code when debugging Jackrabbit for performance tuning,
> > > > > because there is no documentation about the logic behind the
> > > > > index resizing.
> > > > >
> > > > > 2. Content version/revision control
> > > > >
> > > > > Experience: This function works well on Jackrabbit v1.3. The main
> > > > > problem is that all revisions (except the base revision) of a
> > > > > node are lost when exporting/importing data from one repository
> > > > > to another. I am discussing this issue because it concerns
> > > > > repository backup.
> > > > >
> > > > > I just found that in JackRabbit v1.3 there is no way to back up
> > > > > a repository that uses a DB as persistence manager. I mean that
> > > > > there is no way to re-index based on the data in the DB. The
> > > > > following is my case:
> > > > >
> > > > > On one repository server, the index (in the file system) is
> > > > > corrupt, which causes all searches to fail. However, all data (in
> > > > > the DB) is still alive, and you can iterate over all of it. After
> > > > > cleaning the whole repository file system (most of it is index
> > > > > information), Jackrabbit cannot correctly rebuild the index based
> > > > > on the data in the DB. If that happens on a production
> > > > > repository, it means: "My God, I am going to be fired". As far as
> > > > > I know, Jackrabbit v1.1 can successfully re-index (creating a
> > > > > totally new repository index (file system) based on DB data).
> > > > >
> > > > > As an alternative solution to back up the repository, I tried to
> > > > > export/import all nodes from one repository to another using the
> > > > > JCR Export API (exportSystemView).
> > > > > The good news is that Jackrabbit v1.3 successfully builds the
> > > > > index (the whole file system) during the import process; the bad
> > > > > news is that it lost all revisions of all versioned nodes. Can
> > > > > you imagine how frustrated I was when I realized there is no way
> > > > > to back up the repository based on DB data?
> > > > >
> > > > > I just got the answer to the re-index issue for Jackrabbit v1.3:
> > > > > You CAN NOT delete the whole file system. Only delete all the
> > > > > indexes, but keep the other folders. Jackrabbit can then re-index
> > > > > successfully when it starts up.
> > > > >
> > > > > Question 2:
> > > > >
> > > > > How can developers correctly use Jackrabbit (JCR) as their
> > > > > repository solution?
> > > > >
> > > > > The Jackrabbit experts may see that I use objects to describe
> > > > > nodes, and you may think that is not the pattern in which you use
> > > > > Jackrabbit. So the question is raised: "What is the best practice
> > > > > (pattern) for using Jackrabbit (JCR) as a repository solution?"
> > > > >
> > > > > From this community, I see that a lot of developers use
> > > > > Jackrabbit by fetching content by path. It means that they do not
> > > > > need to treat a node as an object; instead, they put content in
> > > > > the repository as an asset, which can be easily and effectively
> > > > > retrieved by a given path. This pattern exactly matches the truth
> > > > > that "simplicity is best".
> > > > >
> > > > > My use of Jackrabbit is based on a business requirement that
> > > > > needs to navigate most of the nodes and reference nodes, and
> > > > > check child nodes and properties, to find the proper content by
> > > > > a couple of business rules. I would like to say that all the
> > > > > performance issues are raised by the node iteration process.
> > > > > Moreover, I have created generic classes using the Java
> > > > > reflection package for bi-directional mapping between nodes and
> > > > > objects.
> > > > > For performance improvement, the mapping supports generic lazy
> > > > > loading of child nodes. However, it seems all this work does not
> > > > > solve the performance problem, although it sounds pretty
> > > > > "professional". You may ask me: if you have such a business
> > > > > requirement, why not go to a DB and build the full relationships
> > > > > for your business model? J2EE developers all know how powerful
> > > > > the Java-DB world is: mature ORM tools (e.g. Hibernate),
> > > > > transaction management, batch data fetching, performance tuning
> > > > > and so on. However, my question is: "Is there any good pattern in
> > > > > current Jackrabbit to effectively handle data fetching with weak
> > > > > relationships?"
> > > > >
> > > > > Now it is time to say to the Jackrabbit developers and
> > > > > contributors what I really want to say to the whole community:
> > > > >
> > > > > My plea:
> > > > >
> > > > > Guides, documentation and sample code are king for any open
> > > > > source project. How frustrating it is for Jackrabbit developers
> > > > > to find that an incorrect pattern is applied by users in their
> > > > > projects. On the other hand, how frustrating it is for JackRabbit
> > > > > users not to find a good pattern to follow, which could save them
> > > > > a lot of time. From a product point of view, searching by XPath,
> > > > > XQuery or SQL is not the foundational issue. The foundational
> > > > > issue is that one effective means of search should cover most of
> > > > > the important real-world requirements, and that its documentation
> > > > > can be found on the Jackrabbit web site.
> > > > >
> > > > > I do believe Jackrabbit is a quality project, and I really hope
> > > > > all its "best features" are documented, demoed and used by the
> > > > > whole community.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Bruce
> > > > >
> > >
> > --
> > Best,
> >
> > Mark Waschkowski
>

--
Best,
Mark Waschkowski
