Re: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)

David Nuescheler Mon, 30 Jul 2007 02:15:21 -0700

Hi Bruce,

thanks for your comment.


> I have not been fired over the index problems. :-)
> I just want everybody to realize that backing up your repository is a very 
> critical issue.
> Currently, the solution is:
> 1) Back up the DB data.
> 2) Back up your file system; you can delete all the indexes from it.
> However, it is still a bug that Jackrabbit v1.3 cannot rebuild everything 
> from the DB in
> case your hard drive dies along with your whole repository file system.
Shouldn't that be solved by the DbFileSystem?
http://yukatan.fi/2007/1.4/org/apache/jackrabbit/core/fs/db/DbFileSystem.html

This allows you to store everything that is necessary for a complete restore
in the DB, which means your DB backup is the only thing (beyond the
repository.xml) that you need to restore a complete JR instance.
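
For reference, a rough sketch of what that could look like in repository.xml.
The parameter names are the ones I recall from the DbFileSystem Javadoc; the
driver, URL, credentials and prefix values are placeholders I made up, so
adapt them to your own database:

```xml
<!-- Hypothetical values: driver, url, user, password and prefix are placeholders -->
<FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
  <param name="driver" value="com.mysql.jdbc.Driver"/>
  <param name="url" value="jdbc:mysql://localhost/jackrabbit"/>
  <param name="user" value="jcr"/>
  <param name="password" value="secret"/>
  <param name="schema" value="mysql"/>
  <param name="schemaObjectPrefix" value="rep_"/>
</FileSystem>
```

As far as I know, a FileSystem element appears at the repository level, inside
each Workspace and inside Versioning, so each one would need its own
schemaObjectPrefix to keep the tables apart.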

> My concerns are two:
> 1) Performance of node navigation, which relates to cache manager resizing
I appreciate the performance issue, but I am still not convinced that it
is related to the cache manager resizing...

> 2) Logical backup of the repository using the JCR export/import API.
I agree that it would be desirable to have a built-in backup/restore
mechanism on a higher level.

The JCR export/import is probably not the right layer,
since it only covers the content in a single workspace and has no
means to address things like nodetypes, versions or the
namespace registry.
And I think your most pressing issue should be addressed
by the DbFileSystem.

regards,
david

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bertrand 
> Delacretaz
> Sent: Friday, July 27, 2007 3:15 AM
> To: [email protected]
> Subject: Jackrabbit = Kick Ass Tool (was: Jackrabbit = Big Trouble??)
>
> Hi,
>
> I hate to play grumpy old man once again, but the recent trend towards
> Loud Subjects That Catch People's Attention does not really help the
> discussion, so let's rename this thread ;-)
>
> Bruce, if I read your message correctly, it looks like you have three
> problems with Jackrabbit:
>
> 1) Cache Manager resizes seem to slow your app down
> 2) You're going to be fired because you lost your index (or Jackrabbit did)
> 3) You're not sure about which application pattern/content model to use
>
> So let's please tackle these one at a time, ideally in separate
> threads so that people can contribute efficiently to the discussion.
>
> Sorry if I'm being a bit harsh, but IMHO you started it with the
> choice of your message's subject ;-)
> -Bertrand
>
>
> On 7/27/07, Bruce Li <[EMAIL PROTECTED]> wrote:
> > I have been in this Jackrabbit community for a couple of months, since I 
> > joined a repository project two months ago.
> >
> >
> >
> > First, I respect and appreciate all the hard work contributed to the 
> > Jackrabbit project, and I am sure a lot of developers benefit 
> > from it. Some people contribute their Jackrabbit 
> > working experience, like David Nuescheler, who collected the "7 DR Rules", which 
> > are precious given the current lack of Jackrabbit documentation, and they are 
> > "real" working experiences.
> >
> >
> >
> > However, I have also heard some negative voices in this community, like 
> > "JackRabbit is dead (for us)" from Frédéric Esnault. I have had some trouble 
> > with Jackrabbit myself, and the problems seem fundamental. I would like to share 
> > my experience with you, and any feedback or good suggestions are 
> > definitely what I want.
> >
> >
> >
> > Since these troubles are "big" troubles for enterprise use of Jackrabbit 
> > 1.3, let's discuss them from the beginning.
> >
> >
> >
> > Question 1:
> >
> > Why do you select Jackrabbit rather than a database as your repository 
> > solution?
> >
> >
> >
> > There are a lot of answers to this question, and it seems that everybody 
> > who joins this community already knows them (perhaps from a formal 
> > document approved by your CTO).  However, in my opinion, this is the 
> > basic question that really needs to be discussed here.
> >
> >
> >
> > To answer this question, some technical keywords in support of Jackrabbit may 
> > be "JCR API", "Lucene Search Engine" and so on. However, as a user of 
> > Jackrabbit, I would like to list the two key reasons why I selected 
> > Jackrabbit as a repository solution from a product point of view:
> >
> >
> >
> > 1.      Quick and effective data search/fetch from a large-volume content repository
> > 2.      Built-in content version/revision control without extra code
> >
> >
> >
> > Now let me describe the big troubles I ran into:
> >
> > 1.      Quick and effective data search or fetch from a large-volume content 
> > repository
> >
> >
> >
> > Experience: There is not much data in my repository, which contains 
> > hundreds of nodes of two major object types. Each node (object) has fewer than 
> > 20 properties (fields), along with 5 child nodes (nested small 
> > objects), and one of the two major node types carries one binary value (up to 1 
> > megabyte). Unfortunately, performance is not acceptable when I navigate 
> > the major nodes. The main problem is that Jackrabbit's built-in Cache Manager 
> > resizes, which takes an unpredictable amount of time and sometimes makes 
> > operations very slow.  It is not easy to read that code when debugging 
> > Jackrabbit for performance tuning, because there is no documentation about the 
> > logic behind the index resizing.
> >
> >
> >
> > 2.      Content version/revision control
> >
> > Experience: This function works well in Jackrabbit v1.3. The main problem 
> > is that all revisions of a node (except the base revision) are lost when 
> > exporting/importing data from one repository to another. I am 
> > discussing this issue because it concerns repository backup.
> >
> >
> >
> > I just found that in Jackrabbit v1.3 there is no way to back up a repository 
> > that uses a DB as its persistence manager. I mean that there is no way to 
> > re-index based on the data in the DB. The following is my case:
> >
> >
> >
> > On one repository server, the index (in the file system) is corrupt, which 
> > causes all searches to fail. However, all the data (in the DB) is still intact, 
> > and you can iterate over all of it. After cleaning out the whole repository file 
> > system (most of which is index information), Jackrabbit cannot correctly rebuild 
> > the index based on the data in the DB. If that happens on a production 
> > repository, it means: "My God, I am going to be fired". As far as I know, 
> > Jackrabbit v1.1 can successfully re-index (creating a totally new repository 
> > index (file system) based on the DB data).
> >
> >
> >
> > As an alternative way to back up the repository, I tried to export/import 
> > all nodes from one repository to another using the JCR export API 
> > (exportSystemView). The good news is that Jackrabbit v1.3 successfully 
> > builds the index (the whole file system) during the import; the bad 
> > news is that it loses all revisions of all versioned nodes. Can you imagine 
> > how frustrated I was when I realized there was no way to back up the repository 
> > based on the DB data?
> >
> >
> >
> > I just got the answer to the re-index issue for Jackrabbit v1.3: you CANNOT 
> > delete the whole file system. Delete only the indexes and keep the other 
> > folders; Jackrabbit then re-indexes successfully when it starts up.
> >
> >
> >
> > Question 2:
> >
> > How can developers correctly use Jackrabbit (JCR) as their repository 
> > solution?
> >
> >
> >
> > Jackrabbit experts may notice that I use objects to describe nodes, and you 
> > may think that is not the pattern you follow with Jackrabbit. So the question 
> > becomes: "What is the best practice (pattern) for using Jackrabbit (JCR) 
> > as a repository solution?"
> >
> >
> >
> > In this community, I see a lot of developers using Jackrabbit by fetching 
> > content by path. That means they do not need to treat nodes as objects; 
> > instead, they put content into the repository as assets, which can be easily and 
> > effectively retrieved by a given path. This pattern exactly matches the maxim 
> > "simplicity is best".
> >
> >
> >
> > My use of Jackrabbit is driven by business requirements, which need me to 
> > navigate most of the nodes and referenced nodes, and to check child nodes and 
> > properties to find the proper content according to a couple of business rules. I 
> > would say that all the performance issues arise from the node iteration 
> > process. Moreover, I have created generic classes using the Java reflection 
> > package for bidirectional mapping between nodes and objects. For better 
> > performance, the mapping supports generic lazy loading of child nodes. 
> > However, it seems none of this work solves the performance problem, 
> > although it sounds pretty "professional".  You may ask me: if you have 
> > such business requirements, why not go to a DB and build the full relationships 
> > for your business model? J2EE developers all know how powerful the Java-DB 
> > world is: mature ORM tools (e.g. Hibernate), transaction management, 
> > batch data fetching, performance tuning and so on. However, my question is: 
> > "Is there any good pattern in current Jackrabbit to effectively handle data 
> > fetching with weak relationships?"
> >
> >
> >
> > Now it is time to say a few words to the Jackrabbit developers and 
> > contributors, and really to the whole community:
> >
> >
> >
> > My pleas:
> >
> > Guides, documentation and sample code are king for any open source project. How 
> > frustrating it is for Jackrabbit developers to find users applying the wrong 
> > pattern in their projects. On the other hand, how frustrating it is for 
> > Jackrabbit users not to find a good pattern to follow, which could save 
> > them a lot of time. From a product point of view, whether search is by XPath, 
> > XQuery or SQL is not the fundamental issue. The fundamental issue is having one 
> > effective means of search that covers most of the important real-world 
> > requirements, with documentation that can be found on the Jackrabbit web site.
> >
> >
> >
> >
> >
> > I do believe Jackrabbit is a quality project, and I really hope all its "best 
> > features" get documented, demoed and used by the whole community.
> >
> >
> >
> > Thanks
> >
> >
> >
> > Bruce
>
