Nikhil- I think you should rethink your architecture. It really doesn't make sense to bring repository instances up just for a 2-4 minute job. Instead, you should think about using the Command pattern and packaging your "applications" as executable jobs which can be run inside a long-running VM against a local repository instance (i.e. making in-process calls instead of RMI or DavEx).
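To make the idea concrete, here is a minimal sketch of the Command-pattern approach, not a definitive implementation. The names `RepositoryJob` and `JobHost` are hypothetical, and the type parameter `S` stands in for whatever session type you use (in a Jackrabbit setup that would presumably be `javax.jcr.Session`, with the factory wrapping `repository.login(...)` and the closer calling `logout()`).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** A unit of work packaged as a command. S is the session type (assumption: javax.jcr.Session in practice). */
interface RepositoryJob<S> {
    void execute(S session) throws Exception;
}

/** Hypothetical long-running host: the repository lives as long as the host, jobs come and go. */
final class JobHost<S> implements AutoCloseable {
    private final Supplier<S> sessionFactory;  // opens a session with an in-process call
    private final Consumer<S> sessionCloser;   // releases the session when the job is done
    private final ExecutorService workers;

    JobHost(Supplier<S> sessionFactory, Consumer<S> sessionCloser, int threads) {
        this.sessionFactory = sessionFactory;
        this.sessionCloser = sessionCloser;
        this.workers = Executors.newFixedThreadPool(threads);
    }

    /** Queues a job; each job gets its own session, which is always closed afterwards. */
    Future<?> submit(RepositoryJob<S> job) {
        return workers.submit(() -> {
            S session = sessionFactory.get();
            try {
                job.execute(session);
            } finally {
                sessionCloser.accept(session);
            }
            return null;
        });
    }

    /** Stops accepting jobs and waits up to a minute for running ones to finish. */
    @Override
    public void close() {
        workers.shutdown();
        try {
            workers.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this shape the 2-4 minute "applications" become `RepositoryJob` implementations submitted to one host process, so only that process holds the repository home and its lock file.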
This is where something like OSGi and Apache Sling can be *very* helpful, but there are obviously other ways to add/remove jobs at runtime. See, for example, Sling's Scheduler support: http://sling.apache.org/site/scheduler-service-commons-scheduler.html

Justin

On Tue, Nov 16, 2010 at 5:16 AM, <[email protected]> wrote:
> Thanks for your inputs, they are really helpful.
>
> Well, so does my application is not a good candidate to use jackrabbit.
>
> The other option, I had was to use jackrabbit in client-server mode. In this
> case I will be accessing the repository from RMI. But in the jackrabbit
> documents it has been mentioned that RMI is not optimized for performance and
> I should use embedded repository instance in my application code for better
> performance.
>
> I can remove the search functionality from these clusters, because the life
> span of these will be very short. The application will take 2-4 minutes to do
> its job and I don't think we really need search for these clusters.
>
> But my question is, should I really use the clustering feature. I mean
> cluster nodes should normally have a longer life span. But here in this case
> the nodes will have very short life span 2-4 minutes.
> I am kind of finding it hard to use these short span applications as cluster
> nodes.
>
> Thanks,
> Nikhil
>
> -----Original Message-----
> From: Seidel. Robert [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 3:33 PM
> To: [email protected]
> Subject: AW: Multiple instances of repository
>
> Hi Nikhil,
>
> I don't know if it will work (setProperty), but you have another problem. The
> Lucene search index is always saved in the file system. And afaik, each
> repository home needs its own index directories (so you have the index files
> for each cluster). If you make a new cluster, you have to wait for a long
> time till the index is built, depending on the data in your repository (if
> you have tons of data, you have to wait a week or longer).
>
> The tables of the FS and PM will be shared between all cluster nodes - that
> works.
>
> Kindly regards, Robert
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 10:54
> To: [email protected]
> Subject: RE: Multiple instances of repository
>
> Since there could be n number of instances. So I can't decide the cluster id
> beforehand.
> Hence I have the following code that creates a cluster id at run time.
>
> System.setProperty("org.apache.jackrabbit.core.cluster.node_id", "cluster_id"+System.nanoTime());
>
> Similarly the repositoryHome path is generated at run time.
>
> But do I also need separate tables for workspace file system? I have the
> following configuration for my workspace. Is it correct? The tables for the
> workspace FS and PersistenceManager will be shared between all the nodes or
> will these tables will be different?
>
> <?xml version="1.0"?>
> <!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN"
>           "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
>
> <Repository>
>
>   <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
>     <param name="driver" value="javax.naming.InitialContext"/>
>     <param name="url" value="jdbc/amiDBDataSource"/>
>     <param name="databaseType" value="oracle"/>
>     <param name="copyWhenReading" value="true"/>
>     <param name="tablePrefix" value=""/>
>     <param name="schemaObjectPrefix" value="J_R_DS_"/>
>     <param name="schemaCheckEnabled" value="false"/>
>   </DataStore>
>
>   <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
>     <param name="driver" value="javax.naming.InitialContext"/>
>     <param name="url" value="jdbc/amiDBDataSource"/>
>     <!-- The following value must oracle for oracle server this is not the same as the database schema -->
>     <param name="schema" value="oracle"/>
>     <param name="schemaObjectPrefix" value="J_R_FS_"/>
>     <param name="schemaCheckEnabled" value="false"/>
>   </FileSystem>
>
>   <Security appName="Jackrabbit">
>     <SecurityManager class="repository.jcr.jackrabbit.EipSecurityManager" />
>     <AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager" />
>     <LoginModule class="org.apache.jackrabbit.core.security.SimpleLoginModule">
>       <param name="principalProvider" value="repository.jcr.jackrabbit.EipPrincipalProvider" />
>     </LoginModule>
>   </Security>
>
>   <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="eip" />
>
>   <Workspace name="${wsp.name}">
>     <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
>       <param name="driver" value="javax.naming.InitialContext"/>
>       <param name="url" value="jdbc/amiDBDataSource"/>
>       <!-- The following value must oracle for oracle server this is not the same as the database schema -->
>       <param name="schema" value="oracle"/>
>       <param name="schemaObjectPrefix" value="J_FS_${wsp.name}_"/>
>       <param name="schemaCheckEnabled" value="false"/>
>     </FileSystem>
>     <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager">
>       <param name="driver" value="javax.naming.InitialContext"/>
>       <param name="url" value="jdbc/amiDBDataSource"/>
>       <param name="tableSpace" value="" />
>       <!-- The following value must oracle for oracle server this is not the same as the database schema -->
>       <param name="schema" value="oracle" />
>       <param name="schemaObjectPrefix" value="J_PM_${wsp.name}_" />
>       <param name="externalBLOBs" value="false" />
>       <param name="schemaCheckEnabled" value="false"/>
>     </PersistenceManager>
>     <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>       <param name="path" value="${wsp.home}/index"/>
>       <param name="supportHighlighting" value="true"/>
>     </SearchIndex>
>   </Workspace>
>
>   <Versioning rootPath="${rep.home}/version">
>
>     <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
>       <param name="driver" value="javax.naming.InitialContext"/>
>       <param name="url" value="jdbc/amiDBDataSource"/>
>       <!-- The following value must oracle for oracle server this is not the same as the database schema -->
>       <param name="schema" value="oracle"/>
>       <param name="schemaObjectPrefix" value="J_V_FS_"/>
>       <param name="schemaCheckEnabled" value="false"/>
>     </FileSystem>
>     <!-- Change to Oracle Class <PersistenceManager class="org.apache.jackrabbit.core.state.db.SimpleDbPersistenceManager"> -->
>     <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager">
>       <param name="driver" value="javax.naming.InitialContext"/>
>       <param name="url" value="jdbc/amiDBDataSource"/>
>       <param name="tableSpace" value="" />
>       <!-- The following value must oracle for oracle server this is not the same as the database schema -->
>       <param name="schema" value="oracle" />
>       <param name="schemaObjectPrefix" value="J_V_PM_" />
>       <param name="externalBLOBs" value="false" />
>       <param name="schemaCheckEnabled" value="false"/>
>     </PersistenceManager>
>
>   </Versioning>
>
>   <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>     <param name="path" value="${rep.home}/search/index"/>
>     <param name="supportHighlighting" value="true"/>
>   </SearchIndex>
>
>   <Cluster syncDelay="2000">
>     <Journal class="org.apache.jackrabbit.core.journal.OracleDatabaseJournal">
>       <param name="revision" value="${rep.home}/revision.log" />
>       <param name="driver" value="javax.naming.InitialContext"/>
>       <param name="url" value="jdbc/amiDBDataSource"/>
>       <param name="schemaObjectPrefix" value="J_R_" />
>       <param name="databaseType" value="oracle"/>
>     </Journal>
>   </Cluster>
>
> </Repository>
>
> Thanks,
> Nikhil
> -----Original Message-----
> From: Seidel. Robert [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 2:42 PM
> To: [email protected]
> Subject: AW: Multiple instances of repository
>
> Hi Nikhil,
>
> you need clustering, because all of your instances should access the same
> repository.
>
> What you need is separate repository homes for each instance. In my use case
> I have an installation directory for each instance, so the repository home is
> located below this directory.
>
> You have to make sure, that each instance has also its own repository.xml
> because you need to define different clusterIDs.
>
> And you have to define a cluster section in the repository.xml where the
> journal is located, which is necessary for synchronization:
>
> <Cluster id="node1" syncDelay="5000">
>   <Journal class="org.apache.jackrabbit.core.journal.OracleDatabaseJournal">
>     <param name="driver" value="javax.naming.InitialContext"/>
>     <param name="url" value="jdbc/amiDBDataSource"/>
>     ...
>   </Journal>
> </Cluster>
>
> Kindly regards, Robert
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 09:37
> To: [email protected]
> Subject: RE: Multiple instances of repository
>
> Thanks for replying back. I will need little more help to understand the
> things completely.
> I will just elaborate a bit more on my usage scenario. I am also attaching my
> repository.xml file with this mail. Please let me know if you want to know
> more about my environment.
>
> In my case, I want to keep all the data in one database and I want to use
> jackrabbit as JCR over this database.
> I have the jackrabbit embedded in my application so the repository gets-up as
> part of the application.
> Now this application reads some files from repository and also inserts some
> data in repository.
> There could be two instances of the application app1 running on machine1 and
> app2 running on machine2.
> So my application instances are different and I can create multiple
> repository homes to avoid the locking problem but I still wants to insert the
> data from these applications in same database tables.
> So if all the application instances use the same repository configuration
> file and specify their own repository home.
> Will that work in my case? Will there be any consistency issues?
>
> When you say separate data store and separate persistence managers, you mean
> separate repository configuration file or separate database tables for data
> stores and persistence managers.
>
> My instances and the repositories operate separately from each other but they
> still want to share the data. The data inserted by one application instance
> should be visible to other instance. So they all should be inserting the data
> in same tables, that's what my understanding is.
>
> Thanks,
> Nikhil
>
> -----Original Message-----
> From: Seidel. Robert [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 1:22 PM
> To: [email protected]
> Subject: AW: Multiple instances of repository
>
> Hi Nikhil,
>
> if you want to use clustering, you have to define a repository home for each
> cluster.
>
> Clustering is necessary, if you want to have the same data/indexes at all
> cluster nodes - the key word is synchronization.
>
> If your instances and the repositories operate separately from each other,
> you don't need clustering. Separate repository homes, data stores and
> persistence managers will do the job.
>
> Kindly regards, Robert
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Tuesday, November 16, 2010 08:33
> To: [email protected]
> Subject: Multiple instances of repository
>
> Hi,
>
> I am using jackrabbit as JCR implementation in my project. I am running
> jackrabbit with in my application in the same jvm.
> The application read the content from repository and also writes some content
> in repository.
> There could be multiple concurrent instances of my application running on the
> same or different machines.
> I have a configuration file for jackrabbit and I have a single repository
> home for jackrabbit.
> Now as soon as one instance of the application is up and running, I can't run
> the other instance as the first instance creates a lock file in repository
> home.
> After doing some search I came to know about running the jackrabbit in
> clustered mode.
> Now my question is even in this case I will have to specify a different
> repository home for every run, right?
> That means I should form the repository home path at the run time because at
> compile time I am not sure how many instance will be run.
> This is a standalone java application and theoretically n number of instance
> can be run.
> My question is when I have to specify a different repository path for every
> run, then the jackrabbit will work even with out clustering?
> Because .lock file will be different for different runs as the repository
> home is different.
> I know I am missing something here, please help me.
> I am attaching my conf file with this mail.
>
> Thanks,
> Nikhil
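
For the run-time setup asked about in the thread (a distinct cluster node id and a distinct repository home per run), a minimal sketch could look like the following. The class `PerRunRepositorySetup` and its method names are hypothetical. Only the system property name comes from the thread. It swaps the `System.nanoTime()` suffix for a random UUID, on the assumption that instances on different machines could otherwise collide.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical helper: gives each run its own cluster node id and repository home. */
final class PerRunRepositorySetup {
    // System property name quoted in the thread.
    static final String NODE_ID_PROPERTY = "org.apache.jackrabbit.core.cluster.node_id";

    /** Builds a node id from a random UUID instead of a bare System.nanoTime(). */
    static String newNodeId() {
        return "node_" + UUID.randomUUID().toString().replace("-", "");
    }

    /** Creates a fresh, empty directory under baseDir to serve as this run's repository home. */
    static Path newRepositoryHome(Path baseDir) throws IOException {
        Files.createDirectories(baseDir);
        return Files.createTempDirectory(baseDir, "rep-home-");
    }

    /** Sets the node id property and returns the home to pass to the repository start-up code. */
    static Path prepare(Path baseDir) throws IOException {
        System.setProperty(NODE_ID_PROPERTY, newNodeId());
        return newRepositoryHome(baseDir);
    }
}
```

Call `prepare(...)` before the repository is created. Each fresh home still starts with an empty search index, so the index rebuild cost Robert describes applies to every run, and the old home directories need to be cleaned up separately.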
