On Wed, Nov 17, 2010 at 12:05 PM, <[email protected]> wrote:
> So I will have to run a cluster configuration on this machine1, because I
> will have two independent JVMs hitting on
> the same repository?

Yes.
> I really don't want to run cluster nodes on a single machine, just so that
> different JVMs can access the repository.
> That doesn't look correct. I am sure there will be better ways to solve this
> issue as well.

Although I suspect this isn't typical, there's nothing wrong with this.
Multiple JVMs = cluster nodes; it doesn't really matter if they're on the
same physical machine or multiple physical machines.

Justin

>
> Any ideas will be of great help.
>
> -Nikhil
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of
> Justin Edelson
> Sent: Wednesday, November 17, 2010 12:12 AM
> To: [email protected]
> Subject: Re: Multiple instances of repository
>
> Nikhil-
> I think you should rethink your architecture. It really doesn't make
> sense to be bringing repository instances up only for a 2-4 minute
> job. Instead, you should think about using the Command pattern and
> package your "applications" as executable jobs which can be run inside
> a long-running VM against a local repository instance (i.e. making
> in-process calls instead of RMI or DavEx).
>
> This is where something like OSGi and Apache Sling can be *very*
> helpful, but there are obviously other ways to add/remove jobs at
> runtime. See, for example, Sling's Scheduler support:
> http://sling.apache.org/site/scheduler-service-commons-scheduler.html
>
> Justin
>
> On Tue, Nov 16, 2010 at 5:16 AM, <[email protected]> wrote:
>> Thanks for your inputs, they are really helpful.
>>
>> Well, so is my application not a good candidate to use jackrabbit?
>>
>> The other option I had was to use jackrabbit in client-server mode. In this
>> case I will be accessing the repository over RMI. But in the jackrabbit
>> documents it has been mentioned that RMI is not optimized for performance
>> and I should use an embedded repository instance in my application code for
>> better performance.
>>
>> I can remove the search functionality from these clusters, because the life
>> span of these will be very short. The application will take 2-4 minutes to
>> do its job and I don't think we really need search for these clusters.
>>
>> But my question is, should I really use the clustering feature? I mean
>> cluster nodes should normally have a longer life span. But here in this case
>> the nodes will have a very short life span of 2-4 minutes.
>> I am finding it hard to use these short-lived applications as cluster
>> nodes.
>>
>> Thanks,
>> Nikhil
>>
>> -----Original Message-----
>> From: Seidel. Robert [mailto:[email protected]]
>> Sent: Tuesday, November 16, 2010 3:33 PM
>> To: [email protected]
>> Subject: AW: Multiple instances of repository
>>
>> Hi Nikhil,
>>
>> I don't know if it will work (setProperty), but you have another problem.
>> The Lucene search index is always saved in the file system. And afaik, each
>> repository home needs its own index directories (so you have the index files
>> for each cluster node). If you add a new cluster node, you have to wait for a
>> long time till the index is built, depending on the data in your repository
>> (if you have tons of data, you have to wait a week or longer).
>>
>> The tables of the FS and PM will be shared between all cluster nodes - that
>> works.
>>
>> Kind regards, Robert
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> Sent: Tuesday, November 16, 2010 10:54
>> To: [email protected]
>> Subject: RE: Multiple instances of repository
>>
>> Since there could be any number of instances, I can't decide the cluster id
>> beforehand.
>> Hence I have the following code that creates a cluster id at run time.
>>
>> System.setProperty("org.apache.jackrabbit.core.cluster.node_id",
>> "cluster_id"+System.nanoTime());
>>
>> Similarly the repositoryHome path is generated at run time.
>>
>> But do I also need separate tables for workspace file system?
>> I have the following configuration for my workspace. Is it correct? Will the
>> tables for the workspace FS and PersistenceManager be shared between all the
>> nodes, or will these tables be different?
>>
>> <?xml version="1.0"?>
>> <!DOCTYPE Repository
>>     PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN"
>>     "http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
>>
>> <Repository>
>>
>>   <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
>>     <param name="driver" value="javax.naming.InitialContext"/>
>>     <param name="url" value="jdbc/amiDBDataSource"/>
>>     <param name="databaseType" value="oracle"/>
>>     <param name="copyWhenReading" value="true"/>
>>     <param name="tablePrefix" value=""/>
>>     <param name="schemaObjectPrefix" value="J_R_DS_"/>
>>     <param name="schemaCheckEnabled" value="false"/>
>>   </DataStore>
>>
>>   <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
>>     <param name="driver" value="javax.naming.InitialContext"/>
>>     <param name="url" value="jdbc/amiDBDataSource"/>
>>     <!-- The following value must be oracle for Oracle server; this
>>          is not the same as the database schema -->
>>     <param name="schema" value="oracle"/>
>>     <param name="schemaObjectPrefix" value="J_R_FS_"/>
>>     <param name="schemaCheckEnabled" value="false"/>
>>   </FileSystem>
>>
>>   <Security appName="Jackrabbit">
>>     <SecurityManager
>>         class="repository.jcr.jackrabbit.EipSecurityManager" />
>>     <AccessManager
>>         class="org.apache.jackrabbit.core.security.SimpleAccessManager" />
>>     <LoginModule
>>         class="org.apache.jackrabbit.core.security.SimpleLoginModule">
>>       <param name="principalProvider"
>>           value="repository.jcr.jackrabbit.EipPrincipalProvider" />
>>     </LoginModule>
>>   </Security>
>>
>>   <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="eip"
>>       />
>>
>>   <Workspace name="${wsp.name}">
>>     <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
>>       <param name="driver"
>>           value="javax.naming.InitialContext"/>
>>       <param name="url" value="jdbc/amiDBDataSource"/>
>>       <!-- The following value must be oracle for Oracle
>>            server; this is not the same as the database schema -->
>>       <param name="schema" value="oracle"/>
>>       <param name="schemaObjectPrefix"
>>           value="J_FS_${wsp.name}_"/>
>>       <param name="schemaCheckEnabled" value="false"/>
>>     </FileSystem>
>>     <PersistenceManager
>>         class="org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager">
>>       <param name="driver"
>>           value="javax.naming.InitialContext"/>
>>       <param name="url" value="jdbc/amiDBDataSource"/>
>>       <param name="tableSpace" value="" />
>>       <!-- The following value must be oracle for Oracle
>>            server; this is not the same as the database schema -->
>>       <param name="schema" value="oracle" />
>>       <param name="schemaObjectPrefix"
>>           value="J_PM_${wsp.name}_" />
>>       <param name="externalBLOBs" value="false" />
>>       <param name="schemaCheckEnabled" value="false"/>
>>     </PersistenceManager>
>>     <SearchIndex
>>         class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>>       <param name="path" value="${wsp.home}/index"/>
>>       <param name="supportHighlighting" value="true"/>
>>     </SearchIndex>
>>   </Workspace>
>>
>>   <Versioning rootPath="${rep.home}/version">
>>
>>     <FileSystem
>>         class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
>>       <param name="driver"
>>           value="javax.naming.InitialContext"/>
>>       <param name="url" value="jdbc/amiDBDataSource"/>
>>       <!-- The following value must be oracle for Oracle
>>            server; this is not the same as the database schema -->
>>       <param name="schema" value="oracle"/>
>>       <param name="schemaObjectPrefix" value="J_V_FS_"/>
>>       <param name="schemaCheckEnabled" value="false"/>
>>     </FileSystem>
>>     <!-- Change to Oracle Class <PersistenceManager
>>          class="org.apache.jackrabbit.core.state.db.SimpleDbPersistenceManager"> -->
>>     <PersistenceManager
>>         class="org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager">
>>       <param name="driver"
>>           value="javax.naming.InitialContext"/>
>>       <param name="url" value="jdbc/amiDBDataSource"/>
>>       <param name="tableSpace" value="" />
>>       <!-- The following value must be oracle for Oracle
>>            server; this is not the same as the database schema -->
>>       <param name="schema" value="oracle" />
>>       <param name="schemaObjectPrefix" value="J_V_PM_" />
>>       <param name="externalBLOBs" value="false" />
>>       <param name="schemaCheckEnabled" value="false"/>
>>     </PersistenceManager>
>>
>>   </Versioning>
>>
>>   <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>>     <param name="path" value="${rep.home}/search/index"/>
>>     <param name="supportHighlighting" value="true"/>
>>   </SearchIndex>
>>
>>   <Cluster syncDelay="2000">
>>     <Journal
>>         class="org.apache.jackrabbit.core.journal.OracleDatabaseJournal">
>>       <param name="revision" value="${rep.home}/revision.log" />
>>       <param name="driver"
>>           value="javax.naming.InitialContext"/>
>>       <param name="url" value="jdbc/amiDBDataSource"/>
>>       <param name="schemaObjectPrefix" value="J_R_" />
>>       <param name="databaseType" value="oracle"/>
>>     </Journal>
>>   </Cluster>
>>
>> </Repository>
>>
>> Thanks,
>> Nikhil
>>
>> -----Original Message-----
>> From: Seidel. Robert [mailto:[email protected]]
>> Sent: Tuesday, November 16, 2010 2:42 PM
>> To: [email protected]
>> Subject: AW: Multiple instances of repository
>>
>> Hi Nikhil,
>>
>> you need clustering, because all of your instances should access the same
>> repository.
>>
>> What you need is separate repository homes for each instance. In my use case
>> I have an installation directory for each instance, so the repository home
>> is located below this directory.
>>
>> You have to make sure that each instance also has its own repository.xml
>> because you need to define different clusterIDs.
>>
>> And you have to define a cluster section in the repository.xml where the
>> journal is located, which is necessary for synchronization:
>>
>> <Cluster id="node1" syncDelay="5000">
>>   <Journal
>>       class="org.apache.jackrabbit.core.journal.OracleDatabaseJournal">
>>     <param name="driver" value="javax.naming.InitialContext"/>
>>     <param name="url" value="jdbc/amiDBDataSource"/>
>>     ...
>>   </Journal>
>> </Cluster>
>>
>> Kind regards, Robert
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> Sent: Tuesday, November 16, 2010 09:37
>> To: [email protected]
>> Subject: RE: Multiple instances of repository
>>
>> Thanks for replying. I will need a little more help to understand
>> things completely.
>> I will just elaborate a bit more on my usage scenario. I am also attaching
>> my repository.xml file with this mail. Please let me know if you want to
>> know more about my environment.
>>
>> In my case, I want to keep all the data in one database and I want to use
>> jackrabbit as JCR over this database.
>> I have jackrabbit embedded in my application, so the repository comes up
>> as part of the application.
>> Now this application reads some files from the repository and also inserts
>> some data into the repository.
>> There could be two instances of the application: app1 running on machine1 and
>> app2 running on machine2.
>> So my application instances are different and I can create multiple
>> repository homes to avoid the locking problem, but I still want to insert
>> the data from these applications into the same database tables.
>> So if all the application instances use the same repository configuration
>> file and specify their own repository home,
>> will that work in my case? Will there be any consistency issues?
>>
>> When you say separate data stores and separate persistence managers, do you
>> mean separate repository configuration files or separate database tables for
>> data stores and persistence managers?
>>
>> My instances and the repositories operate separately from each other but
>> they still want to share the data. The data inserted by one application
>> instance should be visible to the other instances. So they all should be
>> inserting the data into the same tables, that's my understanding.
>>
>> Thanks,
>> Nikhil
>>
>> -----Original Message-----
>> From: Seidel. Robert [mailto:[email protected]]
>> Sent: Tuesday, November 16, 2010 1:22 PM
>> To: [email protected]
>> Subject: AW: Multiple instances of repository
>>
>> Hi Nikhil,
>>
>> if you want to use clustering, you have to define a repository home for each
>> cluster node.
>>
>> Clustering is necessary if you want to have the same data/indexes at all
>> cluster nodes - the key word is synchronization.
>>
>> If your instances and the repositories operate separately from each other,
>> you don't need clustering. Separate repository homes, data stores and
>> persistence managers will do the job.
>>
>> Kind regards, Robert
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> Sent: Tuesday, November 16, 2010 08:33
>> To: [email protected]
>> Subject: Multiple instances of repository
>>
>> Hi,
>>
>> I am using jackrabbit as the JCR implementation in my project. I am running
>> jackrabbit within my application in the same JVM.
>> The application reads content from the repository and also writes some
>> content to the repository.
>> There could be multiple concurrent instances of my application running on
>> the same or different machines.
>> I have a configuration file for jackrabbit and I have a single repository
>> home for jackrabbit.
>> Now as soon as one instance of the application is up and running, I can't
>> run another instance, as the first instance creates a lock file in the
>> repository home.
>> After doing some searching I came to know about running jackrabbit in
>> clustered mode.
>> Now my question is: even in this case I will have to specify a different
>> repository home for every run, right?
>> That means I should form the repository home path at run time, because at
>> compile time I am not sure how many instances will be run.
>> This is a standalone java application and theoretically any number of
>> instances can be run.
>> My question is: if I have to specify a different repository path for every
>> run, will jackrabbit work even without clustering?
>> Because the .lock file will be different for different runs, as the
>> repository home is different.
>> I know I am missing something here, please help me.
>> I am attaching my conf file with this mail.
>>
>> Thanks,
>> Nikhil
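A minimal sketch of the run-time setup discussed in this thread: each JVM picks its own cluster node id and its own repository home before the embedded repository is started. The system property name is the one quoted in the thread. The class name ClusterNodeBootstrap, the helper method names, and the temp-directory base are hypothetical and only for illustration. The sketch swaps System.nanoTime() for a random UUID, because nanoTime values are only meaningful within a single JVM and two JVMs could produce the same value. Starting the repository itself is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper, not part of Jackrabbit.
final class ClusterNodeBootstrap {

    // Property name as quoted in the thread.
    static final String NODE_ID_PROPERTY =
            "org.apache.jackrabbit.core.cluster.node_id";

    // A random UUID avoids the collisions that System.nanoTime() allows
    // when several JVMs start independently.
    static String newNodeId() {
        return "cluster_id_" + UUID.randomUUID();
    }

    // Creates (if needed) a repository home private to this node, so that
    // each JVM gets its own .lock file and its own search index directory.
    static Path newRepositoryHome(Path baseDir, String nodeId) throws IOException {
        return Files.createDirectories(baseDir.resolve(nodeId));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temp directory stands in for the real base directory.
        Path baseDir = Files.createTempDirectory("jackrabbit-homes");

        String nodeId = newNodeId();
        // Must be set before the repository is created in this JVM.
        System.setProperty(NODE_ID_PROPERTY, nodeId);

        Path home = newRepositoryHome(baseDir, nodeId);
        System.out.println("node id:         " + nodeId);
        System.out.println("repository home: " + home);
        // The application would now start the embedded repository with the
        // shared repository.xml and this node-specific home.
    }
}
```

All nodes still point at the same database tables through the shared repository.xml; only the node id and the local home directory differ per JVM. As Robert notes above, a fresh home also means a fresh Lucene index, which is the main cost of short-lived nodes.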
