Noam, Please see repo.xml below. Oracle is 10.2.0.4. Size of repo is just under 1TB. The total nodes indexed where right under 8 million.
j <?xml version="1.0" encoding="UTF-8"?> <!-- ~ Artifactory is a binaries repository manager. ~ Copyright (C) 2010 JFrog Ltd. ~ ~ Artifactory is free software: you can redistribute it and/or modify ~ it under the terms of the GNU Lesser General Public License as published by ~ the Free Software Foundation, either version 3 of the License, or ~ (at your option) any later version. ~ ~ Artifactory is distributed in the hope that it will be useful, ~ but WITHOUT ANY WARRANTY; without even the implied warranty of ~ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ~ GNU Lesser General Public License for more details. ~ ~ You should have received a copy of the GNU Lesser General Public License ~ along with Artifactory. If not, see <http://www.gnu.org/licenses/>. --><!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN" "http://jackrabbit.apache.org/dtd/repository-2.0.dtd"> <Repository> <!-- Oracle Datasource --> <DataSources> <DataSource name="ds"> <!-- Leave this on "oracle" --> <param name="databaseType" value="oracle" /> <param name="driver" value="oracle.jdbc.driver.OracleDriver" /> <param name="url" value="XXXX" /> <param name="user" value="XXXX" /> <param name="password" value="XXXX" /> <!--<param name="validationQuery" value=""/>--> <!-- Unlimited when not specified --> <!--<param name="maxPoolSize" value="80"/>--> </DataSource> </DataSources> <!-- virtual file system where the repository stores global state (e.g. registered namespaces, custom node types, etc.) --> <!-- Oracle File System --> <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem"> <param name="dataSourceName" value="ds" /> <param name="schemaObjectPrefix" value="rep_" /> <!--<param name="schema" value=""/>--> </FileSystem> <!-- http://wiki.apache.org/jackrabbit/DataStore --> <!-- Oracle Datastore --> <DataStore class="org.artifactory.jcr.jackrabbit.ArtifactoryDbDataStoreImpl"> <param name="dataSourceName" value="ds" /> <param name="schemaObjectPrefix" value="ds_" /> <!--<param name="tablePrefix" value=""/>--> <param name="minRecordLength" value="512" /> <!-- Whether to use a cache blobs temporarily on the file system for faster reads --> <param name="cacheBlobs" value="true" /> <!-- The maximum size of the blobs cache in gigabytes (g), megabytes (m) or kilobytes (k) --> <param name="blobsCacheMaxSize" value="1g" /> <!-- The blobs cache directory location --> <param name="blobsCacheDir" value="${rep.home}/cache" /> </DataStore> <!-- security configuration --> <Security appName="Jackrabbit"> <SecurityManager class="org.artifactory.jcr.NullJackrabbitSecurityManager" /> </Security> <!-- location of workspaces root directory and name of default workspace --> <Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" /> <!-- workspace configuration template: used to create the initial workspace if there's no workspace yet --> <Workspace name="${wsp.name}"> <!-- virtual file system of the workspace: class: FQN of class implementing the FileSystem interface --> <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem"> <param name="path" value="${wsp.home}" /> </FileSystem> <!-- persistence manager of the workspace: class: FQN of class implementing the PersistenceManager interface --> <!-- Oracle Persistence Manager --> <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.OraclePersistenceManager"> <param name="dataSourceName" value="ds" /> <param name="schemaObjectPrefix" value="${wsp.name}_" /> <param name="bundleCacheSize" value="16" /> <param name="errorHandling" value="IGN_MISSING_BLOBS" /> </PersistenceManager> <!-- Search index and the file system it uses. class: FQN of class implementing the QueryHandler interface If required by the QueryHandler implementation, one may configure a FileSystem that the handler may use. Supported parameters for lucene search index: - path: location of the index. This parameter is mandatory! - useCompoundFile: advises lucene to use compound files for the index files - minMergeDocs: minimum number of nodes in an index until segments are merged - volatileIdleTime: idle time in seconds until the volatile index is moved to persistent index even though minMergeDocs is not reached. - maxMergeDocs: maximum number of nodes in segments that will be merged - mergeFactor: determines how often segment indices are merged - maxFieldLength: the number of words that are fulltext indexed at most per property. - bufferSize: maximum number of documents that are held in a pending queue until added to the index - cacheSize: size of the document number cache. This cache maps uuids to lucene document numbers - forceConsistencyCheck: runs a consistency check on every startup. If false, a consistency check is only performed when the search index detects a prior forced shutdown. This parameter only has an effect if 'enableConsistencyCheck' is set to 'true'. - enableConsistencyCheck: if set to 'true' a consistency check is performed depending on the parameter 'forceConsistencyCheck'. If set to 'false' no consistency check is performed on startup, even if a redo log had been applied. - autoRepair: errors detected by a consistency check are automatically repaired. If false, errors are only written to the log. - analyzer: class name of a lucene analyzer to use for fulltext indexing of text. - queryClass: class name that implements the javax.jcr.query.Query interface. this class must extend the class: org.apache.jackrabbit.core.query.AbstractQueryImpl - respectDocumentOrder: If true and the query does not contain an 'order by' clause, result nodes will be in document order. For better performance when queries return a lot of nodes set to 'false'. - resultFetchSize: The number of results the query handler should initially fetch when a query is executed. Default value: Integer.MAX_VALUE (-> all) - extractorPoolSize: defines the maximum number of background threads that are used to extract text from binary properties. If set to zero (default) no background threads are allocated and text extractors run in the current thread. - extractorTimeout: a text extractor is executed using a background thread if it doesn't finish within this timeout defined in milliseconds. This parameter has no effect if extractorPoolSize is zero. - extractorBackLogSize: the size of the extractor pool back log. If all threads in the pool are busy, incoming work is put into a wait queue. If the wait queue reaches the back log size incoming extractor work will not be queued anymore but will be executed with the current thread. - synonymProviderClass: the name of a class that implements org.apache.jackrabbit.core.query.lucene.SynonymProvider. The default value is null (-> not set). Note: all parameters (except path) in this SearchIndex config are default values and can be omitted. --> <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex"> <param name="path" value="${rep.home}/index" /> <param name="useCompoundFile" value="true" /> <!-- Default is 100 --> <param name="minMergeDocs" value="500" /> <param name="maxMergeDocs" value="10000" /> <param name="volatileIdleTime" value="3" /> <!-- Default is 10: more segments quicker the indexing but slower the searching --> <param name="mergeFactor" value="10" /> <param name="maxFieldLength" value="10000" /> <!-- Default is 10 --> <param name="bufferSize" value="100" /> <param name="cacheSize" value="1000" /> <param name="forceConsistencyCheck" value="false" /> <param name="enableConsistencyCheck" value="true" /> <param name="autoRepair" value="true" /> <param name="analyzer" value="org.artifactory.search.lucene.ArtifactoryAnalyzer" /> <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl" /> <param name="respectDocumentOrder" value="false" /> <param name="resultFetchSize" value="700" /> <param name="supportHighlighting" value="true" /> <param name="excerptProviderClass" value="org.artifactory.search.ArchiveEntriesXmlExcerpt" /> <!-- Use 5 background threads for text extraction that takes more than 100 milliseconds --> <param name="extractorPoolSize" value="5" /> <param name="extractorTimeout" value="100" /> <!-- Default is 100 --> <param name="extractorBackLogSize" value="500" /> <!-- Indexing configuration --> <param name="indexingConfiguration" value="${rep.home}/index/index_config.xml" /> <!-- Workspace inconsistency handler --> <param name="onWorkspaceInconsistency" value="lenient" /> </SearchIndex> <!-- http://issues.apache.org/jira/browse/JCR-314 --> <ISMLocking class="org.apache.jackrabbit.core.state.FineGrainedISMLocking" /> </Workspace> <!-- Configures the versioning --> <Versioning rootPath="${rep.home}/version"> <!-- Configures the filesystem to use for versioning for the respective persistence manager --> <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem"> <param name="path" value="${rep.home}/version" /> </FileSystem> <!-- Configures the persistence manager to be used for persisting version state. Please note that the current versioning implementation is based on a 'normal' persistence manager, but this could change in future implementations. --> <!--We do not use versioning--> <PersistenceManager class="org.apache.jackrabbit.core.persistence.mem.InMemPersistenceManager"> <param name="persistent" value="false" /> </PersistenceManager> </Versioning> <!-- Clustering configuration --> <!-- <Cluster id="node1"> <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal"> <param name="revision" value="${rep.home}/revision.log"/> <param name="driver" value="com.mysql.jdbc.Driver"/> <param name="url" value="jdbc:mysql://localhost:3306/artifactory?useUnicode=true&characterEncoding=UTF-8"/> <param name="user" value="artifactory_user"/> <param name="password" value="password"/> </Journal> </Cluster> --> </Repository> On Aug 1, 2010, at 10:36 AM, Noam Y. Tenne wrote: > OK, we will look into this. > Could you please send us the JCR repo.xml file which is being used? (you > may send it to [email protected], if you wish to send it privately). > Also, which version of Oracle are you using, and what is the approximate > size of the repository that was converted? > > Thanks for reporting this > > On 08/01/2010 05:22 PM, Jay Colson wrote: >> it MOST DEFINITELY grows _way_ larger than 1GB (180 GB +) --- which is a >> real issue during an upgrade, when you can't just stop the server and clear >> the cache, because it just rebuilds it all over again. >> >> j >> >> On Aug 1, 2010, at 10:21 AM, Noam Y. Tenne wrote: >> >> >>> If by "cache" folder you mean the one within $ARTIFACTORY_HOME/data/, >>> then yes; you may remove it (but make sure to stop Artifactory before >>> doing so). >>> According to the default Oracle JCR configuration xml that's bundled >>> with Artifactory (can be found within >>> $ARTIFACTORY_HOME/etc/repo/oracle10/), the cache folder size limit >>> should be 1GB (see Repository->Datastore->blobsCacheMaxSize). >>> If the cache folder grows uncontrollably beyond that size, this may be a >>> bug. >>> >>> On 08/01/2010 04:15 PM, Jay Colson wrote: >>> >>>> This ran for 5-6 hours. The cache was actually what wad growing out of >>>> control. I had to stop the process once to add another 200gb disk >>>> space. Which >>>> Made the upgrade process take about 13 hours all together. Is it safe >>>> to delink cache directories from the filesystem and let artifactory >>>> recreate them? How can the cache be better managed? >>>> >>>> On Aug 1, 2010, at 8:49 AM, "Noam Y. Tenne"<[email protected]> wrote: >>>> >>>> >>>> >>>>> Hi Jay, >>>>> >>>>> Version 2.2 includes changes and improvements to the search capabilities >>>>> of Artifactory, hence the need for re-indexing and the growth of index >>>>> size. >>>>> The times it takes to re-index can vary from minutes and up to an hour, >>>>> depending on repository size, and the strength of host machine. >>>>> As of now, is it still indexing? If so, how large is your repository and >>>>> what are the specs of the server it runs on? >>>>> >>>>> Noam >>>>> >>>>> On 07/31/2010 08:21 AM, Jay Colson wrote: >>>>> >>>>> >>>>>> Just upgraded to 2.5.5 from 2.0.8 and this thing has been indexing for >>>>>> hours. It's eating all my disk space (when my db is in oracle) -- Is >>>>>> there a way to just disable indexing for now? >>>>>> >>>>>> Thanks, >>>>>> j >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> The Palm PDK Hot Apps Program offers developers who use the >>>>>> Plug-In Development Kit to bring their C/C++ apps to Palm for a share >>>>>> of $1 Million in cash or HP Products. Visit us here for more details: >>>>>> http://p.sf.net/sfu/dev2dev-palm >>>>>> _______________________________________________ >>>>>> Artifactory-users mailing list >>>>>> [email protected] >>>>>> https://lists.sourceforge.net/lists/listinfo/artifactory-users >>>>>> >>>>>> >>>>>> >>>>> ------------------------------------------------------------------------------ >>>>> The Palm PDK Hot Apps Program offers developers who use the >>>>> Plug-In Development Kit to bring their C/C++ apps to Palm for a share >>>>> of $1 Million in cash or HP Products. Visit us here for more details: >>>>> http://p.sf.net/sfu/dev2dev-palm >>>>> _______________________________________________ >>>>> Artifactory-users mailing list >>>>> [email protected] >>>>> https://lists.sourceforge.net/lists/listinfo/artifactory-users >>>>> >>>>> >>>> ------------------------------------------------------------------------------ >>>> The Palm PDK Hot Apps Program offers developers who use the >>>> Plug-In Development Kit to bring their C/C++ apps to Palm for a share >>>> of $1 Million in cash or HP Products. Visit us here for more details: >>>> http://p.sf.net/sfu/dev2dev-palm >>>> _______________________________________________ >>>> Artifactory-users mailing list >>>> [email protected] >>>> https://lists.sourceforge.net/lists/listinfo/artifactory-users >>>> >>>> >>> >>> ------------------------------------------------------------------------------ >>> The Palm PDK Hot Apps Program offers developers who use the >>> Plug-In Development Kit to bring their C/C++ apps to Palm for a share >>> of $1 Million in cash or HP Products. Visit us here for more details: >>> http://p.sf.net/sfu/dev2dev-palm >>> _______________________________________________ >>> Artifactory-users mailing list >>> [email protected] >>> https://lists.sourceforge.net/lists/listinfo/artifactory-users >>> >> >> ------------------------------------------------------------------------------ >> The Palm PDK Hot Apps Program offers developers who use the >> Plug-In Development Kit to bring their C/C++ apps to Palm for a share >> of $1 Million in cash or HP Products. Visit us here for more details: >> http://p.sf.net/sfu/dev2dev-palm >> _______________________________________________ >> Artifactory-users mailing list >> [email protected] >> https://lists.sourceforge.net/lists/listinfo/artifactory-users >> > > > ------------------------------------------------------------------------------ > The Palm PDK Hot Apps Program offers developers who use the > Plug-In Development Kit to bring their C/C++ apps to Palm for a share > of $1 Million in cash or HP Products. Visit us here for more details: > http://p.sf.net/sfu/dev2dev-palm > _______________________________________________ > Artifactory-users mailing list > [email protected] > https://lists.sourceforge.net/lists/listinfo/artifactory-users ------------------------------------------------------------------------------ The Palm PDK Hot Apps Program offers developers who use the Plug-In Development Kit to bring their C/C++ apps to Palm for a share of $1 Million in cash or HP Products. Visit us here for more details: http://p.sf.net/sfu/dev2dev-palm _______________________________________________ Artifactory-users mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/artifactory-users
