We've opened up http://issues.jfrog.org/jira/browse/RTFACT-3395 for this
issue.
On 08/01/2010 05:49 PM, Jay Colson wrote:
Noam,
Please see repo.xml below. Oracle is 10.2.0.4. The size of the repo is just
under 1TB. The total number of nodes indexed was just under 8 million.
j
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Artifactory is a binaries repository manager.
~ Copyright (C) 2010 JFrog Ltd.
~
~ Artifactory is free software: you can redistribute it and/or modify
~ it under the terms of the GNU Lesser General Public License as published by
~ the Free Software Foundation, either version 3 of the License, or
~ (at your option) any later version.
~
~ Artifactory is distributed in the hope that it will be useful,
~ but WITHOUT ANY WARRANTY; without even the implied warranty of
~ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
~ GNU Lesser General Public License for more details.
~
~ You should have received a copy of the GNU Lesser General Public License
~ along with Artifactory. If not, see <http://www.gnu.org/licenses/>.
--><!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 2.0//EN"
"http://jackrabbit.apache.org/dtd/repository-2.0.dtd">
<Repository>
<!-- Oracle Datasource -->
<DataSources>
<DataSource name="ds">
<!-- Leave this on "oracle" -->
<param name="databaseType" value="oracle" />
<param name="driver" value="oracle.jdbc.driver.OracleDriver" />
<param name="url" value="XXXX" />
<param name="user" value="XXXX" />
<param name="password" value="XXXX" />
<!--<param name="validationQuery" value=""/>-->
<!-- Unlimited when not specified -->
<!--<param name="maxPoolSize" value="80"/>-->
</DataSource>
</DataSources>
<!--
virtual file system where the repository stores global state
(e.g. registered namespaces, custom node types, etc.)
-->
<!-- Oracle File System -->
<FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
<param name="dataSourceName" value="ds" />
<param name="schemaObjectPrefix" value="rep_" />
<!--<param name="schema" value=""/>-->
</FileSystem>
<!-- http://wiki.apache.org/jackrabbit/DataStore -->
<!-- Oracle Datastore -->
<DataStore
class="org.artifactory.jcr.jackrabbit.ArtifactoryDbDataStoreImpl">
<param name="dataSourceName" value="ds" />
<param name="schemaObjectPrefix" value="ds_" />
<!--<param name="tablePrefix" value=""/>-->
<param name="minRecordLength" value="512" />
<!-- Whether to cache blobs temporarily on the file system for
faster reads -->
<param name="cacheBlobs" value="true" />
<!-- The maximum size of the blobs cache in gigabytes (g), megabytes (m)
or kilobytes (k) -->
<param name="blobsCacheMaxSize" value="1g" />
<!-- The blobs cache directory location -->
<param name="blobsCacheDir" value="${rep.home}/cache" />
</DataStore>
<!--
security configuration
-->
<Security appName="Jackrabbit">
<SecurityManager class="org.artifactory.jcr.NullJackrabbitSecurityManager"
/>
</Security>
<!--
location of workspaces root directory and name of default workspace
-->
<Workspaces rootPath="${rep.home}/workspaces" defaultWorkspace="default" />
<!--
workspace configuration template:
used to create the initial workspace if there's no workspace yet
-->
<Workspace name="${wsp.name}">
<!--
virtual file system of the workspace:
class: FQN of class implementing the FileSystem interface
-->
<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
<param name="path" value="${wsp.home}" />
</FileSystem>
<!--
persistence manager of the workspace:
class: FQN of class implementing the PersistenceManager interface
-->
<!-- Oracle Persistence Manager -->
<PersistenceManager
class="org.apache.jackrabbit.core.persistence.pool.OraclePersistenceManager">
<param name="dataSourceName" value="ds" />
<param name="schemaObjectPrefix" value="${wsp.name}_" />
<param name="bundleCacheSize" value="16" />
<param name="errorHandling" value="IGN_MISSING_BLOBS" />
</PersistenceManager>
<!--
Search index and the file system it uses.
class: FQN of class implementing the QueryHandler interface
If required by the QueryHandler implementation, one may configure
a FileSystem that the handler may use.
Supported parameters for lucene search index:
- path: location of the index. This parameter is mandatory!
- useCompoundFile: advises lucene to use compound files for the
index files
- minMergeDocs: minimum number of nodes in an index until segments
are merged
- volatileIdleTime: idle time in seconds until the volatile index is
moved to persistent index even though minMergeDocs is not reached.
- maxMergeDocs: maximum number of nodes in segments that will be
merged
- mergeFactor: determines how often segment indices are merged
- maxFieldLength: the number of words that are fulltext indexed at
most per property.
- bufferSize: maximum number of documents that are held in a pending
queue until added to the index
- cacheSize: size of the document number cache. This cache maps
uuids to lucene document numbers
- forceConsistencyCheck: runs a consistency check on every startup. If
false, a consistency check is only performed when the search index
detects a prior forced shutdown. This parameter only has an effect
if 'enableConsistencyCheck' is set to 'true'.
- enableConsistencyCheck: if set to 'true' a consistency check is
performed depending on the parameter 'forceConsistencyCheck'. If
set to 'false' no consistency check is performed on startup, even
if a redo log had been applied.
- autoRepair: errors detected by a consistency check are automatically
repaired. If false, errors are only written to the log.
- analyzer: class name of a lucene analyzer to use for fulltext
indexing of text.
- queryClass: class name that implements the javax.jcr.query.Query
interface. This class must extend the class:
org.apache.jackrabbit.core.query.AbstractQueryImpl
- respectDocumentOrder: If true and the query does not contain an
'order by' clause, result nodes will be in document order. For better
performance when queries return a lot of nodes set to 'false'.
- resultFetchSize: The number of results the query handler should
initially fetch when a query is executed.
Default value: Integer.MAX_VALUE (-> all)
- extractorPoolSize: defines the maximum number of background threads
that are used to extract text from binary properties. If set to zero
(default) no background threads are allocated and text extractors run
in the current thread.
- extractorTimeout: a text extractor is executed using a background
thread if it doesn't finish within this timeout defined in
milliseconds. This parameter has no effect if extractorPoolSize is zero.
- extractorBackLogSize: the size of the extractor pool back log. If all
threads in the pool are busy, incoming work is put into a wait queue.
If the wait queue reaches the back log size incoming extractor work
will not be queued anymore but will be executed with the current thread.
- synonymProviderClass: the name of a class that implements
org.apache.jackrabbit.core.query.lucene.SynonymProvider. The
default value is null (-> not set).
Note: all parameters (except path) in this SearchIndex config are
default values and can be omitted.
-->
<SearchIndex
class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
<param name="path" value="${rep.home}/index" />
<param name="useCompoundFile" value="true" />
<!-- Default is 100 -->
<param name="minMergeDocs" value="500" />
<param name="maxMergeDocs" value="10000" />
<param name="volatileIdleTime" value="3" />
<!-- Default is 10: more segments mean quicker indexing but slower
searching -->
<param name="mergeFactor" value="10" />
<param name="maxFieldLength" value="10000" />
<!-- Default is 10 -->
<param name="bufferSize" value="100" />
<param name="cacheSize" value="1000" />
<param name="forceConsistencyCheck" value="false" />
<param name="enableConsistencyCheck" value="true" />
<param name="autoRepair" value="true" />
<param name="analyzer"
value="org.artifactory.search.lucene.ArtifactoryAnalyzer" />
<param name="queryClass"
value="org.apache.jackrabbit.core.query.QueryImpl" />
<param name="respectDocumentOrder" value="false" />
<param name="resultFetchSize" value="700" />
<param name="supportHighlighting" value="true" />
<param name="excerptProviderClass"
value="org.artifactory.search.ArchiveEntriesXmlExcerpt" />
<!--
Use 5 background threads for text extraction that takes more than
100 milliseconds
-->
<param name="extractorPoolSize" value="5" />
<param name="extractorTimeout" value="100" />
<!-- Default is 100 -->
<param name="extractorBackLogSize" value="500" />
<!-- Indexing configuration -->
<param name="indexingConfiguration"
value="${rep.home}/index/index_config.xml" />
<!-- Workspace inconsistency handler -->
<param name="onWorkspaceInconsistency" value="lenient" />
</SearchIndex>
<!-- http://issues.apache.org/jira/browse/JCR-314 -->
<ISMLocking class="org.apache.jackrabbit.core.state.FineGrainedISMLocking"
/>
</Workspace>
<!--
Configures the versioning
-->
<Versioning rootPath="${rep.home}/version">
<!--
Configures the filesystem to use for versioning for the respective
persistence manager
-->
<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
<param name="path" value="${rep.home}/version" />
</FileSystem>
<!--
Configures the persistence manager to be used for persisting
version state.
Please note that the current versioning implementation is based on
a 'normal' persistence manager, but this could change in future
implementations.
-->
<!--We do not use versioning-->
<PersistenceManager
class="org.apache.jackrabbit.core.persistence.mem.InMemPersistenceManager">
<param name="persistent" value="false" />
</PersistenceManager>
</Versioning>
<!-- Clustering configuration -->
<!--
<Cluster id="node1">
<Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
<param name="revision" value="${rep.home}/revision.log"/>
<param name="driver" value="com.mysql.jdbc.Driver"/>
<param name="url"
value="jdbc:mysql://localhost:3306/artifactory?useUnicode=true&characterEncoding=UTF-8"/>
<param name="user" value="artifactory_user"/>
<param name="password" value="password"/>
</Journal>
</Cluster>
-->
</Repository>
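
For anyone checking which cache settings a given repo.xml actually carries, here is a minimal Python sketch that pulls the DataStore params out of the file and converts the size value to bytes. The helper names (parse_size, datastore_params) are made up for illustration, the sample is a trimmed copy of the DataStore section above, and the 1024-based multipliers are an assumption; the config comment only says gigabytes (g), megabytes (m) or kilobytes (k).

```python
import xml.etree.ElementTree as ET

# Assumed multipliers: the repo.xml comment names g/m/k but does not say
# whether they are binary (1024) or decimal (1000) units.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '1g', '512m' or '64k' to a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def datastore_params(repo_xml):
    """Return the <param> name/value pairs of the <DataStore> element."""
    root = ET.fromstring(repo_xml)
    store = root.find("DataStore")
    if store is None:
        return {}
    return {p.get("name"): p.get("value") for p in store.findall("param")}


# Trimmed sample based on the DataStore section of the repo.xml above.
SAMPLE = """<Repository>
  <DataStore class="org.artifactory.jcr.jackrabbit.ArtifactoryDbDataStoreImpl">
    <param name="cacheBlobs" value="true" />
    <param name="blobsCacheMaxSize" value="1g" />
    <param name="blobsCacheDir" value="${rep.home}/cache" />
  </DataStore>
</Repository>"""

params = datastore_params(SAMPLE)
limit_bytes = parse_size(params["blobsCacheMaxSize"])
print(params["blobsCacheDir"], limit_bytes)
```

This only reads the configured values; it says nothing about whether Artifactory enforces them at runtime.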
On Aug 1, 2010, at 10:36 AM, Noam Y. Tenne wrote:
OK, we will look into this.
Could you please send us the JCR repo.xml file which is being used? (you
may send it to [email protected], if you wish to send it privately).
Also, which version of Oracle are you using, and what is the approximate
size of the repository that was converted?
Thanks for reporting this
On 08/01/2010 05:22 PM, Jay Colson wrote:
It MOST DEFINITELY grows _way_ larger than 1GB (180 GB+), which is a real
issue during an upgrade, when you can't just stop the server and clear the
cache, because it just rebuilds it all over again.
j
On Aug 1, 2010, at 10:21 AM, Noam Y. Tenne wrote:
If by "cache" folder you mean the one within $ARTIFACTORY_HOME/data/,
then yes; you may remove it (but make sure to stop Artifactory before
doing so).
According to the default Oracle JCR configuration XML that's bundled
with Artifactory (found in $ARTIFACTORY_HOME/etc/repo/oracle10/), the
cache folder size limit should be 1GB (see
Repository->DataStore->blobsCacheMaxSize).
If the cache folder grows uncontrollably beyond that size, this may be a
bug.
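
To check whether the cache folder really exceeds the configured limit, a rough Python sketch along these lines could be run on the host. The data/cache location under $ARTIFACTORY_HOME and the reading of "1g" as 1024^3 bytes are both assumptions, and the function names are made up for illustration; adjust the path to wherever blobsCacheDir resolves on your install.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file vanished between listing and stat; ignore it.
                pass
    return total


def over_limit(path, limit_bytes):
    """Return (size_in_bytes, exceeded) for the directory at path."""
    size = dir_size_bytes(path)
    return size, size > limit_bytes


if __name__ == "__main__":
    # Assumed location of the blobs cache; substitute the real path.
    home = os.environ.get("ARTIFACTORY_HOME", ".")
    cache_dir = os.path.join(home, "data", "cache")
    limit = 1024 ** 3  # blobsCacheMaxSize="1g", binary units assumed
    size, exceeded = over_limit(cache_dir, limit)
    print(f"{cache_dir}: {size} bytes, over limit: {exceeded}")
```

Running it periodically during a re-index (from cron, say) would show how quickly the folder grows past the limit.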
On 08/01/2010 04:15 PM, Jay Colson wrote:
This ran for 5-6 hours. The cache was actually what was growing out of
control. I had to stop the process once to add another 200GB of disk
space, which made the upgrade process take about 13 hours altogether.
Is it safe to delink cache directories from the filesystem and let
Artifactory recreate them? How can the cache be better managed?
On Aug 1, 2010, at 8:49 AM, "Noam Y. Tenne"<[email protected]> wrote:
Hi Jay,
Version 2.2 includes changes and improvements to the search capabilities
of Artifactory, hence the need for re-indexing and the growth of index
size.
The time it takes to re-index can vary from minutes up to an hour,
depending on the repository size and the strength of the host machine.
As of now, is it still indexing? If so, how large is your repository and
what are the specs of the server it runs on?
Noam
On 07/31/2010 08:21 AM, Jay Colson wrote:
Just upgraded to 2.5.5 from 2.0.8 and this thing has been indexing for hours.
It's eating all my disk space (when my db is in oracle) -- Is there a way to
just disable indexing for now?
Thanks,
j
------------------------------------------------------------------------------
The Palm PDK Hot Apps Program offers developers who use the
Plug-In Development Kit to bring their C/C++ apps to Palm for a share
of $1 Million in cash or HP Products. Visit us here for more details:
http://p.sf.net/sfu/dev2dev-palm
_______________________________________________
Artifactory-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/artifactory-users