Thanks Dirk, I should have found that page on my own. I am going to look into using the BTreeManager. Just curious: what are the limits on document/file counts within a node? I am planning on storing a lot of data in JackRabbit (terabytes). Also, is the configuration code from my previous posts the best way to do things, or can I simplify it and just do something like this to get a repo:

ServiceLoader.load(Class.forName("org.apache.jackrabbit.jcr2dav.Jcr2davRepositoryFactory"));
return JcrUtils.getRepository(jackabbitServerUrl);

On 11/13/2015 03:47 PM, Dirk Rudolph wrote:
Did I understand you right: you have thousands of child nodes below the
root node?

You should avoid this: it is considered bad practice in terms of
write performance and, depending on your concurrent access, it might also
block read access.

http://wiki.apache.org/jackrabbit/Performance

Try to introduce a structure to your content using BTreeManager


https://jackrabbit.apache.org/api/2.10/org/apache/jackrabbit/commons/flat/BTreeManager.html
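
One hand-rolled way to picture what "introducing a structure" means is a sketch that spreads content ids over two levels of hash-derived buckets, so no single parent node collects every child. This is not how BTreeManager itself works; the class name ShardedPaths, the method forContentId, and the choice of SHA-256 with two bucket levels are assumptions made only for this illustration. It also assumes the content id is already a valid node name (no "/" in it).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Jackrabbit: maps a content id to a
// relative path with two intermediate bucket levels.
final class ShardedPaths {
    private ShardedPaths() {}

    /** Returns a relative path of the form "xx/yy/<contentId>", where xx and yy are hex bytes. */
    static String forContentId(String contentId) {
        try {
            // Hash the id so that buckets fill evenly regardless of how ids are generated.
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(contentId.getBytes(StandardCharsets.UTF_8));
            // The first two digest bytes pick the two bucket levels (256 choices each).
            return String.format("%02x/%02x/%s",
                    digest[0] & 0xff, digest[1] & 0xff, contentId);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

With this layout each bucket node has at most 256 children, and the files are spread over 65,536 leaf buckets instead of sitting directly under the root. The node for a file would then be created under the computed path rather than under the root node itself.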

Cheers, D


On Friday, 13 November 2015, David Marginian <[email protected]> wrote:

Thanks Clay.  I am not trying to load that many records at once.  The
application is crawling a directory.  It places the files from that
directory into JackRabbit one at a time, and puts a content id onto a queue
which is picked up by consumers on different servers.  Those consumers then
use the content id to retrieve the file from JackRabbit. Each piece of
content is saved in a node under the root node.  The performance slowdown
is coming from calling session.getRootNode(); from what I can gather from
the docs, I need the root node in order to add a child node.  Note that the
slowdown is pretty significant, and I don't need anywhere close to 50k nodes
to start seeing it (I start seeing it within a few minutes of running my
app).  I don't need orderable nodes; how do I disable that?
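
For readers following the thread, the store-then-queue flow described above can be sketched roughly as follows. This is only an in-memory illustration: a ConcurrentHashMap stands in for the repository, a LinkedBlockingQueue stands in for the message queue, there is a single consumer thread, and the names CrawlQueueSketch, run, and the "content-N" id scheme are all hypothetical rather than taken from the real application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory model of the crawler/consumer flow.
final class CrawlQueueSketch {
    // Sentinel id telling the consumer that the crawl is finished.
    private static final String DONE = "__done__";

    private CrawlQueueSketch() {}

    /** Stores each file, queues its id, and returns what the consumer read back. */
    static List<String> run(List<String> fileContents) {
        Map<String, String> store = new ConcurrentHashMap<>();      // stand-in for the repository
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();  // stand-in for the message queue
        List<String> retrieved = new ArrayList<>();

        // Consumer: take ids off the queue and look the content up by id.
        Thread consumer = new Thread(() -> {
            try {
                for (String id = queue.take(); !id.equals(DONE); id = queue.take()) {
                    retrieved.add(store.get(id));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer (the crawler): save the content first, then announce its id.
        int next = 0;
        for (String content : fileContents) {
            String id = "content-" + next++;
            store.put(id, content);
            queue.add(id); // the queue is unbounded, so add never blocks
        }
        queue.add(DONE);

        try {
            consumer.join(); // after join, reading 'retrieved' from this thread is safe
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return retrieved;
    }
}
```

The point of the ordering (store, then queue) is that a consumer never receives an id whose content has not been saved yet. Only the id travels through the queue, so how the content is laid out inside the repository can change without touching the consumers.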


On 11/13/2015 03:10 PM, Clay Ferguson wrote:

Please let us know more about your use case. Why are you even "trying" to
load that many records all at once, or at least scan them one by one? In
most use cases you wouldn't need to do this kind of thing, unless
it's some kind of backup or replication. I say "most" cases... I'm not
saying you don't need to, just asking for a bit more background. BTW: If
you don't need 'orderable' nodes, try to avoid them. That type of node does
not work at 'scale'... and 50K is probably pushing it.

Best regards,
Clay Ferguson
[email protected]


On Fri, Nov 13, 2015 at 3:33 PM, <[email protected]> wrote:

Hi,
I am new to JackRabbit and using version 2.11.2.  I am using JackRabbit to
store documents in a multi-threaded environment.  I noticed that the time
it takes to retrieve the root node is inconsistent and slow (several
seconds +) and degrades over time (after 50K plus child nodes retrieval is
taking ~15 seconds).

Originally, I was using code as follows to obtain a repository:

   public Repository getRepository() throws ClassNotFoundException, RepositoryException {
       ServiceLoader.load(Class.forName("org.apache.jackrabbit.jcr2dav.Jcr2davRepositoryFactory"));
       return JcrUtils.getRepository(jackabbitServerUrl);
   }

Then I came across the following thread:


http://jackrabbit.510166.n4.nabble.com/getRootNode-takes-27-seconds-td1571027.html#a1571302

This thread had some useful information (BatchReadConfig), but I am not
certain how to use the API to take advantage of it.  I have changed my code
to the following but it doesn't appear that node retrieval performance has
improved, is there something I am missing/doing wrong?

1) Repository Factory
   public Repository getRepository(@SuppressWarnings("rawtypes") Map parameters) throws RepositoryException {
       String repositoryFactoryName = parameters != null && (
               parameters.containsKey(PARAM_REPOSITORY_SERVICE_FACTORY) ||
               parameters.containsKey(PARAM_REPOSITORY_CONFIG))
               ? "org.apache.jackrabbit.jcr2spi.Jcr2spiRepositoryFactory"
               : "org.apache.jackrabbit.core.RepositoryFactoryImpl";

       Object repositoryFactory;
       try {
           Class<?> repositoryFactoryClass = Class.forName(repositoryFactoryName, true,
                   Thread.currentThread().getContextClassLoader());

           repositoryFactory = repositoryFactoryClass.newInstance();
       }
       catch (Exception e) {
           throw new RepositoryException(e);
       }

       if (repositoryFactory instanceof RepositoryFactory) {
           return ((RepositoryFactory) repositoryFactory).getRepository(parameters);
       }
       else {
           throw new RepositoryException(repositoryFactory + " is not a RepositoryFactory");
       }
   }

2) Use the factory to get a repo:
   public Repository getRepository() throws ClassNotFoundException, RepositoryException {
       Map<String, RepositoryConfig> parameters = Collections.singletonMap(
               "org.apache.jackrabbit.jcr2spi.RepositoryConfig",
               (RepositoryConfig) new RepositoryConfigImpl(jackabbitServerUrl));

       return getRepository(parameters);
   }

3) Repository Config:
   private static final class RepositoryConfigImpl implements RepositoryConfig {

       private String jackabbitServerUrl;

       private RepositoryConfigImpl(String jackabbitServerUrl) {
           super();
           this.jackabbitServerUrl = jackabbitServerUrl;
       }

       public CacheBehaviour getCacheBehaviour() {
           return CacheBehaviour.INVALIDATE;
       }

       public int getItemCacheSize() {
           return 100;
       }

       public int getPollTimeout() {
           return 5000;
       }

       public RepositoryService getRepositoryService() throws RepositoryException {
           BatchReadConfig brc = new BatchReadConfig() {
               public int getDepth(Path path, PathResolver resolver) throws NamespaceException {
                   return 1;
               }
           };
           return new RepositoryServiceImpl(jackabbitServerUrl, brc);
       }

   }

Thanks for your time.

David





