Jacek,

I think I have found out why you are seeing this. (Yes, I managed to
replicate your setup, and I will commit the perf test to
qi4j-tests/performance once I have added some well-formatted output.)

The underlying OpenRDF/Sesame store requires you to specify which
indices you want; if none are given, it does not index the content,
so every lookup becomes a linear search. The RDF indexing configuration
(for the Native store) looks like

    @Optional @Matches( "([spoc][spoc][spoc][spoc],?)*" )
    Property<String> tripleIndexes();

and the code does:

    String tripleIndexes = configuration.configuration().tripleIndexes().get();
    if( tripleIndexes == null )
    {
        tripleIndexes = "";
        configuration.configuration().tripleIndexes().set( tripleIndexes );
    }
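
As a side note, the @Matches expression above also accepts the empty
string, so the "" that the code falls back to passes validation. A
minimal sketch with plain java.util.regex shows this. The class name
TripleIndexSpec, the helper effective() and the fallback value
"spoc,posc" are assumptions made purely for illustration; they are not
part of Qi4j or Sesame, and the right default is exactly the open
question here.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Qi4j or Sesame.
final class TripleIndexSpec {
    // Same expression as the @Matches constraint on tripleIndexes().
    private static final Pattern SPEC =
            Pattern.compile("([spoc][spoc][spoc][spoc],?)*");

    // Assumed fallback, chosen only for illustration.
    static final String ASSUMED_DEFAULT = "spoc,posc";

    // True if the whole string satisfies the constraint expression.
    static boolean matchesConstraint(String spec) {
        return SPEC.matcher(spec).matches();
    }

    // Substitute the assumed default when nothing usable is configured.
    static String effective(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return ASSUMED_DEFAULT;
        }
        return configured;
    }

    public static void main(String[] args) {
        System.out.println(matchesConstraint(""));          // true
        System.out.println(matchesConstraint("spoc,posc")); // true
        System.out.println(matchesConstraint("spo"));       // false
        System.out.println(effective(null));                // spoc,posc
    }
}
```

Note that the expression only checks the shape of each entry: "ssss"
would match as well, so it is looser than a meaningful index spec.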


The default behavior is perhaps not very desirable, but I am not
familiar enough with RDF to give any clever suggestion on what it
should be. I am trying to figure this out...
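
To illustrate in principle what an index ordering buys, here is a toy
model. It is not how Sesame is implemented; the class ToyTripleIndex and
all of its methods are hypothetical. It keeps statements in a plain list
(the "no index" case) and also in a sorted set keyed by predicate,
object, subject, so that a query with a known predicate and object
becomes a single seek instead of a scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model only; names and structure are assumptions for illustration.
final class ToyTripleIndex {
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    // Separator that does not occur in the sample data.
    private static final char SEP = '\u0000';

    private final List<Triple> all = new ArrayList<>();
    // Keys ordered by predicate, then object, then subject.
    private final NavigableSet<String> pos = new TreeSet<>();

    void add(String s, String p, String o) {
        all.add(new Triple(s, p, o));
        pos.add(p + SEP + o + SEP + s);
    }

    // Without an index: visit every statement until one matches.
    String findSubjectByScan(String p, String o) {
        for (Triple t : all) {
            if (t.p.equals(p) && t.o.equals(o)) {
                return t.s;
            }
        }
        return null;
    }

    // With the sorted keys: seek to the first key carrying the prefix.
    String findSubjectByIndex(String p, String o) {
        String prefix = p + SEP + o + SEP;
        String key = pos.ceiling(prefix);
        if (key != null && key.startsWith(prefix)) {
            return key.substring(prefix.length());
        }
        return null;
    }

    public static void main(String[] args) {
        ToyTripleIndex index = new ToyTripleIndex();
        for (int i = 0; i < 100000; i++) {
            index.add("id-" + i, "name", "Lead" + i);
        }
        System.out.println(index.findSubjectByScan("name", "Lead38467"));  // id-38467
        System.out.println(index.findSubjectByIndex("name", "Lead38467")); // id-38467
    }
}
```

Both lookups return the same subject; the difference is that the scan
touches up to every statement while the sorted set does a logarithmic
seek, which is the kind of gap between a linear search and an indexed
one.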


Cheers
Niclas


On Thu, Nov 12, 2009 at 5:50 PM, Jacek Sokulski <[email protected]> wrote:
> Here are some snippets from the code
> Persistence layer configuration:
>     private LayerAssembly createInfrastructureLayer(
>             ApplicationAssembly applicationAssembly) throws AssemblyException {
>         LayerAssembly infrastructureLayer = applicationAssembly
>                 .layerAssembly(LAYER_INFRASTRUCTURE);
>
>         // Persistence module
>         ModuleAssembly module = infrastructureLayer
>                 .moduleAssembly(MODULE_PERSISTENCE);
>
>         // Indexing
>         module.addObjects(EntityStateSerializer.class,
>                 EntityTypeSerializer.class);
>         module.addServices(NativeRepositoryService.class).identifiedBy(
>                 "rdf-repository").instantiateOnStartup();
>
>         module.addServices(RdfFactoryService.class).visibleIn(application)
>                 .instantiateOnStartup();
>
>         module.addServices(RdfQueryService.class).visibleIn(application)
>                 .instantiateOnStartup();
>
>         // Entity store
>         module.addServices(JdbmEntityStoreService.class,
>                 UuidIdentityGeneratorService.class).visibleIn(application)
>                 .instantiateOnStartup();
>
>         // Config
>         ModuleAssembly config = module.layerAssembly().moduleAssembly("Config");
>         config.addEntities(JdbmConfiguration.class).visibleIn(Visibility.layer);
>         Preferences jdbmPreferences = Preferences.userRoot().node(
>                 module.layerAssembly().applicationAssembly().name() + "/"
>                         + "Jdbm");
>         jdbmPreferences.put("file", "test.db");
>         config.addEntities(NativeConfiguration.class).visibleIn(
>                 Visibility.application);
>         config.addServices(PreferencesEntityStoreService.class).setMetaInfo(
>                 new PreferencesEntityStoreInfo(jdbmPreferences))
>                 .instantiateOnStartup();
>
>         config.addServices(UuidIdentityGeneratorService.class);
>
>         return infrastructureLayer;
>     }
>
> The test:
> public void testIndexing2() throws ConcurrentEntityModificationException,
>         UnitOfWorkCompletionException {
>         UnitOfWork uow = unitOfWorkFactory.newUnitOfWork();
>         ServiceReference<LeadRepository> leadRepoRef =
>                 serviceLocator.findService(LeadRepositoryService.class);
>         LeadRepository leadRepo = leadRepoRef.get();
>         ServiceReference<LeadEntityFactoryService> leadFactoryRef =
>                 serviceLocator.findService(LeadEntityFactoryService.class);
>         LeadEntityFactory leadFactory = leadFactoryRef.get();
>         long start, end;
>         start = System.currentTimeMillis();
>         for (int i= 0; i<100000; i++){
>             leadFactory.create("Lead"+i);
>         }
>         uow.complete();
>         end = System.currentTimeMillis();
>         System.out.println("Population time: "+ (end-start));
>         uow = unitOfWorkFactory.newUnitOfWork();
>         start = System.currentTimeMillis();
>         Lead lead = leadRepo.findByName("Lead38467");
>         end = System.currentTimeMillis();
>         System.out.println("Lead: " +lead);
>         System.out.println("Retrieval time by name: "+ (end-start));
>         uow.complete();
>     }
> the factory:
> public class LeadEntityFactoryMixin implements LeadEntityFactory{
>     @Structure UnitOfWorkFactory uowf;
>     public Lead create( String name )
>     {
>         UnitOfWork uow = uowf.currentUnitOfWork();
>         EntityBuilder<LeadEntity> builder = uow.newEntityBuilder( LeadEntity.class );
>         Lead prototype = builder.instanceFor( LeadEntity.class );
>         prototype.name().set( name );
>         return builder.newInstance();
>     }
> }
>
> the repository:
> public Lead findByName(String name) {
>         UnitOfWork uow = uowf.currentUnitOfWork();
>         QueryBuilder<Lead> builder = qbf.newQueryBuilder( Lead.class );
>         Lead template = templateFor( Lead.class );
>         Query<Lead> query = builder.where( eq( template.name(), name ) ).newQuery(uow);
>         for(Lead lead: query){
>             return lead;
>         }
>         return null;
>     }
>
> Jacek
>
> 2009/11/11 Niclas Hedhman <[email protected]>
>>
>> On Wed, Nov 11, 2009 at 12:43 AM, Jacek Sokulski <[email protected]> wrote:
>> > Hi,
>> > I have been playing with jdbm and rdf indexing (as far as I know the only
>> > one in Qi4j at this moment).  The results for querying a database with
>> > 100 000 simple entities are as follows:
>> >  * querying by Id takes less than 1 ms
>> >  * finding an entity by one property (string) takes 12 sec; the query is:
>> >     Query<Lead> query = builder.where( eq( template.name(), name ) ).newQuery(uow);
>> > I hoped that rdf indexing would be faster, or at least not slower, than
>> > querying an RDBMS...
>> > Or am I missing something? Is any special configuration required?
>>
>> This doesn't reflect the testcases on the topic. Can you post here or
>> send me what you have used to test this?
>>
>> Btw, are you interested in helping out on performance testing across
>> the board? We really need someone to help out in this field.
>> What I have in mind: performance tests are run on the CI server and
>> produce timelined data output which is plotted on the webpages. I am
>> willing to help set up the infrastructure to do the plots, but
>> need someone dedicated to creating the Qi4j code to execute.
>>
>>
>> Cheers
>> --
>> Niclas Hedhman, Software Developer
>> http://www.qi4j.org - New Energy for Java
>>
>> I  live here; http://tinyurl.com/2qq9er
>> I  work here; http://tinyurl.com/2ymelc
>> I relax here; http://tinyurl.com/2cgsug
>>
>> _______________________________________________
>> qi4j-dev mailing list
>> [email protected]
>> http://lists.ops4j.org/mailman/listinfo/qi4j-dev
>
>



-- 
Niclas Hedhman, Software Developer
http://www.qi4j.org - New Energy for Java

I  live here; http://tinyurl.com/2qq9er
I  work here; http://tinyurl.com/2ymelc
I relax here; http://tinyurl.com/2cgsug
