Rupert, thanks for the advice. I decided to try indexing a dataset from the 
ehealth index source files, called diseasome_dump.nt [9]. I thought it would 
be a good example.

Referring to your previous message:



>As long as your RDF data do define values for rdfs:label the default config is 
>fine. This is also true if you use SKOS, FOAF, Dublin Core Elements/Terms and 
>some other well known RDF schemas. Changing the name field is good. Just make 
>sure you do not change it to "local" or "entityhub" as those names are 
>protected for the use of the Entityhub.

This file includes rdfs:label values, so I changed only the 'name' property in 
the indexing.properties [5] config file, setting it to "diseasome".





>Does the path to the indexing folder contain a space (e.g. "/stanbol/my 
>indexes/")? This may cause problems with the Apache CLI library used by the 
>Entityhub indexing tool. If this is the case please change the path accordingly.

My indexing-dir doesn't contain any spaces: "D:\tomcat7\bin\stanbol\Indexing".





>Make sure your RDF files are located in the correct directory 
>"{indexing-dir}/indexing/resource/rdfdata" where {indexing-dir} is the working 
>directory (the directory where the 
>"org.apache.stanbol.entityhub.indexing.genericrdf-0.10.1-incubating-SNAPSHOT-jar-with-dependencies.jar"
> is located).

I think the RDF file is in the right place: 
".\stanbol\Indexing\indexing\resources\rdfdata\diseasome.nt".
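
To double-check the location, here is a minimal Python sketch that lists the RDF files found under the rdfdata folder. It is only an illustration: the function name find_rdf_files and the list of file suffixes are my own assumptions, and it checks both the "resources" and "resource" spellings because both appear in this thread.

```python
from pathlib import Path

def find_rdf_files(indexing_dir, suffixes=(".nt", ".rdf", ".ttl")):
    """Return {folder: [file names]} for each rdfdata folder that exists.

    The suffix list is an assumption made for this sketch, not a list
    of formats taken from the indexing tool.
    """
    root = Path(indexing_dir) / "indexing"
    hits = {}
    # Both spellings occur in this thread, so look at both.
    for folder in ("resources", "resource"):
        rdfdata = root / folder / "rdfdata"
        if rdfdata.is_dir():
            hits[folder] = sorted(
                p.name for p in rdfdata.iterdir()
                if p.is_file() and p.suffix.lower() in suffixes
            )
    return hits

if __name__ == "__main__":
    # Working directory as given in this message.
    for folder, names in find_rdf_files(r"D:\tomcat7\bin\stanbol\Indexing").items():
        print(folder, "->", names)
```

An empty result would mean that neither folder exists below the given working directory.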





>You can check if your RDF data are loaded by searching in the log for the file 
>name of your rdf file. You should see the following loggings

>    jenatdb.RdfIndexingSource -  > {path}/{rdf-file}

>    source.ResourceLoader -  > loading '{path}/{rdf-file}' ...

Attached is index.log [4], the log from the java -jar 
org.apache.stanbol.entityhub.indexing.genericrdf-0.10.1-incubating-SNAPSHOT-jar-with-dependencies.jar
 init process. It includes these lines:

jenatdb.RdfIndexingSource -  > 
D:\tomcat7\bin\stanbol\Indexing\indexing\resources\rdfdata\diseasome_dump.nt

source.ResourceLoader -  > loading 
'D:\tomcat7\bin\stanbol\Indexing\indexing\resources\rdfdata\diseasome_dump.nt' 
...



>Indexing Process: The indexing process is started after the line

>     [Indexing: Entity Source Reader Deamon] INFO 
> impl.EntityDataBasedIndexingDaemon - ...start iterating over Entity data

index.log [4] also contains this line.
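
For reference, this is roughly how the log can be searched for the three markers mentioned above. It is a minimal sketch: the marker strings are copied from the quoted log lines, while the function name find_markers and the assumption that index.log sits in the current directory are mine.

```python
from pathlib import Path

# Marker strings copied from the log excerpts quoted in this thread.
MARKERS = (
    "jenatdb.RdfIndexingSource",
    "source.ResourceLoader",
    "start iterating over Entity data",
)

def find_markers(log_text, markers=MARKERS):
    """Map each marker to the first 1-based line number containing it, or None."""
    found = {m: None for m in markers}
    for number, line in enumerate(log_text.splitlines(), start=1):
        for m in markers:
            if found[m] is None and m in line:
                found[m] = number
    return found

if __name__ == "__main__":
    log_path = Path("index.log")  # assumed location of the attached log
    if log_path.exists():
        text = log_path.read_text(errors="replace")
        for marker, line_no in find_markers(text).items():
            print(marker, "->", line_no if line_no else "NOT FOUND")
```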





>After indexing completed you should see two files in   
>{indexing-dir}/indexing/disc

>1. {name}.solrindex.zip : This is basically a ZIP archive of the Apache Solr 
>Core that contains the indexed data. You will need to copy this in the 
>"datafiles" directory of your Stanbol instance

>2. org.apache.stanbol.data.site.{name}-1.0.0.jar: This is an OSGI Bundle 
>containing the configurations for the ReferencedSite, Cache and SolrYard. You 
>will need to install this Bundle by using the Bundle Tab

>of the Apache Felix Webconsole

> (http://{stanbol-instance}/system/console/bundles).
>  Look for the  [Install/Update...] button. Click it and in the dialog 
> activate the "Start Bundle" option and add the Bundle. The suggested Start 
> Level is fine.

I have completed both of these steps as described.



>As soon as you complete (2) you should see your referenced Site at 
>http://{stanbol-instance}/entityhub/site/{name}
> and some seconds after completing (1) the Site should be functional.

This is where I have a problem: I can't reach the referenced site at 
http://{stanbol-instance}/entityhub/site/{name}. After the bundle 
installation, the Stanbol error.log [2] contains only one warning:



01.10.2012 13:11:40.206 *WARN* [FelixDispatchQueue] 
org.apache.stanbol.commons.installer.provider.bundle.impl.BundleInstaller  ... 
no Entries found in path 'org\apache\stanbol\data\site\diseasome' configured 
for Bundle 'org.apache.stanbol.data.site.diseasome' with Manifest header field 
'Install-Path'!
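
The path in this warning is printed with backslashes, while entries inside a jar (ZIP) archive are stored with forward slashes. I don't know whether that is related to the problem, but a minimal Python sketch like the following would show which entries the generated bundle actually contains under that path. The function name entries_under is hypothetical, and the jar file name is only assumed from the {name} pattern you described.

```python
import zipfile
from pathlib import Path

def entries_under(jar_path, install_path):
    """List jar entries below install_path, compared with '/' separators."""
    # Normalize a Windows-style path to the separator used inside ZIP archives.
    prefix = install_path.replace("\\", "/").strip("/") + "/"
    with zipfile.ZipFile(jar_path) as jar:
        return [n for n in jar.namelist() if n.startswith(prefix) and n != prefix]

if __name__ == "__main__":
    # Assumed file name, following the org.apache.stanbol.data.site.{name}-1.0.0.jar pattern.
    jar = Path("org.apache.stanbol.data.site.diseasome-1.0.0.jar")
    if jar.exists():
        for name in entries_under(jar, r"org\apache\stanbol\data\site\diseasome"):
            print(name)
```

If this prints entries, the bundle does contain files under that path and only the lookup with backslashes fails to see them; if it prints nothing, the bundle itself would be missing them.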



>If your RDF data do define rdfs:label's then using the default configuration 
>should be fine. To reset previous changes you can delete the 
>"{indexing-dir}/indexing" and reinitialize to the default by calling

>    java -jar 
> org.apache.stanbol.entityhub.indexing.genericrdf-0.10.1-incubating-SNAPSHOT-jar-with-dependencies.jar
>  init

At this point, this approach throws an exception:

Exception in thread "main" java.lang.IllegalArgumentException: Unable to find 
configuration file 'indexing.properties'!

        at 
org.apache.stanbol.entityhub.indexing.core.config.IndexingConfig.loadConfig(IndexingConfig.java:599)

        at 
org.apache.stanbol.entityhub.indexing.core.config.IndexingConfig.<init>(IndexingConfig.java:280)

        at 
org.apache.stanbol.entityhub.indexing.core.IndexerFactory.create(IndexerFactory.java:80)

        at 
org.apache.stanbol.entityhub.indexing.core.IndexerFactory.create(IndexerFactory.java:65)

        at org.apache.stanbol.entityhub.indexing.Main.main(Main.java:66)

The file 'indexing.properties' [5] has to be created manually, as do 
'mappings.txt' [8], 'fieldboosts.properties' [3] and 'entityTypes.properties' 
[1]. I took these files from the Stanbol trunk repository.

The log of this process is init2.log [7].

The log from the run where I created the configuration files manually is 
init.log [6].



> If you like you can also send me your RDF file so that I can try to reproduce 
> your issues.

My RDF file is more than 16 MB:

2012-03-22  10:56        16 997 067 diseasome.nt

But this file is available in the Stanbol trunk repositories [9], so I am not 
sending it.



===========================================

Contents of the indexing.7z file:

Directory of E:\Semantic\indexing



[1]. 2012-09-10  17:43   1 095  entityTypes.properties  (index config file)
[2]. 2012-10-01  13:11  19 953  error.log               (Stanbol error log)
[3]. 2012-09-10  17:39   1 976  fieldboosts.properties  (index config file)
[4]. 2012-10-01  12:00  13 017  index.log               (index process log)
[5]. 2012-10-01  11:30   7 611  indexing.properties     (index config file)
[6]. 2012-10-01  11:58   3 893  init.log                (init process log no. 1)
[7]. 2012-10-01  13:02   1 644  init2.log               (init process log no. 2)
[8]. 2012-09-28  14:08   5 784  mappings.txt            (index config file)

===========================================

[9]. http://dev.iks-project.eu/downloads/stanbol-indices/ehealth/source-files/



I would be very thankful if you could look at it.



Best

Gniewoslaw
