Rupert, thanks for the advice. I decided to try indexing a dataset from the
ehealth index source files, called diseasome_dump.nt [9]. I thought this would
be a good example.
Referring to your previous message:
>As long as your RDF data do define values for rdfs:label the default config is
>fine. This is also true if you use SKOS, FOAF, Dublin Core Elements/Terms and
>some other well known RDF schemas. Changing the name field is good. Just make
>sure you do not change it to "local" or "entityhub" as those names are
>protected for the use of the Entityhub.
This file includes rdfs:label, so I changed only the 'name' property in the
indexing.properties [5] config file to "diseasome".
>Does the path to the indexing folder contain a space (e.g. "/stanbol/my
>indexes/")? This may cause problems with the Apache CLI library used by the
>Entityhub indexing too. If this is the case please change the path accordingly.
My indexing-dir does not contain spaces: "D:\tomcat7\bin\stanbol\Indexing".
>Make sure your RDF files are located in the correct directory
>"{indexing-dir}/indexing/resource/rdfdata" where {indexing-dir} is the working
>directory (the directory where the
>"org.apache.stanbol.entityhub.indexing.genericrdf-0.10.1-incubating-SNAPSHOT-jar-with-dependencies.jar"
> is located.
I think the RDF file is located correctly:
".\stanbol\Indexing\indexing\resources\rdfdata\diseasome.nt".
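The two checks discussed so far (no spaces in the indexing path, RDF file inside the rdfdata folder) could be automated with a minimal sketch like the one below. The function name check_indexing_layout is hypothetical, and the "indexing/resources/rdfdata" layout is an assumption taken from the paths in this message, not something read from the indexing tool itself.

```python
from pathlib import Path


def check_indexing_layout(indexing_dir, rdf_file_name):
    """Return a list of problems found; an empty list means nothing suspicious.

    Assumption: RDF sources live in <indexing_dir>/indexing/resources/rdfdata.
    """
    problems = []
    root = Path(indexing_dir)
    # Whitespace in the path may confuse command-line parsing of the tool.
    if any(ch.isspace() for ch in str(root)):
        problems.append(f"path contains whitespace: {str(root)!r}")
    rdf_file = root / "indexing" / "resources" / "rdfdata" / rdf_file_name
    if not rdf_file.is_file():
        problems.append(f"RDF file not found: {rdf_file}")
    return problems


if __name__ == "__main__":
    # Paths from this message; adjust to the local setup.
    for problem in check_indexing_layout(r"D:\tomcat7\bin\stanbol\Indexing",
                                         "diseasome_dump.nt"):
        print(problem)
```

An empty result only says that the layout looks as expected; it does not prove that the tool actually read the file.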
>You can check if your RDF data are loaded by searching in the log for the file
>name of your rdf file. You should see the following loggings
> jenatdb.RdfIndexingSource - > {path}/{rdf-file}
> source.ResourceLoader - > loading '{path}/{rdf-file}' ...
The attached file index.log [4] contains the log from the java -jar
org.apache.stanbol.entityhub.indexing.genericrdf-0.10.1-incubating-SNAPSHOT-jar-with-dependencies.jar
init process. It includes these lines:
jenatdb.RdfIndexingSource - > D:\tomcat7\bin\stanbol\Indexing\indexing\resources\rdfdata\diseasome_dump.nt
source.ResourceLoader - > loading 'D:\tomcat7\bin\stanbol\Indexing\indexing\resources\rdfdata\diseasome_dump.nt' ...
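Searching the log for the loader lines can also be scripted. The sketch below assumes the log uses the "source.ResourceLoader - > loading '...'" shape quoted in this thread; the function names are hypothetical.

```python
import re

# Pattern modelled on the log line quoted in this thread (an assumption about
# the exact format); \s also matches line breaks in case the line was wrapped.
LOADER_PATTERN = re.compile(
    r"source\.ResourceLoader\s*-\s*>\s*loading\s+'([^']+)'"
)


def loaded_rdf_files(log_text):
    """Return every path the ResourceLoader reported as being loaded."""
    return LOADER_PATTERN.findall(log_text)


def was_loaded(log_text, file_name):
    """True if any loaded path ends with the given file name."""
    return any(path.endswith(file_name) for path in loaded_rdf_files(log_text))
```

For example, was_loaded(open("index.log").read(), "diseasome_dump.nt") would report whether the loader line for that file is present.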
>Indexing Process: The indexing process is started after the line
> [Indexing: Entity Source Reader Deamon] INFO
> impl.EntityDataBasedIndexingDaemon - ...start iterating over Entity data
index.log [4] also contains this line.
>After indexing completed you should see two files in
>{indexing-dir}/indexing/disc
>1. {name}.solrindex.zip : This is basically a ZIP archive of the Apache Solr
>Core that contains the indexed data. You will need to copy this in the
>"datafiles" directory of your Stanbol instance
>2. org.apache.stanbol.data.site.{name}-1.0.0.jar: This is an OSGI Bundle
>containing the configurations for the ReferencedSite, Cache and SolrYard. You
>will need to install this Bundle by using the Bundle Tab
>of the Apache Felix Webconsole
> (http://{stanbol-instance}/system/console/bundles).
> Look for the [Install/Update...] button. Click it and in the dialog
> activate the "Start Bundle" option and add the Bundle. The suggested Start
> Level is fine.
I have completed both steps as described.
>As soon as you complete (2) you should see your referenced Site at
>http://{stanbol-instance}/entityhub/site/{name}
> and some seconds after completing (1) the Site should be functional.
This is where I have a problem. I cannot reach the referenced site at
http://{stanbol-instance}/entityhub/site/{name}.
After the bundle installation, the Stanbol error.log [2] contains only a
warning:
01.10.2012 13:11:40.206 *WARN* [FelixDispatchQueue]
org.apache.stanbol.commons.installer.provider.bundle.impl.BundleInstaller ...
no Entries found in path 'org\apache\stanbol\data\site\diseasome' configured
for Bundle 'org.apache.stanbol.data.site.diseasome' with Manifest header field
'Install-Path'!
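The warning prints the Install-Path with backslashes, while entries inside a JAR are addressed with forward slashes. Whether that mismatch is the actual cause here is only a guess. A minimal sketch like the following could at least show what the generated bundle contains. The helper names are hypothetical; "Install-Path" is the manifest header named in the warning.

```python
import zipfile


def manifest_headers(jar_path):
    """Parse the main section of META-INF/MANIFEST.MF into a dict."""
    with zipfile.ZipFile(jar_path) as jar:
        raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    headers = {}
    last_key = None
    for line in raw.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            headers[last_key] += line[1:]  # continuation of a wrapped value
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            headers[last_key] = value.lstrip(" ")
    return headers


def entries_under_install_path(jar_path):
    """Return (Install-Path value, file entries found below that path)."""
    install_path = manifest_headers(jar_path).get("Install-Path")
    if install_path is None:
        return None, []
    # Normalise backslashes, since JAR entry names use '/' as separator.
    prefix = install_path.replace("\\", "/").strip("/") + "/"
    with zipfile.ZipFile(jar_path) as jar:
        entries = [name for name in jar.namelist()
                   if name.startswith(prefix) and not name.endswith("/")]
    return install_path, entries
```

If the second value is non-empty while the first contains backslashes, the configuration files are in the bundle and only the separator in the header differs.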
>If your RDF data do define rdfs:label's than using the default configuration
>should be fine. To reset previous changes you can delete the
>"{indexing-dir}/indexing" and reinitialize to the default by calling
> java -jar
> org.apache.stanbol.entityhub.indexing.genericrdf-0.10.1-incubating-SNAPSHOT-jar-with-dependencies.jar
> init
At this point, this method throws an exception:
Exception in thread "main" java.lang.IllegalArgumentException: Unable to find configuration file 'indexing.properties'!
    at org.apache.stanbol.entityhub.indexing.core.config.IndexingConfig.loadConfig(IndexingConfig.java:599)
    at org.apache.stanbol.entityhub.indexing.core.config.IndexingConfig.<init>(IndexingConfig.java:280)
    at org.apache.stanbol.entityhub.indexing.core.IndexerFactory.create(IndexerFactory.java:80)
    at org.apache.stanbol.entityhub.indexing.core.IndexerFactory.create(IndexerFactory.java:65)
    at org.apache.stanbol.entityhub.indexing.Main.main(Main.java:66)
The file 'indexing.properties' [5] has to be created manually, as do
'mappings.txt' [8], 'fieldboosts.properties' [3] and 'entityTypes.properties'
[1]. I took these files from the Stanbol trunk repository.
The log of this process is init2.log [7].
The log of the indexing process after I created the configuration files
manually is init.log [6].
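To see which of the four configuration files named above are missing before running the tool, a small check along these lines could help. The function name is hypothetical, and the directory to pass in (for example "{indexing-dir}/indexing/config") is an assumption about where the tool looks for them.

```python
from pathlib import Path

# File names as listed in this message.
EXPECTED_CONFIG_FILES = (
    "indexing.properties",
    "mappings.txt",
    "fieldboosts.properties",
    "entityTypes.properties",
)


def missing_config_files(config_dir):
    """Return the expected configuration files that are absent from config_dir."""
    config_dir = Path(config_dir)
    return [name for name in EXPECTED_CONFIG_FILES
            if not (config_dir / name).is_file()]
```

An empty list means all four files are present in the given directory.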
> If you like you can also sent me your RDF file so that I can try to reproduce
> your issues.
My RDF file is larger than 16 MB:
2012-03-22 10:56 16 997 067 diseasome.nt
But this file is provided in the Stanbol trunk repositories [9], so I am not
sending it.
===========================================
Contents of the indexing.7z file:
Directory of E:\Semantic\indexing
[1]. 2012-09-10 17:43   1 095  entityTypes.properties  (index config file)
[2]. 2012-10-01 13:11  19 953  error.log               (Stanbol error log)
[3]. 2012-09-10 17:39   1 976  fieldboosts.properties  (index config file)
[4]. 2012-10-01 12:00  13 017  index.log               (index process log)
[5]. 2012-10-01 11:30   7 611  indexing.properties     (index config file)
[6]. 2012-10-01 11:58   3 893  init.log                (init process log no. 1)
[7]. 2012-10-01 13:02   1 644  init2.log               (init process log no. 2)
[8]. 2012-09-28 14:08   5 784  mappings.txt            (index config file)
===========================================
[9]. http://dev.iks-project.eu/downloads/stanbol-indices/ehealth/source-files/
I would be very grateful if you could take a look at it.
Best
Gniewoslaw