Hi Amindri

Based on the code the NPE could originate from a namespace prefix
unknown to the namespace prefix service.

Can you please check the data in the "incoming_links.txt" file against
the mappings defined in the "indexing/config/namespaceprefix.mappings"
file? My guess is that "incoming_links.txt" uses a prefix that is
not defined in the mappings file.
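
If checking by hand is tedious, a small script can do the comparison. The
following is only a rough sketch in Python, not part of Stanbol. It assumes
that each non-comment line of the mappings file starts with the prefix,
followed by whitespace and the namespace, and that entity ids in
"incoming_links.txt" may appear as prefix:localname in any
whitespace-separated field. Both format assumptions, and the function names
load_prefixes, unknown_prefixes and check_files, are mine, so please adapt
them to what your files actually look like.

```python
import re

# A leading "prefix:" that is NOT the scheme of a full URI such as "http://...".
# Note: schemes without "//" (e.g. "urn:") would also be reported as prefixes.
PREFIX_RE = re.compile(r"^([A-Za-z][\w.-]*):(?!//)")


def load_prefixes(mapping_lines):
    """Collect prefixes from mapping lines (assumed: prefix, whitespace, namespace)."""
    prefixes = set()
    for line in mapping_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        prefixes.add(line.split(None, 1)[0])
    return prefixes


def unknown_prefixes(link_lines, known):
    """Return prefixes used in any field of the given lines but missing from known."""
    missing = set()
    for line in link_lines:
        for field in line.split():
            match = PREFIX_RE.match(field)
            if match and match.group(1) not in known:
                missing.add(match.group(1))
    return missing


def check_files(mappings_path, links_path):
    """Hypothetical helper: compare the two files and return the sorted unknown prefixes."""
    with open(mappings_path, encoding="utf-8") as f:
        known = load_prefixes(f)
    with open(links_path, encoding="utf-8") as f:
        return sorted(unknown_prefixes(f, known))
```

Calling check_files("indexing/config/namespaceprefix.mappings",
"indexing/resources/incoming_links.txt") from the indexing working directory
would list every prefix that has no mapping. An empty list means the NPE
probably has a different cause.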

It is recommended to explicitly define namespace prefix mappings for
all namespaces used by the indexing process (config data and RDF
data). For missing mappings, http://prefix.cc/ is used as a fallback.

best
Rupert


On Mon, Feb 9, 2015 at 7:33 AM, Amindri Udugala
<amindriudug...@gmail.com> wrote:
> Hi All,
>
> I need to create an index from a Freebase data dump. So I followed the
> instructions in the README file in entityhub\indexing\freebase.
>
> First I executed java -jar
> org.apache.stanbol.entityhub.indexing.freebase-1.0.0-SNAPSHOT.jar init, to
> generate the folder structure. The folder structure was successfully
> generated except for the following warnings
>
> 16:16:20,530 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'nsogi' valid , namespace 'http://prefix.cc/nsogi:'
> invalid -> mapping ignored!
> 16:16:21,279 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'category' valid , namespace '
> http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
> 16:16:21,435 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'chebi' valid , namespace '
> http://bio2rdf.org/chebi:' invalid -> mapping ignored!
> 16:16:21,435 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'hgnc' valid , namespace 'http://bio2rdf.org/hgnc:'
> invalid -> mapping ignored!
> 16:16:21,450 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'dbptmpl' valid , namespace '
> http://dbpedia.org/resource/Template:' invalid -> mapping ignored!
> 16:16:21,638 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'pubmed' valid , namespace '
> http://bio2rdf.org/pubmed_vocabulary:' invalid -> mapping ignored!
> 16:16:21,638 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'dbc' valid , namespace '
> http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
> 16:16:21,638 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'dbt' valid , namespace '
> http://dbpedia.org/resource/Template:' invalid -> mapping ignored!
> 16:16:21,638 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'dbrc' valid , namespace '
> http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
> 16:16:21,809 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'call' valid , namespace '
> http://webofcode.org/wfn/call:' invalid -> mapping ignored!
> 16:16:21,809 [main] WARN  impl.NamespacePrefixProviderImpl - Invalid
> Namespace Mapping: prefix 'affymetrix' valid , namespace '
> http://bio2rdf.org/affymetrix_vocabulary:' invalid -> mapping ignored!
>
> Then I copied the Freebase dump (freebase-rdf-latest.gz) to the
> indexing/resources/rdfdata folder
> and the incoming_links.txt file, generated by fbrankings-uri.sh  to
> indexing/resources folder and executed the indexing process. (I used all
> the default config files)
>
> While executing the index process I noticed the following log.
>
>
> 16:38:40,806 [main] INFO  core.IndexerFactory -  - EntityDataIterable: null
> 16:38:40,806 [main] INFO  core.IndexerFactory -  - EntityIterator:
> org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator@1880249c
> 16:38:40,806 [main] INFO  core.IndexerFactory -  - EntityDataProvider:
> org.apache.stanbol.entityhub.indexing.source.jenatdb.RdfIndexingSource@4e38a55
> 16:38:40,806 [main] INFO  core.IndexerFactory -  - EntityScoreProvider: null
>
> Finally it threw a null pointer exception as follows
>
> 16:38:40,837 [Thread-3] INFO  source.ResourceLoader -  ... 1 files imported
> in 0 seconds
> 16:38:40,837 [Thread-3] INFO  source.ResourceLoader - Loding 0 File ...
> 16:38:40,837 [Thread-3] INFO  source.ResourceLoader -  ... 0 files imported
> in 0 seconds
> 16:38:42,912 [Thread-0] INFO  solryard.SolrYardIndexingDestination -    ...
> create SolrYard
> 16:38:42,959 [main] INFO  impl.IndexerImpl -  ... delete existing
> IndexedEntityId file
> C:\cygwin64\home\User\code\stanbol_indexing\indexing\destination\indexed-entities-ids.zip
> 16:38:42,974 [main] INFO  impl.IndexerImpl - Initialisation completed
> 16:38:42,974 [main] INFO  impl.IndexerImpl -   ... initialisation completed
> 16:38:42,974 [main] INFO  impl.IndexerImpl - start indexing ...
> 16:38:42,974 [main] INFO  impl.IndexerImpl - Indexing started ...
> Exception in thread "Indexing: Entity Source Reader Deamon"
> java.lang.NullPointerException
>         at java.lang.StringBuilder.<init>(Unknown Source)
>         at
> org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator.parseEntityFormLine(LineBasedEntityIterator.java:435)
>         at
> org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator.getNext(LineBasedEntityIterator.java:379)
>         at
> org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator.hasNext(LineBasedEntityIterator.java:356)
>         at
> org.apache.stanbol.entityhub.indexing.core.impl.EntityIdBasedIndexingDaemon.run(EntityIdBasedIndexingDaemon.java:55)
>         at java.lang.Thread.run(Unknown Source)
>
> I'm not sure if this happens because I haven't configured an important
> property in a configuration file. I'm pretty new to Stanbol and any help
> would be much appreciated.
>
> Thanks in advance.
> --
> Regards
> Amindri Udugala



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO 
..........................................................................
| http://redlink.co/
