Dear all,

I'm trying to deploy a local installation of DBpedia Spotlight for Spanish, following this step-by-step guide:

https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Internationalization

I have downloaded all the files for Spanish using the download.sh script. I'm fairly sure I have configured the indexing.properties file properly for Spanish, but I'm attaching it anyway. I have also cleaned my Maven repository to rule out any errors caused by library or Maven plugin versions.

DBpedia Spotlight compiles correctly. The problem appears when I run the index.sh script, more precisely at this line:

mvn scala:run -Dlauncher=ExtractCandidateMap "-DjavaOpts.Xmx=$JAVA_XMX" "-DaddArgs=$INDEX_CONFIG_FILE"

where I get the following warning:

[WARNING] Not mainClass or valid launcher found/define

I haven't been able to find any launcher called ExtractCandidateMap in any pom.xml in the whole project. The build doesn't stop there, but because I'm getting an empty occs.tsv file, the build finally fails with this error:

[INFO] launcher 'IndexMergedOccurrences' selected => org.dbpedia.spotlight.lucene.index.IndexMergedOccurrences
INFO 2012-11-26 18:08:36,962 main [IndexingConfiguration] - Loading configuration file ../conf/indexing.properties
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at scala_maven_executions.MainHelper.runMain(MainHelper.java:164)
    at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:319)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
    at java.io.InputStreamReader.read(InputStreamReader.java:167)
    at java.io.BufferedReader.fill(BufferedReader.java:136)
    at java.io.BufferedReader.readLine(BufferedReader.java:299)
    at java.io.BufferedReader.readLine(BufferedReader.java:362)
    at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:67)
    at scala.collection.Iterator$class.foreach(Iterator.scala:660)
    at scala.io.BufferedSource$BufferedLineIterator.foreach(BufferedSource.scala:43)
    at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:143)
    at scala.io.BufferedSource$BufferedLineIterator.foldLeft(BufferedSource.scala:43)
    at scala.collection.TraversableOnce$class.$div$colon(TraversableOnce.scala:137)
    at scala.io.BufferedSource$BufferedLineIterator.$div$colon(BufferedSource.scala:43)
    at scala.collection.SetLike$class.$plus$plus(SetLike.scala:128)
    at scala.collection.immutable.Set$EmptySet$.$plus$plus(Set.scala:52)
    at scala.collection.TraversableOnce$class.toSet(TraversableOnce.scala:252)
    at scala.io.BufferedSource$BufferedLineIterator.toSet(BufferedSource.scala:43)
    at org.dbpedia.spotlight.util.IndexingConfiguration.getStopWords(IndexingConfiguration.scala:93)
    at org.dbpedia.spotlight.util.IndexingConfiguration.getAnalyzer(IndexingConfiguration.scala:106)
    at org.dbpedia.spotlight.lucene.index.IndexMergedOccurrences$.main(IndexMergedOccurrences.scala:92)
    at org.dbpedia.spotlight.lucene.index.IndexMergedOccurrences.main(IndexMergedOccurrences.scala)
    ... 6 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
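
Looking at the trace, the exception is raised while IndexingConfiguration.getStopWords reads a file line by line, and as far as I know a MalformedInputException means the decoder hit bytes that are not valid in the charset being used. So one thing I can check is whether my stop word list is valid UTF-8. I am assuming here that Spotlight reads that file as UTF-8, which I have not confirmed in the source. A minimal sketch of the check in Python (the helper name first_invalid_utf8_offset is my own, not part of Spotlight):

```python
from pathlib import Path


def first_invalid_utf8_offset(path):
    """Return the byte offset of the first invalid UTF-8 sequence in the file,
    or None if the whole file decodes cleanly as UTF-8."""
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None
```

Calling it on /usr/local/spotlight/dbpedia_data/data/stopwords.es.list should return None if the file is clean UTF-8. A number would suggest the file is in some other encoding (ISO-8859-1, for instance) and may need converting before indexing.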

Any clue about what could be happening?

Thanks in advance

# Wikipedia Dump
# --------------
org.dbpedia.spotlight.data.wikipediaDump = /usr/local/spotlight/dbpedia_data/original/wikipedia/es/eswiki-latest-pages-articles.xml.bz2

# Location for DBpedia resources index (output)
org.dbpedia.spotlight.index.dir = /usr/local/spotlight/dbpedia_data/data/output/index
org.dbpedia.spotlight.index.minDocsBeforeFlush = 40000

# DBpedia Datasets
# ----------------
org.dbpedia.spotlight.data.labels = /usr/local/spotlight/dbpedia_data/original/dbpedia/es/labels_es.nt.bz2
org.dbpedia.spotlight.data.redirects = /usr/local/spotlight/dbpedia_data/original/dbpedia/es/redirects_es.nt.bz2
org.dbpedia.spotlight.data.disambiguations = /usr/local/spotlight/dbpedia_data/original/dbpedia/es/disambiguations_es.nt.bz2
org.dbpedia.spotlight.data.instanceTypes = /usr/local/spotlight/dbpedia_data/original/dbpedia/es/instance_types_es.nt.bz2

# Files created from DBpedia Datasets
# -----------------------
org.dbpedia.spotlight.data.conceptURIs = /usr/local/spotlight/dbpedia_data/data/output/conceptURIs.list
org.dbpedia.spotlight.data.redirectsTC = /usr/local/spotlight/dbpedia_data/data/output/redirects_tc.tsv
org.dbpedia.spotlight.data.surfaceForms = /usr/local/spotlight/dbpedia_data/data/output/surfaceForms.tsv

# Language-specific config
# --------------
org.dbpedia.spotlight.language = Spanish
org.dbpedia.spotlight.language_i18n_code = es
org.dbpedia.spotlight.lucene.analyzer = org.apache.lucene.analysis.es.SpanishAnalyzer
org.dbpedia.spotlight.lucene.version = LUCENE_36

# Internationalization (i18n) support -- work in progress
org.dbpedia.spotlight.default_namespace = http://es.dbpedia.org/resource/
org.dbpedia.spotlight.default_ontology= http://es.dbpedia.org/ontology/ 

# Stop word list
org.dbpedia.spotlight.data.stopWords.spanish = /usr/local/spotlight/dbpedia_data/data/stopwords.es.list


# URI patterns that should not be indexed. e.g. List_of_*
org.dbpedia.spotlight.data.badURI.spanish=/usr/local/spotlight/dbpedia_data/data/blacklistedURIPatterns.es.list

# Will discard surface forms that are too long (reduces complexity of spotting and generally size in disk/memory)
org.dbpedia.spotlight.data.maxSurfaceFormLength = 50
# Will index only words closest to resource occurrence
org.dbpedia.spotlight.data.maxContextWindowSize = 200
org.dbpedia.spotlight.data.minContextWindowSize = 0

# Other files
org.dbpedia.spotlight.data.priors = /home/pablo/eval/grounder/gold/g1b_spotlight.words.uris.counts

# Yahoo! Boss properties
# ----------------------
# application ID
org.dbpedia.spotlight.yahoo.appID = 
# number of results returned at for one query (maximum: 50)
org.dbpedia.spotlight.yahoo.maxResults = 50
# number of iteration; each iteration returns YahooBossResults results
org.dbpedia.spotlight.yahoo.maxIterations = 100
## important for Yahoo! Boss query string: both language and region must be set according to
## http://developer.yahoo.com/search/boss/boss_guide/supp_regions_lang.html
org.dbpedia.spotlight.yahoo.language = en
org.dbpedia.spotlight.yahoo.region = us