What about this:
mvn scala:run -Dlauncher=Server "-DjavaOpts.Xmx=50G" "-DaddArgs=../conf/server.properties"
And change this setting in server.properties:
org.dbpedia.spotlight.core.database = lucene
Also, let it run for a while until it "warms up": send a few hundred
requests first, and only then start measuring the time.
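
A minimal sketch of that warm-up-then-measure idea, in Python. The names
time_after_warmup and send_request are hypothetical (not part of
Spotlight): send_request stands for whatever function submits one text to
your server's REST endpoint, and the default of 300 warm-up requests is
just an assumed stand-in for "a few hundred".

```python
import time


def time_after_warmup(send_request, texts, warmup=300):
    """Send every text in order; return the elapsed seconds per request,
    keeping only the requests made after the first `warmup` ones."""
    timings = []
    for i, text in enumerate(texts):
        start = time.perf_counter()
        send_request(text)  # hypothetical callable: one annotation request
        elapsed = time.perf_counter() - start
        if i >= warmup:  # timings of the warm-up requests are discarded
            timings.append(elapsed)
    return timings
```

Averaging the returned list (for example with statistics.mean) gives a
per-request time that excludes the cold-start period, which makes for a
fairer comparison against the live service.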
Cheers,
Pablo
On Tue, Nov 13, 2012 at 6:22 PM, Essam Elsherif <[email protected]> wrote:
> I am using mvn scala:run '-DaddArgs=../conf/server.properties'
>
> I increased the RAM to 64G. Performance is still the same. Below is the
> requested info.
>
> ========================./config/server.properties=======================
> # Server hostname and port to be used by DBpedia Spotlight REST API
> org.dbpedia.spotlight.web.rest.uri = http://localhost:2222/rest
>
> # Internationalization (i18n) support -- work in progress
> org.dbpedia.spotlight.default_namespace = http://dbpedia.org/resource/
> org.dbpedia.spotlight.default_ontology= http://dbpedia.org/ontology/
> # Defines the languages the system should support.
> org.dbpedia.spotlight.language = English
> org.dbpedia.spotlight.language_i18n_code = en
> # Stop word list
> # An example can be downloaded from: http://spotlight.dbpedia.org/download/release-0.4/stopwords.en.list
> org.dbpedia.spotlight.data.stopWords.english = /data/spotlight/data/stopwords.en.list
> org.dbpedia.spotlight.data.stopWords.portuguese = /data/spotlight/data/stopwords.pt.list
>
> #----- SPOTTING -------
>
> # Comma-separated list of spotters to load.
> # Accepted values are LingPipeSpotter,WikiMarkupSpotter,AtLeastOneNounSelector,CoOccurrenceBasedSelector,NESpotter,OpenNLPNGramSpotter,OpenNLPChunkerSpotter,KeaSpotter
> # Some spotters may require extra files and config parameters. See org.dbpedia.spotlight.model.SpotterConfiguration
> org.dbpedia.spotlight.spot.spotters = LingPipeSpotter,WikiMarkupSpotter
> org.dbpedia.spotlight.spot.selectors = ShortSurfaceFormSelector
>
> # Path to serialized LingPipe dictionary used by LingPipeSpotter
> org.dbpedia.spotlight.spot.dictionary = /data/spotlight/data/compact/surface_forms-Wikipedia-TitRedDis.uriThresh75.tsv.spotterDictionary
> #org.dbpedia.spotlight.spot.allowOverlap = false
> #org.dbpedia.spotlight.spot.caseSensitive = true
>
> # Configurations for the CoOccurrenceBasedSelector
> # From: http://spotlight.dbpedia.org/download/release-0.5/spot_selector.tgz
> org.dbpedia.spotlight.spot.cooccurrence.datasource = ukwac
> org.dbpedia.spotlight.spot.cooccurrence.database.jdbcdriver = org.hsqldb.jdbcDriver
> org.dbpedia.spotlight.spot.cooccurrence.database.connector = jdbc:hsqldb:file:/data/spotlight/data/spotsel/ukwac_candidate.script;shutdown=true&readonly=true
> org.dbpedia.spotlight.spot.cooccurrence.database.user = sa
> org.dbpedia.spotlight.spot.cooccurrence.database.password =
> org.dbpedia.spotlight.spot.cooccurrence.classifier.unigram = /data/spotlight/data/spotsel/ukwac_unigram.model
> org.dbpedia.spotlight.spot.cooccurrence.classifier.ngram = /data/spotlight/data/spotsel/ukwac_ngram.model
>
> # Path to serialized HMM model for LingPipe-based POS tagging. Required by AtLeastOneNounSelector and CoOccurrenceBasedSelector
> org.dbpedia.spotlight.tagging.hmm = /data/spotlight/data/pos-en-general-brown.HiddenMarkovModel
> # Path to dir containing several OpenNLP models for NER, chunking, etc. This is required for spotters that are based on OpenNLP.
> # Can be downloaded from http://spotlight.dbpedia.org/download/release-0.5/opennlp_models.tgz
> org.dbpedia.spotlight.spot.opennlp.dir = /data/spotlight/data/data//spotlight/3.7/opennlp
> org.dbpedia.spotlight.spot.opennlp.person= http://dbpedia.org/ontology/Person
> org.dbpedia.spotlight.spot.opennlp.organization= http://dbpedia.org/ontology/Organisation
> org.dbpedia.spotlight.spot.opennlp.location= http://dbpedia.org/ontology/Place
>
>
> # EXPERIMENTAL! Path to Kea Model
> org.dbpedia.spotlight.spot.kea.model = /data/spotlight/3.7/kea/keaModel-1-3-1
>
>
> #----- CANDIDATE SELECTION -------
>
> # Choose between jdbc or lucene for DBpedia Resource creation. Also, if the jdbc throws an error, lucene will be used.
> org.dbpedia.spotlight.core.database = jdbc
> org.dbpedia.spotlight.core.database.jdbcdriver = org.hsqldb.jdbcDriver
> org.dbpedia.spotlight.core.database.connector = jdbc:hsqldb:file:/data/spotlight/data/dbpedia-spotlight-db;shutdown=true&readonly=true
> org.dbpedia.spotlight.core.database.user = sa
> org.dbpedia.spotlight.core.database.password =
>
> # From http://spotlight.dbpedia.org/download/release-0.5/candidate-index-full.tgz
> org.dbpedia.spotlight.candidateMap.dir = /data/spotlight/data/candidateIndexTitRedDis
> org.dbpedia.spotlight.candidateMap.loadToMemory = true
> # Path to Lucene index containing only the candidate map. It is used by document-oriented disambiguators such as Document,TwoStepDisambiguator
> # Only used if one such disambiguator is loaded. Data is at: http://spotlight.dbpedia.org/download/release-0.5/candidate-index-full.tgz
> #org.dbpedia.spotlight.candidateMap.dir = dist/src/deb/control/data/usr/share/dbpedia-spotlight/index
>
>
> #----- DISAMBIGUATION -------
>
> # List of disambiguators to load: Document,Occurrences,CuttingEdge,Default
> org.dbpedia.spotlight.disambiguate.disambiguators = Occurrences
>
> # Path to a directory containing Lucene index files. These can be downloaded from the website or created by org.dbpedia.spotlight.lucene.index.IndexMergedOccurrences
> org.dbpedia.spotlight.index.dir =/dev/shm/temp/medium/index-withSF-withTypes-compressed
> # Will attempt to load into RAM (the potentially huge) index from "org.dbpedia.spotlight.index.dir"
> org.dbpedia.spotlight.index.loadToMemory = true
> # Class used to process context around DBpedia mentions (tokenize, stem, etc.)
> org.dbpedia.spotlight.lucene.analyzer = org.apache.lucene.analysis.en.EnglishAnalyzer
> org.dbpedia.spotlight.lucene.version = LUCENE_36
> # How large can the cache be for ICFDisambiguator.
> jcs.default.cacheattributes.MaxObjects = 15000
>
>
> #----- LINKING / FILTERING -------
>
> # Configuration for SparqlFilter
> org.dbpedia.spotlight.sparql.endpoint = http://dbpedia.org/sparql
> org.dbpedia.spotlight.sparql.graph = http://dbpedia.org
>
>
> ===========End=============./config/server.properties=======================
>
> ===========Begin=============free -m -t=================================
> [aelshes@nj2utaepxapp01 dbpedia-spotlight]$ free -m -t
>              total       used       free     shared    buffers     cached
> Mem:         64458      56447       8011          0        106      26498
> -/+ buffers/cache:      29842      34616
> Swap:         1983       1549        433
> Total:       66442      57997       8445
>
> ===========End=============free -m -t=================================
>
> ===========Begin=============vmstat 5=================================
> [aelshes@nj2utaepxapp01 dbpedia-spotlight]$ vmstat 5
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>  r  b    swpd     free   buff    cache   si   so    bi    bo   in   cs us sy  id wa st
>  1  0 1587196  8511184 109284 27134160    2    3   129   153   29   21  1  0  98  0  0
>  1  0 1587196  8511184 109288 27134160    0    0     0    18 1015  490 25  0  75  0  0
>  1  0 1587196  8845116 109296 27134160    0    0     0     9 1010  499 26  0  74  0  0
>  2  0 1587196  9149660 109308 27134160    0    0     0     7 1015  559 26  0  73  0  0
>  1  0 1587196  9469704 109308 27134160    0    0     0     0 1011  538 27  0  73  0  0
>  1  0 1587196  9469580 109312 27134160    0    0     0    31 1014  491 25  0  75  0  0
>  1  0 1587196  9761352 109324 27134164    0    0     0    38 1016  590 26  0  74  0  0
>  1  0 1587196 10062044 109328 27134164    0    0     0     3 1017  487 26  0  73  0  0
>  1  0 1587196 10333364 109328 27134164    0    0     0     4 1015  483 26  0  74  0  0
>  1  0 1587196 10333364 109340 27134164    0    0     0    13 1012  471 25  0  75  0  0
>  0  0 1587196 10609884 109348 27134164    0    0     0     6 1021  514 24  0  76  0  0
>  0  0 1587196 10609884 109352 27134164    0    0     0     9 1010  491  0  0 100  0  0
> ===========End=============vmstat 5=================================
>
> ------------------------------
> *From:* Pablo N. Mendes <[email protected]>
> *To:* Essam Elsherif <[email protected]>
> *Cc:* "[email protected]" <
> [email protected]>
> *Sent:* Tuesday, November 13, 2012 12:02 PM
>
> *Subject:* Re: [Dbp-spotlight-users] Performance is Slower when loading
> the index
>
>
> Can you share your config files, the command line that you are using to
> run it, and the output of vmstat or "free -m -t" while the system is
> running?
>
>
> On Tue, Nov 13, 2012 at 5:39 PM, Essam Elsherif <[email protected]> wrote:
>
> I successfully loaded the compact index and candidate map on a machine
> with 32G of memory and 2 CPUs. Running the build from the trunk, my
> instance is still three times slower than the live Spotlight version. Any
> clue what the reason could be?
>
> Any help is appreciated.
>
> Thanks,
> Essam
>
> ------------------------------
> *From:* Essam Elsherif <[email protected]>
> *To:* Pablo N. Mendes <[email protected]>
> *Cc:* "[email protected]" <
> [email protected]>
> *Sent:* Wednesday, November 7, 2012 2:05 PM
>
> *Subject:* Re: [Dbp-spotlight-users] Performance is Slower when loading
> the index
>
> Yes, around 24G is being used during annotation. Below is the output from
> top.
>
> I do not have a conf directory under rest.
>
> I used the command below. Performance is still slow.
>
> Thanks,
> Essam
>
>
>
>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
> 26683 root      19   0 30.6g  20g  20m S 198.8 63.7  22:51.34 java
>     1 root      15   0 10364  564  528 S   0.0  0.0   0:00.74 init
>     2 root      RT  -5     0    0    0 S   0.0  0.0   0:00.18 migration/0
>     3 root      34  19     0    0    0 S   0.0  0.0   0:00.00 ksoftirqd/0
>     4 root      RT  -5     0    0    0 S   0.0  0.0   0:00.13 migration/1
>
>
> ------------------------------
> *From:* Pablo N. Mendes <[email protected]>
> *To:* Essam Elsherif <[email protected]>
> *Cc:* "[email protected]" <
> [email protected]>
> *Sent:* Wednesday, November 7, 2012 1:02 PM
> *Subject:* Re: [Dbp-spotlight-users] Performance is Slower when loading
> the index
>
>
> Do you observe that about 24G is filled when annotation is running? If
> not, you might not have successfully configured server.properties to
> load things into memory.
>
> From your command line it seems you have a "conf" directory under the
> "rest" module in addition to the original "conf" that sits on the project
> root? Did you also try:
>
> cd rest
> mvn scala:run -Dlauncher=Server "-DjavaOpts.Xmx=26G" "-DaddArgs=../conf/server.properties"
>
>
> Cheers,
> Pablo
>
>
> On Wed, Nov 7, 2012 at 6:56 PM, Essam Elsherif <[email protected]> wrote:
>
> I have 32G of memory. I set the -Xmx in rest/pom.xml to 26G.
>
> I am using mvn scala:run '-DaddArgs=./conf/server.properties' to run the
> server
>
> free -m shows most of the 2G swap is free while running the annotation.
>
>
> Thanks,
> Essam
>
> ------------------------------
> *From:* Pablo N. Mendes <[email protected]>
> *To:* Essam Elsherif <[email protected]>
> *Cc:* "[email protected]" <
> [email protected]>
> *Sent:* Wednesday, November 7, 2012 11:42 AM
> *Subject:* Re: [Dbp-spotlight-users] Performance is Slower when loading
> the index
>
>
> Perhaps the system is paging? How much swap is used when the system is
> running?
> How much memory do you have?
> What command line did you use?
> What is the -Xmx specified in your pom.xml?
> Cheers
> pablo
> On Nov 7, 2012 10:21 AM, "Essam Elsherif" <[email protected]>
> wrote:
>
> Hi,
> I built Spotlight from the latest source and I am trying to run it on a
> server with 32G of RAM. When I load the compact index
> "index-withSF-withTypes-compressed" into memory, annotation is much
> slower than when I do not load it. Both are slower than the live Spotlight
> anyway. Any idea what is going on here?
>
> Thanks,
> essam
>
>
> _______________________________________________
> Dbp-spotlight-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>
> --
> ---
> Pablo N. Mendes
> http://pablomendes.com
> Events: http://wole2012.eurecom.fr/
>
--
---
Pablo N. Mendes
http://pablomendes.com
Events: http://wole2012.eurecom.fr