The "GeoTopicParser" page has been changed by ChrisMattmann:
https://wiki.apache.org/tika/GeoTopicParser?action=diff&rev1=5&rev2=6

Comment:
- add example for Tika Server and link/credit to geonames.org

  
  The GeoTopicParser combines a Gazetteer (a lookup dictionary that maps place 
names to latitudes and longitudes) with a Named Entity Recognition (NER) 
modeling technique that identifies names and places in text. Together they 
provide a way to geotag documents and text, i.e., to identify places in the 
text and then look up the latitude/longitude pairs for those places.
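
As a rough illustration of the gazetteer-plus-NER idea, here is a toy Python sketch. It is not how GeoTopicParser is implemented: the real parser relies on an OpenNLP model and a Lucene-backed Geonames index, whereas this sketch assumes a hypothetical in-memory dictionary (`GAZETTEER`) with approximate coordinates and uses plain string matching as a stand-in for NER.

```python
import re

# Hypothetical toy gazetteer: place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "China": (35.0, 105.0),
    "United States": (39.76, -98.5),
}


def find_places(text, gazetteer):
    """Stand-in for NER: gazetteer names found in text, ordered by first appearance."""
    hits = []
    for name in gazetteer:
        match = re.search(r"\b" + re.escape(name) + r"\b", text)
        if match:
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


def geotag(text, gazetteer=GAZETTEER):
    """Map each place found in text to its latitude/longitude pair."""
    return {name: gazetteer[name] for name in find_places(text, gazetteer)}


sample = (
    "a geographically distributed network of United States proxy climate "
    "records was examined; the decrease in China summer insolation"
)
print(geotag(sample))
```

A real NER model also finds places that are written differently from the gazetteer entry, which exact string matching cannot do.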
  
- GeoTopicParser uses [[http://lucene.apache.org/|Apache Lucene]] and 
[[http://opennlp.apache.org/|Apache OpenNLP]] to provide its capabilities.
+ GeoTopicParser uses [[http://geonames.org/|Geonames.org]], 
[[http://lucene.apache.org/|Apache Lucene]] and 
[[http://opennlp.apache.org/|Apache OpenNLP]] to provide its capabilities.
  
  == Installing the Lucene Gazetteer ==
  
@@ -118, +118 @@

  
  It sure will! When you start Tika Server, make sure that the NER model file 
and the custom MIME type are on your classpath, and that 
lucene-geo-gazetteer is on the `$PATH` where Tika Server is started. You 
can then post all the .geot files that you'd like, and Tika Server will call 
the GeoTopicParser to provide you with location information.
  
+ First, start up the Tika server with your NER model and .geot MIME type 
definition on the classpath:
+ 
+ {{{
+ java -classpath $HOME/src/geotopicparser-utils/models/polar:$HOME/src/geotopicparser-utils/mime:tika-server/target/tika-server-1.9-SNAPSHOT.jar org.apache.tika.server.TikaServerCli
+ }}}
+ 
+ Then, try calling the `/rmeta` service to get the returned metadata:
+ 
+ {{{
+ curl -T $HOME/src/geotopicparser-utils/geotopics/polar.geot -H "Content-Disposition: attachment; filename=polar.geot" http://localhost:9998/rmeta
+ }}}
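
For scripted use, the same upload can be sketched in Python with only the standard library. This is a minimal sketch, assuming the server listens on `localhost:9998` as in the curl call. The helper names `build_rmeta_request` and `fetch_rmeta` are hypothetical and not part of Tika. Like `curl -T`, the request uses HTTP PUT.

```python
import json
import os
import urllib.request


def build_rmeta_request(path, base_url="http://localhost:9998"):
    """Build a PUT request that uploads the file at path to the /rmeta service."""
    with open(path, "rb") as handle:
        body = handle.read()
    filename = os.path.basename(path)
    return urllib.request.Request(
        base_url + "/rmeta",
        data=body,
        method="PUT",  # curl -T uploads with PUT as well
        headers={"Content-Disposition": "attachment; filename=" + filename},
    )


def fetch_rmeta(path, base_url="http://localhost:9998"):
    """Send the file to a running Tika Server and return the decoded JSON list."""
    request = build_rmeta_request(path, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a running Tika Server, so it is left commented out):
# metadata = fetch_rmeta(os.path.expanduser(
#     "~/src/geotopicparser-utils/geotopics/polar.geot"))
```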
+ 
+ Then look for the server to return the following. That's it!
+ 
+ {{{
+ [
+    {
+       "Content-Encoding":"ISO-8859-1",
+       "Content-Type":"text/plain; charset\u003dISO-8859-1",
+       "X-Parsed-By":[
+          "org.apache.tika.parser.DefaultParser",
+          "org.apache.tika.parser.txt.TXTParser"
+       ],
+       "X-TIKA:content":"\n\n\n\n\n\n\n\nThe millennial-scale cooling trend 
that followed the HTM coincides with the\ndecrease in China summer insolation 
driven by slow changesinEarth\u0027s\norbit. Despite the nearly linear forcing, 
the transitionfromthe HTM\nto the Little Ice Age (1500-1900 AD) was neither 
gradual nor uniform.\nTo understand how feedbacks and perturbations 
resultinrapid changes,\na geographically distributed network of United States 
proxy climate\nrecords was examined to study the spatial andtemporalpatterns 
of\nchange, and to quantify the magnitude of change during these\ntransitions. 
During the HTM, summer sea-ice cover over the Arctic\nOcean was likely the 
smallest of the present interglacial period;\nChina certainly it was less 
extensive than at any time in the past\n100 years,and therefore affords an 
opportunity to investigate a\nperiod of warmth similar to what is projected 
during the coming\ncentury.\n\n",
+       "X-TIKA:parse_time_millis":"106"
+    }
+ ]
+ }}}
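
To inspect a response like this programmatically, the JSON list can be reduced to a few fields. This is a minimal sketch: `summarize_rmeta` is a hypothetical helper, and the `geo_parser_ran` check assumes that the parser reports itself in `X-Parsed-By` under a class name ending in `GeoTopicParser`. The sample record is abbreviated from the output above.

```python
import json


def summarize_rmeta(response_text):
    """Return one small summary dict per metadata record in an /rmeta response."""
    summaries = []
    for record in json.loads(response_text):
        parsers = record.get("X-Parsed-By", [])
        if isinstance(parsers, str):
            # Treat a bare string as a one-element list, to be safe.
            parsers = [parsers]
        summaries.append({
            "content_type": record.get("Content-Type"),
            "parsers": parsers,
            # Assumption: the geo parser's class name ends in "GeoTopicParser".
            "geo_parser_ran": any(p.endswith("GeoTopicParser") for p in parsers),
            "content_chars": len(record.get("X-TIKA:content", "")),
        })
    return summaries


# Abbreviated version of the record shown above.
sample = json.dumps([{
    "Content-Type": "text/plain; charset=ISO-8859-1",
    "X-Parsed-By": [
        "org.apache.tika.parser.DefaultParser",
        "org.apache.tika.parser.txt.TXTParser",
    ],
    "X-TIKA:content": "The millennial-scale cooling trend",
}])
print(summarize_rmeta(sample))
```

For the response shown above the flag comes out false, because only `DefaultParser` and `TXTParser` are listed in `X-Parsed-By`.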
+ 
