Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.

The "cTAKESParser" page has been changed by ChrisMattmann:
https://wiki.apache.org/tika/cTAKESParser?action=diff&rev1=8&rev2=9

  
  = Setting up the Tika Config file =
  
- You will need a custom Tika configuration file for the parser. You can find 
one 
[[here|https://raw.githubusercontent.com/chrismattmann/ctakesparser-utils/master/config/tika-config.xml]].
 The reason is that since cTAKESParser decorates AutoDetectParser, in reality, 
cTAKESParser can handle *any* kind of file type that it can. But you have to 
make cTAKESParser intercept the mime types you want it to extract biomedical 
information from. So if you want Tika and its cTAKESParser to etxract 
biomedical information from application/pdf files, you will need this custom 
config and to add application/pdf as a mime that the parser can deal with. The 
default config provided looks like:
+ You will need a custom Tika configuration file for the parser. You can find 
one 
[[https://raw.githubusercontent.com/chrismattmann/ctakesparser-utils/master/config/tika-config.xml|here]].
 The reason is that cTAKESParser decorates AutoDetectParser, so in principle it 
can handle *any* file type that AutoDetectParser can. However, you have to 
tell cTAKESParser which mime types to intercept for biomedical information 
extraction. For example, to have Tika and its cTAKESParser extract biomedical 
information from application/pdf files, you need this custom config with 
application/pdf added as a mime type that the parser handles. The default 
config provided looks like:
  
  {{{
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
