[Tika Wiki] Update of "TikaInHadoop" by TimothyAllison

Apache Wiki Tue, 07 Apr 2015 06:29:18 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change 
notification.


The "TikaInHadoop" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/TikaInHadoop?action=diff&rev1=1&rev2=2

  = Running Tika in Hadoop =
  On very rare occasions, Tika can fail catastrophically, either by hanging 
indefinitely or by running out of memory.  Other characteristics of Tika may 
also make it worthwhile for developers to share notes on running it at scale.  
This page is intended to gather lessons learned and offer pointers for running 
Tika in the Hadoop framework.
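
One common way to guard against the hang case is to run each parse on a worker thread and stop waiting after a deadline. The following is only a minimal sketch under stated assumptions: `TimeoutGuard` and `runWithTimeout` are hypothetical names that are not part of Tika or Hadoop, and the lambda in `main` is a stand-in for a real parse call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not a Tika or Hadoop API): runs one task with a deadline.
final class TimeoutGuard {

    // Runs the task on a daemon worker thread and waits at most timeoutMillis.
    // Throws TimeoutException if the task has not finished by then.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-worker");
            t.setDaemon(true); // a stuck worker will not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupts the worker thread
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real parse call; replace the lambda with the actual work.
        String text = runWithTimeout(() -> "parsed text", 1000);
        System.out.println(text);
    }
}
```

A thread-level timeout has limits: a worker that ignores interruption keeps running in the background, and it offers no protection against out-of-memory errors, which affect the whole JVM. Running the parse in a separate process is a more robust, if heavier, alternative.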
  
+ = Useful Parameters =
+ 
+ = Lessons Learned =
+ 
+ = Links =
+  * William Palmer's blog post on running Tika in Hadoop -- [[http://openpreservation.org/knowledge/blogs/2014/03/21/tika-ride-characterising-web-content-nanite/|Tika to Ride]]
+ 
+ = Frameworks =
+  * Julien Nioche's [[https://github.com/DigitalPebble/behemoth|Behemoth]]
+  * William Palmer's [[https://github.com/openpreserve/nanite/tree/master/nanite-hadoop|Nanite]]
+
