RE: Testing an ingest framework that uses Apache Tika
Thank you, Chris, Luís and Konstantin!

-----Original Message-----
From: Mattmann, Chris A (3010) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Thursday, February 16, 2017 10:18 AM
To: dev@tika.apache.org; lfcnas...@gmail.com
Cc: solr-u...@lucene.apache.org
Subject: Re: Testing an ingest framework that uses Apache Tika

++1 awesome job
Re: Testing an ingest framework that uses Apache Tika
++1 awesome job

++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, NSF & Open Source Projects Formulation and Development Offices (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-503
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++

On 2/16/17, 5:28 AM, "Luís Filipe Nassif" wrote:
> Excellent, Tim! Thank you for all your great work on Apache Tika!
Re: Testing an ingest framework that uses Apache Tika
Excellent, Tim! Thank you for all your great work on Apache Tika!

2017-02-16 11:23 GMT-02:00 Konstantin Gribov:
> Tim,
>
> it's an awesome feature for downstream projects' integration tests. Thanks
> for implementing it!
Re: Testing an ingest framework that uses Apache Tika
Tim,

it's an awesome feature for downstream projects' integration tests. Thanks for implementing it!

On Thu, 16 Feb 2017 at 16:17, Allison, Timothy B. wrote:
> I finally got around to documenting Apache Tika's MockParser[1].
> [...]
> [1] https://wiki.apache.org/tika/MockParser

--
Best regards,
Konstantin Gribov
Testing an ingest framework that uses Apache Tika
All,

I finally got around to documenting Apache Tika's MockParser[1]. As of Tika 1.15 (unreleased), add tika-core-tests.jar to your class path, and you can simulate:

1. Regular catchable exceptions
2. OOMs
3. Permanent hangs

This will allow you to determine if your ingest framework is robust against these issues.

As always, we fix Tika when we can, but if history is any indicator, you'll want to make sure your ingest code can handle these issues if you are handling millions/billions of files from the wild.

Cheers,

Tim

[1] https://wiki.apache.org/tika/MockParser
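
For readers wondering what "robust against these issues" might look like on the ingest side, here is a minimal sketch in plain Java. It does not use Tika or MockParser at all: the class IngestGuard, the Outcome enum and the ingest method are hypothetical names invented for this illustration, and the Callable simply stands in for whatever parse call your framework makes. The three failure modes from the list above are simulated by throwing an IOException, throwing an OutOfMemoryError by hand (no heap is actually exhausted), and sleeping indefinitely.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical harness, not part of Tika: the Callable stands in for a parse call.
class IngestGuard {

    enum Outcome { OK, PARSE_EXCEPTION, OUT_OF_MEMORY, TIMED_OUT, INTERRUPTED }

    // Runs one parse on a daemon worker thread and classifies how it ended.
    static Outcome ingest(Callable<String> parse, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<String> result = executor.submit(parse);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.OK;
        } catch (TimeoutException e) {
            // Only helps if the parse responds to interruption.
            result.cancel(true);
            return Outcome.TIMED_OUT;
        } catch (ExecutionException e) {
            // Anything thrown by the parse, including Errors, arrives as the cause.
            return (e.getCause() instanceof OutOfMemoryError)
                    ? Outcome.OUT_OF_MEMORY
                    : Outcome.PARSE_EXCEPTION;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            result.cancel(true);
            return Outcome.INTERRUPTED;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // 1. Normal parse, 2. catchable exception, 3. simulated OOM, 4. simulated hang.
        System.out.println(ingest(() -> "text", 1000));
        System.out.println(ingest(() -> { throw new java.io.IOException("simulated"); }, 1000));
        System.out.println(ingest(() -> { throw new OutOfMemoryError("simulated"); }, 1000));
        System.out.println(ingest(() -> { Thread.sleep(Long.MAX_VALUE); return ""; }, 200));
    }
}
```

Two caveats on this sketch. A thread-level timeout only frees the worker if the hung code responds to interruption, and a JVM that has hit a real OutOfMemoryError may not be in a reliable state afterwards. For files from the wild, running the parser in a separate process that can be killed and restarted is generally the more dependable design; the in-process version here is only meant to show the three outcomes an ingest framework has to distinguish.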