Re: ReviewBoard instance

2010-11-10 Thread Chris Mattmann
+1 Sent from my Verizon Wireless BlackBerry -Original Message- From: Jukka Zitting jukka.zitt...@gmail.com Date: Wed, 10 Nov 2010 11:23:10 To: dev@tika.apache.org dev@tika.apache.org Reply-To: dev@tika.apache.org dev@tika.apache.org Subject: Re: ReviewBoard instance Hi, On Tue, Oct 26,

[VOTE] Apache TIka 1.4 Release Candidate #1

2013-06-15 Thread Chris Mattmann
Hi Guys, A candidate for the Tika 1.4 release is available at: http://people.apache.org/~mattmann/apache-tika-1.4/rc1/ The release candidate is a zip archive of the sources in: http://svn.apache.org/repos/asf/tika/tags/1.4/ The SHA1 checksum of the archive is

Re: [VOTE] Apache TIka 1.4 Release Candidate #1

2013-06-16 Thread Chris Mattmann
Hey Guys, I'll respin, no problem. Mike, I'm going to push your update to the 1.4 branch too, so that I can simply roll the release from there. RC #2 coming shortly. Cheers, Chris -Original Message- From: Dave Meikle loo...@gmail.com Reply-To: dev@tika.apache.org

Re: [VOTE] Apache TIka 1.4 Release Candidate #1

2013-06-16 Thread Chris Mattmann
Thanks Uwe, that would be very welcomed! Cheers, Chris -Original Message- From: Uwe Schindler u...@thetaphi.de Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Sunday, June 16, 2013 7:13 AM To: dev@tika.apache.org dev@tika.apache.org Subject: RE: [VOTE] Apache TIka 1.4 Release

[VOTE] Apache TIka 1.4 Release Candidate #2

2013-06-16 Thread Chris Mattmann
Hi Guys, A second candidate for the Tika 1.4 release is available at: http://people.apache.org/~mattmann/apache-tika-1.4/rc2/ The release candidate is a zip archive of the sources in: http://svn.apache.org/repos/asf/tika/tags/1.4-rc2/ The SHA1 checksum of the archive is

Re: [VOTE] Apache TIka 1.4 Release Candidate #1

2013-06-16 Thread Chris Mattmann
Date: Sunday, June 16, 2013 10:25 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: [VOTE] Apache TIka 1.4 Release Candidate #1 Am 16.06.2013 05:52, schrieb Chris Mattmann: Hi Guys, A candidate for the Tika 1.4 release is available at: http://people.apache.org/~mattmann/apache-tika

Re: [VOTE] Apache TIka 1.4 Release Candidate #2

2013-06-17 Thread Chris Mattmann
Hey Guys, Just FYI on this, the VOTE is still going if folks have a chance to review, would appreciate it. So far, we've got 1 binding +1. :) Cheers, Chris -Original Message- From: jpluser mattm...@apache.org Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Sunday, June 16,

[RESULT] [VOTE] Apache TIka 1.4 Release Candidate #2

2013-06-30 Thread Chris Mattmann
Hi Folks, This VOTE has passed with the following tallies: +1 Chris Mattmann* Oleg Tikhonov* Markus Jelsma Mike McCandless* Joe Wicentowski Dave Meikle* * - indicates PMC Thanks to all for VOTE'ing and I'll now push the release out to the mirrors, update the website, and send the ANNOUNCE

[ANNOUNCE] Apache Tika 1.4 Released

2013-07-02 Thread Chris Mattmann
not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Chris

BigData press for Tika

2013-10-09 Thread Chris Mattmann
Hey Guys, The Simons Foundation has an article series on BigData right now and I'm quoted in 2 of the features that are part of the spread. I also made mention of Apache Tika in the article, see: https://www.simonsfoundation.org/quanta/20131009-the-future-fabric-of-data-analysis/ Go Tika!! Cheers,

Re: [DISCUSS] Integrate Apache Any23 into Apache Tika

2013-10-19 Thread Chris Mattmann
Lewis, I for one am supportive of this measure somehow. The exact mechanism by which we can do this is something that could involve e.g., taking you, or anyone else from the Any23 community (at this point I think it's really just you by my own accord lurking on the lists over there) that is

Re: Having Problem in Word Count and Language Detaction

2013-10-26 Thread Chris Mattmann
Hi Animesh, Please detail your issue here on dev@tika.apache.org and I'm sure someone can help. Cheers, Chris -Original Message- From: Animesh Kumar animesh.sa...@gmail.com Date: Wednesday, October 23, 2013 9:15 PM To: dev-ow...@tika.apache.org dev-ow...@tika.apache.org Subject: Fwd:

Re: Switch to JUnit 4.x?

2013-12-17 Thread Chris Mattmann
+1 from me. Cheers, Chris -Original Message- From: David Meikle loo...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Tuesday, December 17, 2013 2:03 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: Switch to JUnit 4.x? Hi, On 14 Dec 2013, at 23:39, Ken

Re: Tika 1.5 release ?

2013-12-19 Thread Chris Mattmann
Hi Hong-Thai, Thanks for your question. It's probably time for a release. I will have the cycles next week or the week after to spin one up if no one beats me to it. Thanks! Cheers, Chris -Original Message- From: Hong-Thai Nguyen hong-thai.ngu...@polyspot.com Reply-To:

Re: [DISCUSS] Prepare Release 1.5?

2014-01-09 Thread Chris Mattmann
Hey Dave, I kind of got bogged down and haven't had time to release. If someone else does have time and wants to pick this up, +1 for it! Cheers, Chris -Original Message- From: David Meikle loo...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Thursday, January 9,

Submission to ApacheCon on Tika

2014-01-30 Thread Chris Mattmann
Hey Guys, I submitted the below talk on Apache Tika, Nutch and Solr to ApacheCon NA 2014: Real Data Science: Exploring the FBI's Vault dataset with Apache Tika, Nutch and Solr Event ApacheCon North America Submission Type Lightning Talk Category Developer Biography Chris Mattmann has a wealth

Re: [VOTE] Apache Tika 1.5 RC1

2014-02-04 Thread Chris Mattmann
-1.5-rc1] mattmann% gpg --import KEYS gpg: key A355A63E: Jukka Zitting ju...@apache.org not changed gpg: key B876884A: Chris Mattmann (CODE SIGNING KEY) mattm...@apache.org not changed gpg: key 9740DD55: David Meikle (CODE SIGNING KEY) dmei...@apache.org not changed gpg: Total number processed: 3

Re: [VOTE] Apache Tika 1.5 RC1

2014-02-04 Thread Chris Mattmann
/apache-tika-1.5-rc1] mattmann% gpg --import tika.asc gpg: key B876884A: Chris Mattmann (CODE SIGNING KEY) mattm...@apache.org not changed gpg: key A355A63E: Jukka Zitting ju...@apache.org 7 new signatures gpg: key 8A26D9A6: public key Jukka Zitting jukka.zitt...@gmail.com imported gpg: key 42CFAE07

Re: [VOTE] Apache Tika 1.5 RC2

2014-02-10 Thread Chris Mattmann
Hi Dave, +1 from me, SIGS, checksum check out: [chipotle:~/tmp/apache-tika-1.5-rc2] mattmann% $HOME/bin/stage_apache_rc tika 1.5-src http://people.apache.org/~dmeikle/tika-1.5-rc2/ % Total % Received % Xferd Average Speed Time Time Time Current

Re: CSCI ASSIGNMENT QUESTION

2014-02-19 Thread Chris Mattmann
it to automatically call the PDF parser by calling it directly from your program or Java code and then bypass that step. HTH! Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Mohamed Mustafa Rafik Khimani khim...@usc.edu Date

Re: CSCI ASSIGNMENT QUESTION

2014-03-01 Thread Chris Mattmann
1, 2014 9:43 PM To: Chris Mattmann chris.a.mattm...@jpl.nasa.gov Subject: Re: CSCI ASSIGNMENT QUESTION Hello professor Mattmann, Thank you for replying to my doubts. I realized there was a small mistake in the above code. I was updating the same pdf file count for every keyword

Re: Submission to ApacheCon on Tika

2014-03-02 Thread Chris Mattmann
Thanks Jukka! My Tika talk had to be moved to Wednesday since I wasn't sure I would be there at ApacheCon the whole time, and co-locating my talks around the same day was advantageous, so I asked Rich to move me. Annie's talk was originally I believe set for Wed too, however I am not sure if she

Re: Use of Levenshtein distance to find similar words

2014-03-16 Thread Chris Mattmann
Dear Margi, Great question and thanks for posting this to the list! :) You may also want to split your extracted text not just by \n but also look to split by perhaps to canonical-ize the words. You may even think of an approach for creating words (recall we discussed a method in class for
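The advice in this thread (split the extracted text on more than just \n, canonicalize the words, then compare them by Levenshtein distance) can be sketched as follows. This is a minimal illustration in Python, not code from the thread: the function names, the choice to split on any non-word character, the lowercasing step, and the max_distance threshold are all assumptions made for the example.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def tokenize(text: str) -> list[str]:
    """Assumed canonicalization: split on non-word characters, lowercase."""
    return [word.lower() for word in re.split(r"\W+", text) if word]


def similar_words(text: str, target: str, max_distance: int = 2) -> list[str]:
    """Distinct words in text within max_distance edits of target, sorted."""
    target = target.lower()
    return sorted({word for word in tokenize(text)
                   if levenshtein(word, target) <= max_distance})
```

In practice the text passed to similar_words would be whatever Tika extracted from the document; how aggressively to canonicalize (stemming, stripping punctuation, and so on) is a design choice the thread leaves open.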

Re: JAXRS, endpoints and a / welcome page - any ideas why it's broken?

2014-05-16 Thread Chris Mattmann
Hi Guys, Some thoughts here: -Original Message- From: Nick Burch apa...@gagravarr.org Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Wednesday, May 14, 2014 6:22 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: JAXRS, endpoints and a / welcome page - any ideas why

Re: Review Request 22219: Add Translation to Tika

2014-06-04 Thread Chris Mattmann
should be dynamically loaded via JavaSPI trunk/tika-core/src/main/java/org/apache/tika/language/MicrosoftTranslator.java https://reviews.apache.org/r/22219/#comment79303 use Eclipse or IdeaJ to auto put javadoc in for interfaces? - Chris Mattmann On June 4, 2014, 7:17 p.m., Tyler

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-04 Thread Chris Mattmann
-mail. To reply, visit: https://reviews.apache.org/r/22246/ --- (Updated June 4, 2014, 10:23 p.m.) Review request for tika and Chris Mattmann. Repository: tika Description --- This is a new parser for Matlab .mat files

Re: Review Request 22219: Add Translation to Tika

2014-06-05 Thread Chris Mattmann
/DefaultTranslator.java https://reviews.apache.org/r/22219/#comment79395 Need Apache license here. I will add it. - Chris Mattmann On June 5, 2014, 4:19 p.m., Tyler Palsulich wrote: --- This is an automatically generated e-mail

Re: Review Request 22219: Add Translation to Tika

2014-06-05 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22219/#review44842 --- Ship it! Ship It! - Chris Mattmann On June 5, 2014, 4:19 p.m

Re: Review Request 22219: Add Translation to Tika

2014-06-05 Thread Chris Mattmann
: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22219/ --- (Updated June 5, 2014, 4:19 p.m.) Review request for tika and Chris Mattmann. Repository: tika Description --- This patch adds

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-09 Thread Chris Mattmann
://reviews.apache.org/r/22246/#comment79840 agreed, this seems to be extraneous. I would remove this part. trunk/tika-parsers/pom.xml https://reviews.apache.org/r/22246/#comment79842 seems to be extraneous. - Chris Mattmann On June 9, 2014, 8:11 p.m., Ann Burgess wrote

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-09 Thread Chris Mattmann
the dependencies and then I think this is good to commit. - Chris Mattmann On June 9, 2014, 8:11 p.m., Ann Burgess wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22246

Re: Tika Language Detection

2014-06-15 Thread Chris Mattmann
Dear Omid, Looks like you got it added correctly :) Thanks for your question and for your Github pull request. I've filed a JIRA issue for you: https://issues.apache.org/jira/browse/TIKA-1337 I will get your patch into the sources and I sincerely appreciate it. In the future, please feel free

Review Request 22761: Create a Tika Translator implementation that uses JoshuaDecoder

2014-06-18 Thread Chris Mattmann
using http://joshua-decoder.org/data/fisher-callhome-corpus/ My dataset isn't perfect, but it can do basic translations. Also wrote a unit test, part of the patch. Thanks, Chris Mattmann

Re: Review Request 22761: Create a Tika Translator implementation that uses JoshuaDecoder

2014-06-18 Thread Chris Mattmann
corpus built using http://joshua-decoder.org/data/fisher-callhome-corpus/ My dataset isn't perfect, but it can do basic translations. Also wrote a unit test, part of the patch. Thanks, Chris Mattmann

Re: Review Request 22892: New parser for ENVI header files

2014-06-23 Thread Chris Mattmann
this. - Chris Mattmann On June 23, 2014, 9:43 p.m., Ann Burgess wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22892

Re: Review Request 22892: New parser for ENVI header files

2014-06-23 Thread Chris Mattmann
/EnviHeaderParser.java https://reviews.apache.org/r/22892/#comment81848 org.apache.tika.parser.envi trunk/tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java https://reviews.apache.org/r/22892/#comment81849 org.apache.tika.parser.envi - Chris Mattmann On June

Re: Review Request 22892: New parser for ENVI header files

2014-06-25 Thread Chris Mattmann
/EnviHeaderParser.java https://reviews.apache.org/r/22892/#comment82136 Good comment Nick. I committed the version of this patch without this improvement, and we can make this improvement later on with a new issue. - Chris Mattmann On June 23, 2014, 11:14 p.m., Ann Burgess wrote

Review Request 23299: Add GoogleTranslate implementation of Translation API

2014-07-06 Thread Chris Mattmann
/ Testing --- Tested with my API key, works great. Also tests fail silently using the isAvailable API if the dummy API key is provided (by default). Thanks, Chris Mattmann

Re: [DISCUSS] 1.6 Release?

2014-07-16 Thread Chris Mattmann
wrote: Thanks Matthias, I will take a look at them before rolling the 1.6 RC. Got to finish up some patches, etc., but thanks for your interest and I will be in touch soon! ++ Chris Mattmann, Ph.D. Chief Architect

Re: Review Request 23562: Add a CachedTranslator implementation

2014-07-16 Thread Chris Mattmann
/translate/CachedTranslator.java https://reviews.apache.org/r/23562/#comment84193 needs apache license headers trunk/tika-translate/src/main/resources/META-INF/services/org.apache.tika.language.translate.Translator https://reviews.apache.org/r/23562/#comment84194 Nice catch! - Chris Mattmann

Re: Review Request 23562: Add a CachedTranslator implementation

2014-07-17 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23562/#review48019 --- Ship it! Ship It! - Chris Mattmann On July 17, 2014, 4 p.m

Review Request 24051: MicrosoftTranslator setClient and setId NPE

2014-07-29 Thread Chris Mattmann
] mattmann% Thanks, Chris Mattmann

Re: NetCDF to Maven Central

2014-08-05 Thread Chris Mattmann
, 2014 12:07 PM To: John Caron ca...@ucar.edu, Chris Mattmann chris.mattm...@gmail.com Cc: John Caron ca...@unidata.ucar.edu, support-net...@unidata.ucar.edu support-net...@unidata.ucar.edu Subject: Re: NetCDF to Maven Central Thanks for the info John, I'll chat with the Tika-dev team about what

Re: Review Request 24506: Create an ExternalTranslator and a MosesTranslator

2014-08-08 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24506/#review50050 --- Ship it! Ship It! - Chris Mattmann On Aug. 8, 2014, 5:40 p.m

Re: Review Request 22402: Tika OCR

2014-08-10 Thread Chris Mattmann
, 2014, 10:18 p.m.) Review request for tika and Chris Mattmann. Repository: tika Description --- Integrating Tesseract OCR with Tika through a new Parser. See TIKA-93. Diffs - trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRConfig.java

FW: New answer to What are the best algorithms for classifying the language of a text snippet? Why?

2014-08-14 Thread Chris Mattmann
This seems like a relevant Quora question.. -Original Message- From: Quora nore...@quora.com Date: Thursday, August 14, 2014 7:43 AM To: Chris Mattmann chris.mattm...@gmail.com Subject: New answer to What are the best algorithms for classifying the language of a text snippet? Why

Re: [Tika] Embedded images in PDF documents

2014-08-14 Thread Chris Mattmann
send a blank email to the dev list, dev-subscr...@tika.apache.org and follow the instructions from there. Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Damiano Porta damianopo...@gmail.com Date: Thursday, August 14, 2014 7:30 AM

Re: [DISCUSS] Apache Tika 1.6 RC #2..today?

2014-08-19 Thread Chris Mattmann
OK, will roll the RC in a day. Cheers, Chris -Original Message- From: Nick Burch apa...@gagravarr.org Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Tuesday, August 19, 2014 7:41 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: [DISCUSS] Apache Tika 1.6 RC

[ANNOUNCE] Apache Tika 1.6 release

2014-09-05 Thread Chris Mattmann
not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Chris

Re: Review Request 22402: Tika OCR

2014-09-19 Thread Chris Mattmann
On Sept. 19, 2014, 6:14 a.m., Chris Mattmann wrote: Ship It! Ready to go @tpalsulich. Tested on my machine, looks great! Thanks everyone! - Chris --- This is an automatically generated e-mail. To reply, visit: https

Re: Tika at ApacheCon Europe - 2 months time!

2014-09-26 Thread Chris Mattmann
That is awesome guys. Tons of great stuff to hack on! Chris Mattmann chris.mattm...@gmail.com -Original Message- From: David Meikle loo...@gmail.com Reply-To: u...@tika.apache.org Date: Thursday, September 25, 2014 10:01 AM To: dev@tika.apache.org Cc: u

Review Request 26542: Tika GDAL parser

2014-10-10 Thread Chris Mattmann
/diff/ Testing --- Tested via unit tests, and ran locally. Thanks, Chris Mattmann

Re: Review Request 26542: Tika GDAL parser

2014-10-11 Thread Chris Mattmann
://reviews.apache.org/r/26542/diff/ Testing --- Tested via unit tests, and ran locally. Thanks, Chris Mattmann

Review Request 27562: GRIB Parser for TIKA

2014-11-03 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27562/ --- Review request for tika, Lewis McGibbney, Chris Mattmann, Tyler Palsulich

Re: Review Request 27562: GRIB Parser for TIKA

2014-11-03 Thread Chris Mattmann
, Chris Mattmann, Tyler Palsulich, and Vineet Ghatge Hemantkumar. Bugs: tika-1423 https://issues.apache.org/jira/browse/tika-1423 Repository: tika Description --- GRIB Parser Patch Diffs - ./trunk/tika-parsers/pom.xml 1636144 ./trunk/tika-parsers/src/main/java/org/apache

Re: [mil-oss] Looking for a pure Java VMF parser

2015-03-21 Thread Chris Mattmann
Hi Alex, [CC to dev@tika.a.o] You may want to look at Apache Tika, which is like a “digital babel fish” parser and MIME type identification and language detection system for 1200+ file formats. http://tika.apache.org/ Is VMF this “VMF” format?

Re: Review Request 31758: TIKA-1330: tika batch code

2015-03-07 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31758/#review75632 --- Ship it! Ship It! - Chris Mattmann On March 5, 2015, 3:07 a.m

Re: Review Request 32291: ISATab parsers (preliminary version)

2015-03-24 Thread Chris Mattmann
/ISATabInvestigationParser.java https://reviews.apache.org/r/32291/#comment125670 would be good to note here that the Parser only populates metadata per Tim A.'s comments. - Chris Mattmann On March 23, 2015, 5:04 p.m., Giuseppe Totaro wrote

Re: Review Request 32291: ISATab parsers (preliminary version)

2015-03-28 Thread Chris Mattmann
! Thanks Giuseppe! - Chris Mattmann On March 23, 2015, 5:04 p.m., Giuseppe Totaro wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32291

Re: Review Request 32291: ISATab parsers (preliminary version)

2015-03-28 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32291/#review78158 --- Ship it! Ship It! - Chris Mattmann On March 23, 2015, 5:04 p.m

Re: Review Request 32255: File type description to HDFParser

2015-03-22 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32255/#review77370 --- Ship it! Ship It! - Chris Mattmann On March 19, 2015, 7:45 p.m

Re: Review Request 32260: Add file type description to NetCDF parser

2015-03-22 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32260/#review77371 --- Ship it! Ship It! - Chris Mattmann On March 19, 2015, 9:22 p.m

Re: Review Request 31758: TIKA-1330: tika batch code

2015-03-04 Thread Chris Mattmann
/apache/commons/io/FileUtils.html - Chris Mattmann On March 5, 2015, 3:07 a.m., Tim Allison wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31758

Re: [memex-jpl] this week action from luke

2015-04-23 Thread Chris Mattmann
Great work Luke and both of these changes make sense. Please send the pull request for that thank you! Great work Giuseppe! Go team! Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Luke hanson311...@gmail.com Date: Thursday

Re: [memex-jpl] this week action from luke

2015-04-22 Thread Chris Mattmann
Hi Luke, Actually I just meant go into tika-mimetypes.xml and change the magic offsets for application/xhtml+xml and see if that works. The code you changed below is actually how many bytes Tika will first download to do MIME checking. Cheers, Chris Chris Mattmann

Re: [memex-jpl] this week action from luke

2015-04-21 Thread Chris Mattmann
Thanks Luke. So I guess all I was asking was could you try it out. Thanks for the lesson in the RFC. Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Luke hanson311...@gmail.com Date: Wednesday, April 22, 2015 at 1:46 AM

[ANNOUNCE] Apache Tika 1.11 release

2015-10-25 Thread Chris Mattmann
g signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Chris Mattmann, on behalf of the Apache Tika community

[ANNOUNCE] Apache Tika 1.12 release

2016-02-15 Thread Chris Mattmann
g signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Chris Mattmann, on behalf of the Apache Tika community

Can't get Tensorflow REST recognizer to work

2016-08-14 Thread Chris Mattmann
Hi Devs, Here’s what I’m seeing in TIKA-1993 and 1508, which I would love to finish today. 1. Tensorflow python script works great. 2. Tensorflow REST service – Docker container works (had to upgrade Docker to latest) 3. Tensorflow REST service – Tika parser metadata works great. 4. Tensorflow

Re: Tika 1.14?

2016-08-12 Thread Chris Mattmann
1508, and 1680 are pending me/my review. I’ll get it done today. On 8/12/16, 4:24 AM, "Allison, Timothy B." wrote: >> I know it's been a little bit since we talked about 2.0. We had discussed holding off while some API changes that were under consideration. Has any

Re: TIKA-1164

2016-07-08 Thread Chris Mattmann
>A : "scatherine@gouv.mc" <scatherine@gouv.mc> >Cc : "dev@tika.apache.org" <dev@tika.apache.org> >Date : 04/07/2016 17:45 >Objet : Re: TIKA-1164 > > > > >Hi Samuel I am forwarding your email to

Re:

2017-01-20 Thread Chris Mattmann
Hi Graham, Did you email dev-subscr...@tika.apache.org and not get a reply? Cheers, Chris From: Graham Russell Date: Friday, January 20, 2017 at 11:42 AM To: "dev-ow...@tika.apache.org" Subject: Re: Hi I've tried to

Re: Move oldest release archive from lucene/tika to tika?

2017-02-16 Thread Chris Mattmann
see the argument that there may still be some automated scripts pulling down 0.7 somewhere. Do we have download stats per file available somewhere? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 16. feb. 2017 kl. 00.30 skrev Chris Mattmann

Re: Move oldest release archive from lucene/tika to tika?

2017-02-15 Thread Chris Mattmann
Hi Jan, Thanks for the message, but frankly I don’t think we should do this. The releases have been made and should be canonical, even if they were made from Lucene since Tika originated there. Many sites over many years have likely mirrored those URLs and I think this could screw that up. Is

Re: Tika on apache.org

2016-09-10 Thread Chris Mattmann
yayyy — Chris Mattmann chris.mattm...@gmail.com From: lewis john mcgibbney <lewi...@apache.org> Reply-To: <u...@tika.apache.org> Date: Saturday, September 10, 2016 at 10:47 AM To: <u...@tika.apache.org>, "dev@tika.apache.org" <dev@tika.apache.org> Sub

Re: Tika 1.14?

2016-09-22 Thread Chris Mattmann
Sounds great to me Tim. If you tell me when the tests are done, I’d be happy to RC a release! On 9/21/16, 11:31 AM, "Allison, Timothy B." wrote: All, PDFBox 2.0.3 is now integrated, I'm about to push the integration with POI-3.15. I have a few cleanup things

Re: Tika 2.0: Restructuring Tesseract

2016-08-25 Thread Chris Mattmann
I like simple – I vote for option 1 ☺ On 8/25/16, 9:06 PM, "Bob Paulin" wrote: Hi, I've been looking at some of the work recently with Tesseract and it's really cool to be able to get OCR combined with so many parsers. The bad part is it has really

[RESULT] [VOTE] Apache Tika 1.14 Release Candidate #1

2016-11-09 Thread Chris Mattmann
Hi, This VOTE has PASSED with the following tallies: +1 Chris Mattmann* Tim Allison* Bob Paulin* Konstantin Gribov* I’ll go ahead and push to the mirrors and update the website. Thanks to all who VOTEd! Cheers, Chris On 10/19/16, 11:48 AM, "Chris Mattmann" <mattm...@apac

[ANNOUNCE] Apache Tika 1.14 release

2016-11-09 Thread Chris Mattmann
not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org Cheers, Chris

Re: [ANNOUNCE] Welcome Luis Filipe Nassif and Thamme Gowda as Apache Tika PMC members and committers

2016-11-07 Thread Chris Mattmann
Masters' student at Univ of Southern California(USC), Los Angeles. I am a member of Information retrieval and data science group at USC and an intern at NASA Jet Propulsion Laboratory, Pasadena. I am so honored to be working with Dr. Chris Mattmann for the last 1 year, he had introduced Tika to me in a c

Re: [ANNOUNCE] Welcome Luis Filipe Nassif and Thamme Gowda as Apache Tika PMC members and committers

2016-11-07 Thread Chris Mattmann
Luis Welcome! That is so awesome! Thank you for using Tika in Law Enforcement. We are lucky to have you and let me know what I can do to help you further Tika. Cheers bro! From: Luís Filipe Nassif Reply-To: Date: Monday, November 7, 2016

[VOTE] Apache Tika 1.14 Release Candidate #1

2016-10-19 Thread Chris Mattmann
Hi Folks, A first candidate for the Tika 1.14 release is available at: https://dist.apache.org/repos/dist/dev/tika/ The release candidate is a zip archive of the sources in: https://git-wip-us.apache.org/repos/asf?p=tika.git;a=tree;hb=687d7706c9778e4f49f2834a07e5a9d99b23042b The SHA1

Re: Apache Tika - Visio V5

2016-11-03 Thread Chris Mattmann
Hi please send to dev@tika.apache.org From: Suganya Suganya Date: Thursday, November 3, 2016 at 3:42 AM To: "dev-ow...@tika.apache.org" Subject: Fwd: Apache Tika - Visio V5 Hi Team, Please let me know if there an

Re: Apache Tika issue review (TIKA-2190 & TIKA-2189)

2016-12-20 Thread Chris Mattmann
Moving dev-owner to BCC. I think you meant to send this to dev@tika.apache.org, so sending there :) From: Bipul Kumar Date: Tuesday, December 20, 2016 at 1:54 AM To: "dev-ow...@tika.apache.org" , "talli...@mitre.org"

Re: Tika 1.15?

2017-03-17 Thread Chris Mattmann
None, and happy to roll one by early-mid next week… On 3/17/17, 3:23 AM, "Allison, Timothy B." wrote: All, We're coming up on 6 months since our last release. Any objections to releasing Tika 1.15 shortly after POI 3.16-beta3 is out (early/mid April, I'd

Re: apache tikka is not working for scanned image documents

2017-04-04 Thread Chris Mattmann
Hi, Have you checked out: http://wiki.apache.org/tika/TikaOCR What specifically isn’t working? Moving this to dev@t.a.o: Cheers, Chris From: on behalf of Vadivelhan Date: Tuesday,

Re: tika-2.x-windows now running

2017-03-13 Thread Chris Mattmann
+1 this makes sense to me David! Great job On 3/13/17, 8:01 PM, "David Meikle" wrote: Hello All, The tika-2.x-windows is back up and running - whoop whoop! Turns out the Maven build configuration wasn't pointing to a settings.xml that had the

Re: [tika] branch master updated: TIKA-1988 -- allow for errors downloading models

2017-07-07 Thread Chris Mattmann
Sure On 7/7/17, 7:57 AM, "Allison, Timothy B." <talli...@mitre.org> wrote: I'll leave the moving to a new module to you? -Original Message- From: Chris Mattmann [mailto:mattm...@apache.org] Sent: Friday, July 7, 2017 10:32 AM To: dev@tika.

Re: [tika] branch master updated: TIKA-1988 -- allow for errors downloading models

2017-07-07 Thread Chris Mattmann
thy B." <talli...@mitre.org> wrote: Thank you, Chris! Now, how do I bulk move open 1.16->1.17 on JIRA? -Original Message----- From: Chris Mattmann [mailto:mattm...@apache.org] Sent: Friday, July 7, 2017 11:39 AM To: dev@tika.apache.org

Re: [VOTE] Release Apache Tika 1.16 Candidate #1

2017-07-08 Thread Chris Mattmann
+1 from me SIGS and CHECKSUMS look good. Thanks Tim! Cheers, Chris LMC-053601:apache-tika-1.16-rc1 mattmann$ for type in "" \-app \-eval \-server; do $HOME/bin/stage_apache_rc tika$type 1.16 https://dist.apache.org/repos/dist/dev/tika/; done % Total % Received % Xferd Average Speed

Re: [tika] branch master updated: TIKA-1988 -- allow for errors downloading models

2017-07-07 Thread Chris Mattmann
Great Tim thanks! On 7/7/17, 7:28 AM, "talli...@apache.org" wrote: This is an automated email from the ASF dual-hosted git repository. tallison pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The

Public datasets for Semantic Relationship Extraction

2017-06-28 Thread Chris Mattmann
Hi Team, Anything here that we can use in OpenNLP?

Re: Tika 1.15.1?

2017-06-28 Thread Chris Mattmann
Detector Parser ++++++ Chris Mattmann, Ph.D. Principal Data Scientist, Engineering Administrative Office (3010) Manager, NSF & Open Source Projects Formulation and Development Offices (8212) NASA Jet Propulsion Laboratory Pasadena, CA

Re: build failures

2017-07-06 Thread Chris Mattmann
Wow frustrating since the build prints BUILD_SUCCESS at the bottom… On 7/6/17, 7:09 AM, "Allison, Timothy B." wrote: https://issues.apache.org/jira/browse/INFRA-14520 Unless anyone has ideas? I also see this, but it looks like Jenkins has decided the

Re: Tika 1.15.1? -> 1.16

2017-07-05 Thread Chris Mattmann
ne by today, push 1.16 and we’ll put Age Detection in 1.17. ++++++ Chris Mattmann, Ph.D. Principal Data Scientist, Engineering Administrative Office (3010) Manager, NSF & Open Source Projects Formulation and

Re: Tika 1.15.1? -> 1.16

2017-07-07 Thread Chris Mattmann
10) [mailto:chris.a.mattm...@jpl.nasa.gov] > Sent: Monday, July 3, 2017 2:24 PM > To: dev@tika.apache.org > Subject: Re: Tika 1.15.1? -> 1.16 > > Hey Tim, if I don’t get it done by today, push 1.16 and we’ll put Age > Detection in 1.17. > > +

error in tika-bundle: tika-seraialization was removed?

2017-07-05 Thread Chris Mattmann
Anyone else seeing build errors in tika-bundle since tika-serialization was removed? I had to implement the following patch to fix it: LMC-053601:tika-bundle mattmann$ git diff ab4ea4724e52fb5718a9d8ea86af96425fb87c7b diff --git a/tika-bundle/pom.xml b/tika-bundle/pom.xml index

Re: Query related to Apache Tika dependencies

2017-08-08 Thread Chris Mattmann
From: Deepanshu Bhardwaj Date: Tuesday, August 8, 2017 at 2:53 AM To: "dev-ow...@tika.apache.org" Subject: Query related to Apache Tika dependencies Hi Team, I need one help. I need to know the list of libraries

Re: Apache Tika

2017-05-03 Thread Chris Mattmann
Hi Gorka, See: http://wiki.apache.org/tika/TikaOCR/ Is that what you’re looking for? If so, then you can simply enable OCR for Tika REST server, and then point your Tika Python at that. Does that help? Cheers, Chris From: gorka gallo Date: Wednesday,

OSGI expert help from Bob/others: TIKA-2016

2017-05-03 Thread Chris Mattmann
Hey Team, I’m trying to get TIKA-2016 sentiment analysis integrated and having a heck of a time fighting tika-bundle and OSGI of which I am not an expert. See: https://github.com/apache/tika/pull/169/files Basically what I’m saying: 1. The USC IRDS sentiment analysis parser has a bunch of
