Re: build failures

2017-07-06 Thread Chris Mattmann
Wow frustrating since the build prints BUILD_SUCCESS at the bottom… On 7/6/17, 7:09 AM, "Allison, Timothy B." wrote: https://issues.apache.org/jira/browse/INFRA-14520 Unless anyone has ideas? I also see this, but it looks like Jenkins has decided the

Re: Tika 1.15.1? -> 1.16

2017-07-05 Thread Chris Mattmann
ne by today, push 1.16 and we’ll put Age Detection in 1.17. ++++++ Chris Mattmann, Ph.D. Principal Data Scientist, Engineering Administrative Office (3010) Manager, NSF & Open Source Projects Formulation and

error in tika-bundle: tika-seraialization was removed?

2017-07-05 Thread Chris Mattmann
Anyone else seeing build errors in tika-bundle since tika-serialization was removed? I had to implement the following patch to fix it: LMC-053601:tika-bundle mattmann$ git diff ab4ea4724e52fb5718a9d8ea86af96425fb87c7b diff --git a/tika-bundle/pom.xml b/tika-bundle/pom.xml index

Re: Tika 1.15.1?

2017-06-28 Thread Chris Mattmann
Detector Parser ++++++ Chris Mattmann, Ph.D. Principal Data Scientist, Engineering Administrative Office (3010) Manager, NSF & Open Source Projects Formulation and Development Offices (8212) NASA Jet Propulsion Laboratory Pasadena, CA

Public datasets for Semantic Relationship Extraction

2017-06-28 Thread Chris Mattmann
Hi Team, Anything here that we can use in OpenNLP?

Re: Tika 1.15.1?

2017-06-16 Thread Chris Mattmann
Yep agreed on both Tim. If I don’t get it done this weekend, we’ll apply the approach you mention below. Great seeing you yesterday! On 6/16/17, 11:40 AM, "Allison, Timothy B." wrote: All, I'm hoping to wrap up the TEIParser next week (I'm thinking about

Re: Tika 1.16?

2017-06-02 Thread Chris Mattmann
FWIW, I believe there is some history on this but can’t find it right now. I thought we discussed X.Y.Z versioning at one point and decided against it. I have no objections to it however. Aloha, Chris On 6/2/17, 2:39 PM, "Tyler Bui-Palsulich" wrote: +1 to

Re: [VOTE] Release Apache Tika 1.15 Candidate #2

2017-05-28 Thread Chris Mattmann
eryozkin (Release Management) <sberyoz...@gmail.com>" imported gpg: key B876884A: "Chris Mattmann (CODE SIGNING KEY) <mattm...@apache.org>" not changed gpg: key 48BAEBF6: "Lewis John McGibbney (CODE SIGNING KEY) <lewi...@apache.org>" 1 new signature gpg: ke

Re: Tika 1.15

2017-05-22 Thread Chris Mattmann
Tim -Original Message- From: Allison, Timothy B. [mailto:talli...@mitre.org] Sent: Thursday, May 18, 2017 12:26 PM To: dev@tika.apache.org Subject: RE: Tika 1.15 +1 Thank you! -Original Message- From: Chris Mattmann [mailto:mat

Re: Tika 1.15

2017-05-18 Thread Chris Mattmann
the log message in TIKA-2359. Happy to change that message if there are any concerns/recommendations. Onward! Thank you! Cheers, Tim -Original Message- From: Chris Mattmann [mailto:mattm...@apache.org] Sent: Wednesday, May 1

Re: TikaInputStream parse the content and write to OutputStream

2017-05-17 Thread Chris Mattmann
[moving dev-owner@ to BCC] Forwarding to the Tika list. From: Prateek Agarwal Date: Tuesday, May 16, 2017 at 6:35 AM To: "dev-ow...@tika.apache.org" Subject: TikaInputStream parse the content and write to OutputStream Hi, We have

Re: Tika 1.15

2017-05-17 Thread Chris Mattmann
On May 1, 2017 3:59 PM, "Allison, Timothy B." <talli...@mitre.org> > wrote: > > > Sounds good. W00t! > > > > -Original Message- > > From: Chris Mattmann [mailto:mattm...@apache.org] > >

Re: Tika talk next week - help needed!

2017-05-16 Thread Chris Mattmann
Yep, literally take a look at the Tika wiki – there are examples aplenty and even screenshots. Further, if you look at the MEMEX site under our new publications section, there are a few examples (like the ICMR paper on forensics) that show it in action.

Re: TODO: Reminder: Thamme & Chris to document DL4J vision/Tika-DL

2017-05-14 Thread Chris Mattmann
review my PR, fix the stuff and get it merged (I realized it was not an easy one!) Glad to see it merged and I will write the reference and examples in the page you created. -Thamme On Tue, May 9, 2017 at 7:27 AM, Chris Mattmann <mattm...@apache.org> wrote: Hey Thamme, I created a page

TODO: Reminder: Thamme & Chris to document DL4J vision/Tika-DL

2017-05-09 Thread Chris Mattmann
Hey Thamme, I created a page here: https://wiki.apache.org/tika/TikaAndVisionDL4J We should at a minimum copy over the stuff from the GitHub issue on: 1. Getting DL4J installed (or using it) 2. Running Tika-DL and what to look for 3. Testing it out (maybe your 100 run benchmark on the lion) 4.

Welcome Thejan Wijesinghe GSoC 2017 student!

2017-05-04 Thread Chris Mattmann
I’d like to welcome Thejan Wijesinghe our Apache Tika GSoC 2017 student, working on Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types. Thamme and I will mentor him and welcome to the community! Cheers, Chris

Re: OSGI expert help from Bob/others: TIKA-2016

2017-05-03 Thread Chris Mattmann
kely not harmful. I'll need a bit longer to digest if something needs to change. Will submit a JIRA if something needs to change. - Bob On 5/3/2017 2:55 PM, Chris Mattmann wrote: > OK I fixed it. Not sure if I did it right but it at least passes the t

Re: OSGI expert help from Bob/others: TIKA-2016

2017-05-03 Thread Chris Mattmann
, "Chris Mattmann" <mattm...@apache.org> wrote: Hey Team, I’m trying to get TIKA-2016 sentiment analysis integrated and having a heck of a time fighting tika-bundle and OSGI of which I am not an expert. See: https://github.com/apache/tika/pull/169/files

OSGI expert help from Bob/others: TIKA-2016

2017-05-03 Thread Chris Mattmann
Hey Team, I’m trying to get TIKA-2016 sentiment analysis integrated and having a heck of a time fighting tika-bundle and OSGI of which I am not an expert. See: https://github.com/apache/tika/pull/169/files Basically what I’m saying: 1. The USC IRDS sentiment analysis parser has a bunch of

Re: Apache Tika

2017-05-03 Thread Chris Mattmann
Hi Gorka, See: http://wiki.apache.org/tika/TikaOCR/ Is that what you’re looking for? If so, then you can simply enable OCR for Tika REST server, and then point your Tika Python at that. Does that help? Cheers, Chris From: gorka gallo Date: Wednesday,

Re: Tika 1.15

2017-05-02 Thread Chris Mattmann
at's a bit easier to build visualizations for. Tyler On May 1, 2017 3:59 PM, "Allison, Timothy B." <talli...@mitre.org> wrote: > Sounds good. W00t! > > -----Original Message- > From: Chris Mattmann [mailto:mattm...@apache.org]

Re: Tika 1.15

2017-05-01 Thread Chris Mattmann
at's a bit easier to build visualizations for. Tyler On May 1, 2017 3:59 PM, "Allison, Timothy B." <talli...@mitre.org> wrote: > Sounds good. W00t! > > -----Original Message- > From: Chris Mattmann [mailto:mattm...@apache.org]

Re: Tika 1.15

2017-05-01 Thread Chris Mattmann
Thank you! ++++++ Chris Mattmann, Ph.D. Principal Data Scientist, Engineering Administrative Office (3010) Manager, NSF & Open Source Projects Formulation and Development Offices (8212) NASA Jet Propulsion Laboratory Pasadena, CA

Re: apache tikka is not working for scanned image documents

2017-04-04 Thread Chris Mattmann
Hi, Have you checked out: http://wiki.apache.org/tika/TikaOCR What specifically isn’t working? Moving this to dev@t.a.o: Cheers, Chris From: on behalf of Vadivelhan Date: Tuesday,

Re: Tika 1.15?

2017-03-17 Thread Chris Mattmann
None, and happy to roll one by early-mid next week… On 3/17/17, 3:23 AM, "Allison, Timothy B." wrote: All, We're coming up on 6 months since our last release. Any objections to releasing Tika 1.15 shortly after POI 3.16-beta3 is out (early/mid April, I'd

Re: tika-2.x-windows now running

2017-03-13 Thread Chris Mattmann
+1 this makes sense to me David! Great job On 3/13/17, 8:01 PM, "David Meikle" wrote: Hello All, The tika-2.x-windows is back up and running - whoop whoop! Turns out the Maven build configuration wasn't pointing to a settings.xml that had the

Re: Move oldest release archive from lucene/tika to tika?

2017-02-16 Thread Chris Mattmann
see the argument that there may still be some automated scripts pulling down 0.7 somewhere. Do we have download stats per file available somewhere? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 16. feb. 2017 kl. 00.30 skrev Chris Mattmann &

Re: Move oldest release archive from lucene/tika to tika?

2017-02-15 Thread Chris Mattmann
Hi Jan, Thanks for the message, but frankly I don’t think we should do this. The releases have been made and should be canonical, even if they were made from Lucene since Tika originated there. Many sites over many years have likely mirrored those URLs and I think this could screw that up. Is

Re:

2017-01-20 Thread Chris Mattmann
Hi Graham, Did you email dev-subscr...@tika.apache.org and not get a reply? Cheers, Chris From: Graham Russell Date: Friday, January 20, 2017 at 11:42 AM To: "dev-ow...@tika.apache.org" Subject: Re: Hi I've tried to

Re: Apache Tika issue review (TIKA-2190 & TIKA-2189)

2016-12-20 Thread Chris Mattmann
Moving dev-owner to BCC. I think you meant to send this to dev@tika.apache.org, so sending there :) From: Bipul Kumar Date: Tuesday, December 20, 2016 at 1:54 AM To: "dev-ow...@tika.apache.org" , "talli...@mitre.org"

[ANNOUNCE] Apache Tika 1.14 release

2016-11-09 Thread Chris Mattmann
not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org Cheers, Chris

[RESULT] [VOTE] Apache Tika 1.14 Release Candidate #1

2016-11-09 Thread Chris Mattmann
Hi, This VOTE has PASSED with the following tallies: +1 Chris Mattmann* Tim Allison* Bob Paulin* Konstantin Gribov* I’ll go ahead and push to the mirrors and update the website. Thanks to all who VOTEd! Cheers, Chris On 10/19/16, 11:48 AM, "Chris Mattmann" <mattm...@apac

Re: [ANNOUNCE] Welcome Luis Filipe Nassif and Thamme Gowda as Apache Tika PMC members and committers

2016-11-07 Thread Chris Mattmann
Masters' student at Univ of Southern California(USC), Los Angeles. I am a member of Information retrieval and data science group at USC and an intern at NASA Jet Propulsion Laboratory, Pasadena. I am so honored to be working with Dr. Chris Mattmann for the last 1 year, he had introduced Tika to me in a c

Re: [ANNOUNCE] Welcome Luis Filipe Nassif and Thamme Gowda as Apache Tika PMC members and committers

2016-11-07 Thread Chris Mattmann
Luis Welcome! That is so awesome! Thank you for using Tika in Law Enforcement. We are lucky to have you and let me know what I can do to help you further Tika. Cheers bro! From: Luís Filipe Nassif Reply-To: Date: Monday, November 7, 2016

Re: Apache Tika - Visio V5

2016-11-03 Thread Chris Mattmann
Hi please send to dev@tika.apache.org From: Suganya Suganya Date: Thursday, November 3, 2016 at 3:42 AM To: "dev-ow...@tika.apache.org" Subject: Fwd: Apache Tika - Visio V5 Hi Team, Please let me know if there an

[VOTE] Apache Tika 1.14 Release Candidate #1

2016-10-19 Thread Chris Mattmann
Hi Folks, A first candidate for the Tika 1.14 release is available at: https://dist.apache.org/repos/dist/dev/tika/ The release candidate is a zip archive of the sources in: https://git-wip-us.apache.org/repos/asf?p=tika.git;a=tree;hb=687d7706c9778e4f49f2834a07e5a9d99b23042b The SHA1

Re: Tika 1.14?

2016-09-22 Thread Chris Mattmann
Sounds great to me Tim. If you tell me when the tests are done, I’d be happy to RC a release! On 9/21/16, 11:31 AM, "Allison, Timothy B." wrote: All, PDFBox 2.0.3 is now integrated, I'm about to push the integration with POI-3.15. I have a few cleanup things

Re: Tika on apache.org

2016-09-10 Thread Chris Mattmann
yayyy — Chris Mattmann chris.mattm...@gmail.com From: lewis john mcgibbney <lewi...@apache.org> Reply-To: <u...@tika.apache.org> Date: Saturday, September 10, 2016 at 10:47 AM To: <u...@tika.apache.org>, "dev@tika.apache.org" <dev@tika.apache.org> Sub

Re: Tika 2.0: Restructuring Tesseract

2016-08-25 Thread Chris Mattmann
I like simple – I vote for option 1 ☺ On 8/25/16, 9:06 PM, "Bob Paulin" wrote: Hi, I've been looking at some of the work recently with Tesseract and it's really cool to be able to get OCR combined with so many parsers. The bad part is it has really

Can't get Tensorflow REST recognizer to work

2016-08-14 Thread Chris Mattmann
Hi Devs, Here’s what I’m seeing in TIKA-1993 and 1508, which I would love to finish today. 1. Tensorflow python script works great. 2. Tensorflow REST service – Docker container works (had to upgrade Docker to latest) 3. Tensorflow REST service – Tika parser metadata works great. 4. Tensorflow

Re: Tika 1.14?

2016-08-12 Thread Chris Mattmann
1508, and 1680 are pending me/my review. I’ll get it done today. On 8/12/16, 4:24 AM, "Allison, Timothy B." wrote: >> I know it's been a little bit since we talked about 2.0. We had discussed holding off while some API changes that were under consideration. Has any

Re: TIKA-1164

2016-07-08 Thread Chris Mattmann
>To: "scatherine@gouv.mc" <scatherine@gouv.mc> >Cc: "dev@tika.apache.org" <dev@tika.apache.org> >Date: 04/07/2016 17:45 >Subject: Re: TIKA-1164 > > > > >Hi Samuel I am forwarding your email to

[ANNOUNCE] Apache Tika 1.12 release

2016-02-15 Thread Chris Mattmann
g signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Chris Mattmann, on behalf of the Apache Tika community

[ANNOUNCE] Apache Tika 1.11 release

2015-10-25 Thread Chris Mattmann
g signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Chris Mattmann, on behalf of the Apache Tika community

Re: [memex-jpl] this week action from luke

2015-04-23 Thread Chris Mattmann
Great work Luke and both of these changes make sense. Please send the pull request for that thank you! Great work Giuseppe! Go team! Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Luke hanson311...@gmail.com Date: Thursday

Re: [memex-jpl] this week action from luke

2015-04-22 Thread Chris Mattmann
Hi Luke, Actually I just meant go into tika-mimetypes.xml and change the magic offsets for application/xhtml+xml and see if that works. The code you changed below is actually how many bytes Tika will first download to do MIME checking. Cheers, Chris Chris Mattmann

Re: [memex-jpl] this week action from luke

2015-04-21 Thread Chris Mattmann
Thanks Luke. So I guess all I was asking was could you try it out. Thanks for the lesson in the RFC. Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Luke hanson311...@gmail.com Date: Wednesday, April 22, 2015 at 1:46 AM

Re: Review Request 32291: ISATab parsers (preliminary version)

2015-03-28 Thread Chris Mattmann
! Thanks Giuseppe! - Chris Mattmann On March 23, 2015, 5:04 p.m., Giuseppe Totaro wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32291

Re: Review Request 32291: ISATab parsers (preliminary version)

2015-03-28 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32291/#review78158 --- Ship it! Ship It! - Chris Mattmann On March 23, 2015, 5:04 p.m

Re: Review Request 32291: ISATab parsers (preliminary version)

2015-03-24 Thread Chris Mattmann
/ISATabInvestigationParser.java https://reviews.apache.org/r/32291/#comment125670 would be good to note here that the Parser only populates metadata per Tim A.'s comments. - Chris Mattmann On March 23, 2015, 5:04 p.m., Giuseppe Totaro wrote

Re: Review Request 32255: File type description to HDFParser

2015-03-22 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32255/#review77370 --- Ship it! Ship It! - Chris Mattmann On March 19, 2015, 7:45 p.m

Re: Review Request 32260: Add file type description to NetCDF parser

2015-03-22 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32260/#review77371 --- Ship it! Ship It! - Chris Mattmann On March 19, 2015, 9:22 p.m

Re: [mil-oss] Looking for a pure Java VMF parser

2015-03-21 Thread Chris Mattmann
Hi Alex, [CC to dev@tika.a.o] You may want to look at Apache Tika, which is like a “digital babel fish” parser and MIME type identification and language detection system for 1200+ file formats. http://tika.apache.org/ Is VMF this “VMF” format?

Re: Review Request 31758: TIKA-1330: tika batch code

2015-03-07 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31758/#review75632 --- Ship it! Ship It! - Chris Mattmann On March 5, 2015, 3:07 a.m

Re: Review Request 31758: TIKA-1330: tika batch code

2015-03-04 Thread Chris Mattmann
/apache/commons/io/FileUtils.html - Chris Mattmann On March 5, 2015, 3:07 a.m., Tim Allison wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31758

Review Request 27562: GRIB Parser for TIKA

2014-11-03 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27562/ --- Review request for tika, Lewis McGibbney, Chris Mattmann, Tyler Palsulich

Re: Review Request 27562: GRIB Parser for TIKA

2014-11-03 Thread Chris Mattmann
, Chris Mattmann, Tyler Palsulich, and Vineet Ghatge Hemantkumar. Bugs: tika-1423 https://issues.apache.org/jira/browse/tika-1423 Repository: tika Description --- GRIB Parser Patch Diffs - ./trunk/tika-parsers/pom.xml 1636144 ./trunk/tika-parsers/src/main/java/org/apache

Re: Review Request 26542: Tika GDAL parser

2014-10-11 Thread Chris Mattmann
://reviews.apache.org/r/26542/diff/ Testing --- Tested via unit tests, and ran locally. Thanks, Chris Mattmann

Review Request 26542: Tika GDAL parser

2014-10-10 Thread Chris Mattmann
/diff/ Testing --- Tested via unit tests, and ran locally. Thanks, Chris Mattmann

Re: Tika at ApacheCon Europe - 2 months time!

2014-09-26 Thread Chris Mattmann
That is awesome guys. Tons of great stuff to hack on! Chris Mattmann chris.mattm...@gmail.com -Original Message- From: David Meikle loo...@gmail.com Reply-To: u...@tika.apache.org Date: Thursday, September 25, 2014 10:01 AM To: dev@tika.apache.org Cc: u

Re: Review Request 22402: Tika OCR

2014-09-19 Thread Chris Mattmann
On Sept. 19, 2014, 6:14 a.m., Chris Mattmann wrote: Ship It! Ready to go @tpalsulich. Tested on my machine, looks great! Thanks everyone! - Chris --- This is an automatically generated e-mail. To reply, visit: https

[ANNOUNCE] Apache Tika 1.6 release

2014-09-05 Thread Chris Mattmann
not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Chris

Re: [DISCUSS] Apache Tika 1.6 RC #2..today?

2014-08-19 Thread Chris Mattmann
OK, will roll the RC in a day. Cheers, Chris -Original Message- From: Nick Burch apa...@gagravarr.org Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Tuesday, August 19, 2014 7:41 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: [DISCUSS] Apache Tika 1.6 RC

FW: New answer to What are the best algorithms for classifying the language of a text snippet? Why?

2014-08-14 Thread Chris Mattmann
This seems like a relevant Quora question.. -Original Message- From: Quora nore...@quora.com Date: Thursday, August 14, 2014 7:43 AM To: Chris Mattmann chris.mattm...@gmail.com Subject: New answer to What are the best algorithms for classifying the language of a text snippet? Why

Re: [Tika] Embedded images in PDF documents

2014-08-14 Thread Chris Mattmann
send a blank email to the dev list, dev-subscr...@tika.apache.org and follow the instructions from there. Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Damiano Porta damianopo...@gmail.com Date: Thursday, August 14, 2014 7:30 AM

Re: Review Request 22402: Tika OCR

2014-08-10 Thread Chris Mattmann
, 2014, 10:18 p.m.) Review request for tika and Chris Mattmann. Repository: tika Description --- Integrating Tesseract OCR with Tika through a new Parser. See TIKA-93. Diffs - trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRConfig.java

Re: Review Request 24506: Create an ExternalTranslator and a MosesTranslator

2014-08-08 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24506/#review50050 --- Ship it! Ship It! - Chris Mattmann On Aug. 8, 2014, 5:40 p.m

Re: NetCDF to Maven Central

2014-08-05 Thread Chris Mattmann
, 2014 12:07 PM To: John Caron ca...@ucar.edu, Chris Mattmann chris.mattm...@gmail.com Cc: John Caron ca...@unidata.ucar.edu, support-net...@unidata.ucar.edu support-net...@unidata.ucar.edu Subject: Re: NetCDF to Maven Central Thanks for the info John, I'll chat with the Tika-dev team about what

Review Request 24051: MicrosoftTranslator setClient and setId NPE

2014-07-29 Thread Chris Mattmann
] mattmann% Thanks, Chris Mattmann

Re: Review Request 23562: Add a CachedTranslator implementation

2014-07-17 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23562/#review48019 --- Ship it! Ship It! - Chris Mattmann On July 17, 2014, 4 p.m

Re: [DISCUSS] 1.6 Release?

2014-07-16 Thread Chris Mattmann
wrote: Thanks Matthias, I will take a look at them before rolling the 1.6 RC. Got to finish up some patches, etc., but thanks for your interest and I will be in touch soon! ++ Chris Mattmann, Ph.D. Chief Architect

Re: Review Request 23562: Add a CachedTranslator implementation

2014-07-16 Thread Chris Mattmann
/translate/CachedTranslator.java https://reviews.apache.org/r/23562/#comment84193 needs apache license headers trunk/tika-translate/src/main/resources/META-INF/services/org.apache.tika.language.translate.Translator https://reviews.apache.org/r/23562/#comment84194 Nice catch! - Chris Mattmann

Review Request 23299: Add GoogleTranslate implementation of Translation API

2014-07-06 Thread Chris Mattmann
/ Testing --- Tested with my API key, works great. Also tests fail silently using the isAvailable API if the dummy API key is provided (by default). Thanks, Chris Mattmann

Re: Review Request 22892: New parser for ENVI header files

2014-06-25 Thread Chris Mattmann
/EnviHeaderParser.java https://reviews.apache.org/r/22892/#comment82136 Good comment Nick. I committed the version of this patch without this improvement, and we can make this improvement later on with a new issue. - Chris Mattmann On June 23, 2014, 11:14 p.m., Ann Burgess wrote

Re: Review Request 22892: New parser for ENVI header files

2014-06-23 Thread Chris Mattmann
this. - Chris Mattmann On June 23, 2014, 9:43 p.m., Ann Burgess wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22892

Re: Review Request 22892: New parser for ENVI header files

2014-06-23 Thread Chris Mattmann
/EnviHeaderParser.java https://reviews.apache.org/r/22892/#comment81848 org.apache.tika.parser.envi trunk/tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java https://reviews.apache.org/r/22892/#comment81849 org.apache.tika.parser.envi - Chris Mattmann On June

Review Request 22761: Create a Tika Translator implementation that uses JoshuaDecoder

2014-06-18 Thread Chris Mattmann
using http://joshua-decoder.org/data/fisher-callhome-corpus/ My dataset isn't perfect, but it can do basic translations. Also wrote a unit test, part of the patch. Thanks, Chris Mattmann

Re: Review Request 22761: Create a Tika Translator implementation that uses JoshuaDecoder

2014-06-18 Thread Chris Mattmann
corpus built using http://joshua-decoder.org/data/fisher-callhome-corpus/ My dataset isn't perfect, but it can do basic translations. Also wrote a unit test, part of the patch. Thanks, Chris Mattmann

Re: Tika Language Detection

2014-06-15 Thread Chris Mattmann
Dear Omid, Looks like you got it added correctly :) Thanks for your question and for your Github pull request. I've filed a JIRA issue for you: https://issues.apache.org/jira/browse/TIKA-1337 I will get your patch into the sources and I sincerely appreciate it. In the future, please feel free

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-09 Thread Chris Mattmann
://reviews.apache.org/r/22246/#comment79840 agreed, this seems to be extraneous. I would remove this part. trunk/tika-parsers/pom.xml https://reviews.apache.org/r/22246/#comment79842 seems to be extraneous. - Chris Mattmann On June 9, 2014, 8:11 p.m., Ann Burgess wrote

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-09 Thread Chris Mattmann
the dependencies and then I think this is good to commit. - Chris Mattmann On June 9, 2014, 8:11 p.m., Ann Burgess wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22246

Re: Review Request 22219: Add Translation to Tika

2014-06-05 Thread Chris Mattmann
/DefaultTranslator.java https://reviews.apache.org/r/22219/#comment79395 Need Apache license here. I will add it. - Chris Mattmann On June 5, 2014, 4:19 p.m., Tyler Palsulich wrote: --- This is an automatically generated e-mail

Re: Review Request 22219: Add Translation to Tika

2014-06-05 Thread Chris Mattmann
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22219/#review44842 --- Ship it! Ship It! - Chris Mattmann On June 5, 2014, 4:19 p.m

Re: Review Request 22219: Add Translation to Tika

2014-06-05 Thread Chris Mattmann
: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22219/ --- (Updated June 5, 2014, 4:19 p.m.) Review request for tika and Chris Mattmann. Repository: tika Description --- This patch adds

Re: Review Request 22219: Add Translation to Tika

2014-06-04 Thread Chris Mattmann
should be dynamically loaded via JavaSPI trunk/tika-core/src/main/java/org/apache/tika/language/MicrosoftTranslator.java https://reviews.apache.org/r/22219/#comment79303 use Eclipse or IdeaJ to auto put javadoc in for interfaces? - Chris Mattmann On June 4, 2014, 7:17 p.m., Tyler

Re: Review Request 22246: New parser for Matlab .mat files

2014-06-04 Thread Chris Mattmann
-mail. To reply, visit: https://reviews.apache.org/r/22246/ --- (Updated June 4, 2014, 10:23 p.m.) Review request for tika and Chris Mattmann. Repository: tika Description --- This is a new parser for Matlab .mat files

Re: JAXRS, endpoints and a / welcome page - any ideas why it's broken?

2014-05-16 Thread Chris Mattmann
Hi Guys, Some thoughts here: -Original Message- From: Nick Burch apa...@gagravarr.org Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Wednesday, May 14, 2014 6:22 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: JAXRS, endpoints and a / welcome page - any ideas why

Re: Use of Levenshtein distance to find similar words

2014-03-16 Thread Chris Mattmann
Dear Margi, Great question and thanks for posting this to the list! :) You may also want to split your extracted text not just by \n but also look to split by perhaps to canonical-ize the words. You may even think of an approach for creating words (recall we discussed a method in class for

Re: Submission to ApacheCon on Tika

2014-03-02 Thread Chris Mattmann
Thanks Jukka! My Tika talk had to be moved to Wednesday since I wasn't sure I would be there at ApacheCon the whole time, and co-locating my talks around the same day was advantageous, so I asked Rich to move me. Annie's talk was originally I believe set for Wed too, however I am not sure if she

Re: CSCI ASSIGNMENT QUESTION

2014-03-01 Thread Chris Mattmann
1, 2014 9:43 PM To: Chris Mattmann chris.a.mattm...@jpl.nasa.gov Subject: Re: CSCI ASSIGNMENT QUESTION Hello professor Mattmann, Thank you for replying to my doubts. I realized there was a small mistake in the above code. I was updating the same pdf file count for every keyword

Re: CSCI ASSIGNMENT QUESTION

2014-02-19 Thread Chris Mattmann
it to automatically call the PDF parser by calling it directly from your program or Java code and then bypass that step. HTH! Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Mohamed Mustafa Rafik Khimani khim...@usc.edu Date

Re: [VOTE] Apache Tika 1.5 RC2

2014-02-10 Thread Chris Mattmann
Hi Dave, +1 from me, SIGS, checksum check out: [chipotle:~/tmp/apache-tika-1.5-rc2] mattmann% $HOME/bin/stage_apache_rc tika 1.5-src http://people.apache.org/~dmeikle/tika-1.5-rc2/ % Total % Received % Xferd Average Speed Time Time Time Current

Re: [VOTE] Apache Tika 1.5 RC1

2014-02-04 Thread Chris Mattmann
-1.5-rc1] mattmann% gpg --import KEYS gpg: key A355A63E: Jukka Zitting ju...@apache.org not changed gpg: key B876884A: Chris Mattmann (CODE SIGNING KEY) mattm...@apache.org not changed gpg: key 9740DD55: David Meikle (CODE SIGNING KEY) dmei...@apache.org not changed gpg: Total number processed: 3

Re: [VOTE] Apache Tika 1.5 RC1

2014-02-04 Thread Chris Mattmann
/apache-tika-1.5-rc1] mattmann% gpg --import tika.asc gpg: key B876884A: Chris Mattmann (CODE SIGNING KEY) mattm...@apache.org not changed gpg: key A355A63E: Jukka Zitting ju...@apache.org 7 new signatures gpg: key 8A26D9A6: public key Jukka Zitting jukka.zitt...@gmail.com imported gpg: key 42CFAE07

Submission to ApacheCon on Tika

2014-01-30 Thread Chris Mattmann
Hey Guys, I submitted the below talk on Apache Tika, Nutch and Solr to ApacheCon NA 2014: Real Data Science: Exploring the FBI's Vault dataset with Apache Tika, Nutch and Solr Event ApacheCon North America Submission Type Lightning Talk Category Developer Biography Chris Mattmann has a wealth

Re: [DISCUSS] Prepare Release 1.5?

2014-01-09 Thread Chris Mattmann
Hey Dave, I kind of got bogged down and haven't had time to release. If someone else does have time and wants to pick this up, +1 for it! Cheers, Chris -Original Message- From: David Meikle loo...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Thursday, January 9,

Re: Tika 1.5 release ?

2013-12-19 Thread Chris Mattmann
Hi Hong-Thai, Thanks for your question. It's probably time for a release. I will have the cycles next week or the week after to spin one up if no one beats me to it. Thanks! Cheers, Chris -Original Message- From: Hong-Thai Nguyen hong-thai.ngu...@polyspot.com Reply-To:

Re: Switch to JUnit 4.x?

2013-12-17 Thread Chris Mattmann
+1 from me. Cheers, Chris -Original Message- From: David Meikle loo...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org Date: Tuesday, December 17, 2013 2:03 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re: Switch to JUnit 4.x? Hi, On 14 Dec 2013, at 23:39, Ken

Re: Having Problem in Word Count and Language Detaction

2013-10-26 Thread Chris Mattmann
Hi Animesh, Please detail your issue here on dev@tika.apache.org and I'm sure someone can help. Cheers, Chris -Original Message- From: Animesh Kumar animesh.sa...@gmail.com Date: Wednesday, October 23, 2013 9:15 PM To: dev-ow...@tika.apache.org dev-ow...@tika.apache.org Subject: Fwd:

Re: [DISCUSS] Integrate Apache Any23 into Apache Tika

2013-10-19 Thread Chris Mattmann
Lewis, I for one am supportive of this measure somehow. The exact mechanism by which we can do this is something that could involve e.g., taking you, or anyone else from the Any23 community (at this point I think it's really just you by my own accord lurking on the lists over there) that is
