Re: TIKA-1302 blog post

2016-10-06 Thread Mattmann, Chris A (3980)
Hey Tim yep let’s add the other apachecon prezos from me and Nick thanks. ++ Chris Mattmann, Ph.D. Principal Data Scientist, Engineering Administrative Office (3010) Manager, Open Source Projects Formulation and Development Office

Re: TIKA-1302 blog post

2016-10-05 Thread Mattmann, Chris A (3980)
Tim this is GREAT! Please link it from the wiki that mentions web resource document links. I think: http://wiki.apache.org/tika/TikaResources I fell behind on spinning the release. Will try and make progress today. Chris ++ Chris

Re: Tika 1.14?

2016-09-29 Thread Mattmann, Chris A (3980)
If there aren’t any objections I’ll roll 1.14 this weekend with an RC1 by Monday. ++ Chris Mattmann, Ph.D. Chief Architect, Instrument Software and Science Data Systems Section (398) Manager, Open Source Projects Formulation and

Re: Plans for the first Tika 2.0 release

2016-09-21 Thread Mattmann, Chris A (3980)
NLP/NER is as high a priority to me as the OCR stuff..we have a whole meta framework for doing NER/NLP with NERRecogniser and really cool Tensorflow and other stuff. Hoping 2.0 can help solve this! ☺ ++ Chris Mattmann, Ph.D. Chief

Re: Query on correct use of 'fileUrl' in TikaJAXRS Server to extract document at remote url - my request is not working

2016-09-14 Thread Mattmann, Chris A (3980)
+1 Great idea Konstantin ++ Chris Mattmann, Ph.D. Chief Architect, Instrument Software and Science Data Systems Section (398) Manager, Open Source Projects Formulation and Development Office (8212) NASA Jet Propulsion Laboratory

Re: A new Tika App in 2.0?

2016-09-13 Thread Mattmann, Chris A (3980)
I’ll try and comment on this tomorrow sorry it’s been a tough few weeks, really busy. ++ Chris Mattmann, Ph.D. Chief Architect, Instrument Software and Science Data Systems Section (398) Manager, Open Source Projects Formulation and

Re: Tika 1.14?

2016-08-11 Thread Mattmann, Chris A (3980)
Sounds good to me ++ Chris Mattmann, Ph.D. Chief Architect, Instrument Software and Science Data Systems Section (398) Manager, Open Source Projects Formulation and Development Office (8212) NASA Jet Propulsion Laboratory Pasadena,

Re: Your project VM needs to be migrated.

2016-07-17 Thread Mattmann, Chris A (3980)
Thanks Gav, I replied on the INFRA ticket. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527

Re: Sentiment Analysis Parser updates

2016-07-06 Thread Mattmann, Chris A (3980)
Great work! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: TIKA-1164

2016-07-04 Thread Mattmann, Chris A (3980)
Hi Samuel I am forwarding your email to dev@tika.a.o and moving dev-owner@t.a.o to BCC. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA

Tika-Python: parsing PDFs and showing analytics

2016-06-30 Thread Mattmann, Chris A (3980)
Great Blog post by Clinton Brownley today: If you haven’t had a chance to check out tika-python [1], I recommend doing so! Would also appreciate any feedback or stars! Cheers, Chris [1]

Re: Sentiment Analysis Parser updates

2016-06-28 Thread Mattmann, Chris A (3980)
-- > >=== Confusion matrix === > > >a b c d | Accuracy | <-- classified as > <149> 13 4 1 | 89,22% | a = negative > 42 <24>3 1 | 34,29% | b = positive > 3511 <10>

Re: Metadata key for "original file location/name"?

2016-06-27 Thread Mattmann, Chris A (3980)
Tim: +1 to TikaCoreProperties.ORIGINAL_RESOURCE_NAME being mapped to: X-TIKA:origResourceName Sound good? Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA

Re: regression corpus/vm discussions

2016-06-23 Thread Mattmann, Chris A (3980)
dev@tika is a great place, +1 from me. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527

Re: Sentiment Analysis Parser updates

2016-06-22 Thread Mattmann, Chris A (3980)
Thank you Jason! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: Sentiment Analysis Parser updates

2016-06-22 Thread Mattmann, Chris A (3980)
Great work Anastasija! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: Sentiment Analysis Parser updates

2016-06-17 Thread Mattmann, Chris A (3980)
Great update Anastasija! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: About tika-python error

2016-06-11 Thread Mattmann, Chris A (3980)
I just saw the mail. I will do it today as I want to run some test before >I can push or create pull request. > >On Fri, Jun 10, 2016 at 11:19 PM, Mattmann, Chris A (3980) ><chris.a.mattm...@jpl.nasa.gov> wrote: > >Hi Rakesh, > >Got it. Can you please submit a

Re: About tika-python error

2016-06-10 Thread Mattmann, Chris A (3980)
.assignmenthelp.net/> then > you try to save it as \Temp/www.assignmenthelp.net > <http://www.assignmenthelp.net> > > > >Hence small correction was to remove "/" if it is there at the end of url . > > >Rest everything is ok. > > >On Fri,
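A minimal sketch of the correction described in this message, assuming the temp file name is derived from the URL: strip a trailing "/" before building the path. The helper name `temp_path_for_url` and its parameters are hypothetical and are not taken from tika-python.

```python
import os
import tempfile
from urllib.parse import urlparse


def temp_path_for_url(url, temp_dir=None):
    """Hypothetical helper: choose a local temp file path for a remote URL."""
    temp_dir = temp_dir or tempfile.gettempdir()
    parsed = urlparse(url)
    # Remove any trailing "/" so the last path component is never empty.
    name = (parsed.netloc + parsed.path).rstrip("/")
    # Use the final component as the file name; fall back to a fixed name.
    filename = os.path.basename(name) or "download"
    return os.path.join(temp_dir, filename)
```

With this normalization, a URL with and without the trailing slash maps to the same file name inside the temp directory.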

Re: About tika-python error

2016-06-10 Thread Mattmann, Chris A (3980)
[moved to dev@tika.a.o list please follow replies there.] Rakesh - looks like you don’t have permissions to write to your temp dir on Windows. Can you confirm that’s the case? ++ Chris Mattmann, Ph.D. Chief Architect Instrument

Re: GSoC 2016: OpenNLP Sentiment Analysis: Status Update

2016-05-19 Thread Mattmann, Chris A (3980)
On Wed, May 18, 2016 at 8:57 AM, Mattmann, Chris A (3980) < >chris.a.mattm...@jpl.nasa.gov> wrote: > >> Hi Everyone, >> >> Anastasija and I met this morning. Here are her next steps: >> >> >> 0. Completed learning, installing and using GeoTopicParser in

Re: GSoC 2016: OpenNLP Sentiment Analysis

2016-05-17 Thread Mattmann, Chris A (3980)
> >Let me know! > > >Thank you, >Anastasija > > >On 17 May 2016 at 07:41, Mattmann, Chris A (3980) ><chris.a.mattm...@jpl.nasa.gov> wrote: > >Dear Anastasija, > >I’m reconnecting here since it’s been a bit. Do you have time for >a Google Hangout tomorrow?

Re: GSoC 2016: OpenNLP Sentiment Analysis

2016-05-16 Thread Mattmann, Chris A (3980)
e one little problem. I have a final exam this time next week (for >my Theory of Computation class), so I can't do the hangout at this time. > > >Sorry for all the time confusions. I realise how hard it is to find the >perfect time to talk considering the time differences. > >

Re: [VOTE] Release Apache Tika 1.13 Candidate #1

2016-05-16 Thread Mattmann, Chris A (3980)
Late to the party, but voting anyways: +1 from me, SIGS and MD5 look good! LMC-053601:apache-tika-1.13-rc1 mattmann$ for name in app server; do > /Users/mattmann/bin/stage_apache_rc tika-$name 1.13 > https://dist.apache.org/repos/dist/dev/tika/ > done % Total% Received % Xferd Average
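For readers who want to repeat the checksum part of this check, a rough sketch in Python follows. It assumes the published `.md5` file holds the hex digest as its first whitespace-separated token; `md5_matches` is a hypothetical helper, not part of the Tika release tooling, and it does not cover the signature (SIGS) check.

```python
import hashlib


def md5_matches(artifact_path, md5_file_text, chunk_size=65536):
    """Hypothetical helper: compare a file's MD5 with a published digest."""
    tokens = md5_file_text.split()
    if not tokens:
        return False
    # Assumption: the digest is the first token of the .md5 file.
    expected = tokens[0].lower()
    digest = hashlib.md5()
    with open(artifact_path, "rb") as handle:
        # Read in chunks so large release archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

A caller would read the contents of the downloaded `.md5` file and pass them as `md5_file_text` together with the path of the artifact.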

Re: Squashing GitHub pull requests while merging

2016-05-07 Thread Mattmann, Chris A (3980)
squash the commits in the pull request >before we merge into the Tika. So, we don't need to mess up Tika's history. >Right? > >Tyler >On May 6, 2016 8:41 PM, "Mattmann, Chris A (3980)" < >chris.a.mattm...@jpl.nasa.gov> wrote: > >> Squashing messes up history and

Re: Squashing GitHub pull requests while merging

2016-05-06 Thread Mattmann, Chris A (3980)
Squashing messes up history and atm requires infra intervention, so I would suggest we stay away from it for now. Sent from my iPhone > On May 6, 2016, at 2:20 PM, Ken Krugler wrote: > > I was perusing https://wiki.apache.org/tika/UsingGit >

Re: pre-release 1.13 regression testing

2016-05-02 Thread Mattmann, Chris A (3980)
+1 go for it Dave! I’m in Hawaii on vacation so please push forward ;) ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office:

Re: GSoC 2016: OpenNLP Sentiment Analysis

2016-04-27 Thread Mattmann, Chris A (3980)
te: >> >> > sentiment analysis discussion doc : >> > >> > >> > >> https://docs.google.com/document/d/1Gi59YqtisY4NLaVY3B7CNLMTgCRZm9JEk17kmBmWXqQ/edit?usp=sharing >> > >> > On Tue, Apr 26, 2016 at 10:56 PM, Mattmann, Chris A (3980) < >> > chris.a

Re: GSoC 2016: OpenNLP Sentiment Analysis

2016-04-26 Thread Mattmann, Chris A (3980)
> > >Yes, that's perfect. I'll be ready by 9:40am. > > >Thank you, >Anastasija > > >On 25 April 2016 at 23:28, Mattmann, Chris A (3980) ><chris.a.mattm...@jpl.nasa.gov> wrote: > >Hey Anastasija, > >To be honest 9am EST is a little aggressive, I wi

Re: GSoC 2016: OpenNLP Sentiment Analysis

2016-04-25 Thread Mattmann, Chris A (3980)
>>>> > Hi Chris, >>> > >>> > I'm available on Tuesday & Wednesday after 6.00 pm IST. >>> > >>> > Thanks, >>> > Madhawa >>> > >>> > Madhawa >>> > >>> > On Sat, Apr 23, 2016 at 11:38 P

Re: [DISCUSS] Backward compatibility

2016-04-25 Thread Mattmann, Chris A (3980)
ctions >will follow until tomorrow. > >I don't see any ways to break something by doing this but I will recheck >it. > >Should I also enable clirr-maven-plugin on these classes? > >Mon, 25 Apr 2016 at 20:39, Mattmann, Chris A (3980) < >chris.a.mattm...@jpl.nasa.gov>

Re: [DISCUSS] Backward compatibility

2016-04-25 Thread Mattmann, Chris A (3980)
+1 I am fine with: 1. putting the old classes back in. Fine by me. 2. keeping the new tika-langdetect and improvements. I think that this is the easiest. Sorry for breaking the trunk, apologies. I was just eager to backport Ken’s stuff and also to get Text.jl support. Let’s just add back

Re: pre-release 1.13 regression testing

2016-04-25 Thread Mattmann, Chris A (3980)
Thanks Tim I appreciate it ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: GSoC 2016: OpenNLP Sentiment Analysis

2016-04-23 Thread Mattmann, Chris A (3980)
, "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov> wrote: >Hi Anastasija, > >Hope you are well. It’s now time to get started on the project. >Monder, Anthony, Madhawa and I have been discussing ideas about >how to proceed with the project and even developi

Pivotal, Greenplum and Apache Tika

2016-04-23 Thread Mattmann, Chris A (3980)
Hey All, Cool article here on Apache Tika’s use at Pivotal: https://t.co/fPzszrKHtR Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory

GSoC 2016: OpenNLP Sentiment Analysis

2016-04-23 Thread Mattmann, Chris A (3980)
Hi Anastasija, Hope you are well. It’s now time to get started on the project. Monder, Anthony, Madhawa and I have been discussing ideas about how to proceed with the project and even developing a task list. Let’s get your tasks input into that list, and also coordinate. I also have an action

Re: last commits before pre-1.13 regression tests?

2016-04-21 Thread Mattmann, Chris A (3980)
Yeah I have time, but honestly I’m not done. I have a few items left in the MIME type stuff. One more day please, one more day. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA

Re: last commits before pre-1.13 regression tests?

2016-04-20 Thread Mattmann, Chris A (3980)
Original Message- > From: Allison, Timothy B. [mailto:talli...@mitre.org] > Sent: Monday, April 18, 2016 12:26 PM > To: dev@tika.apache.org > Subject: RE: last commits before pre-1.13 regression tests? > > Sounds good to me. Given the amount of changes since the last

Fwd: Getting Files Tags

2016-04-19 Thread Mattmann, Chris A (3980)
Sent from my iPhone Begin forwarded message: From: raj kumar > Date: April 19, 2016 at 4:28:08 AM PDT To: > Subject: Fwd: Getting Files Tags Hi All, In Windows, Images &

Re: last commits before pre-1.13 regression tests?

2016-04-18 Thread Mattmann, Chris A (3980)
Tim I would like to get in and close out all the scientific MIME updates for TREC-DD-Polar and get that in at least. In 1.14, my team from USC and I will deliver an automatic Deep Learning way to do MIME detection based on these updates and also the ContentMIMEDetection mechanism described on the

Fwd: Need Help

2016-04-18 Thread Mattmann, Chris A (3980)
Sent from my iPhone Begin forwarded message: From: harsh kumar > Date: April 18, 2016 at 2:02:23 AM PDT To: > Subject: Fwd: Need Help Hi, I am using tika for detecting the

Apache Tika wikipedia page

2016-04-15 Thread Mattmann, Chris A (3980)
Hi All, I made a Wikipedia page for Apache Tika: https://en.wikipedia.org/wiki/Apache_Tika Please update and edit. Thank you. Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems

Re: file id comparison

2016-04-13 Thread Mattmann, Chris A (3980)
I don't think there are licensing issues and would love to contribute! Sent from my iPhone > On Apr 13, 2016, at 9:33 AM, Allison, Timothy B. wrote: > > All, > Can anyone think of licensing issues/ip issues/other concerns with running a > comparison of 'file', Droid and

Re: @ApacheTika , and release related tweets question

2016-04-06 Thread Mattmann, Chris A (3980)
of Southern California, Los Angeles, CA 90089 USA WWW: http://irds.usc.edu/ ++ On 4/6/16, 10:10 AM, "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov> wrote: >++1 on all the feedback

Re: @ApacheTika , and release related tweets question

2016-04-06 Thread Mattmann, Chris A (3980)
++1 on all the feedback from you two below :) ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop:

Apache Tika used to parse the Panama papers!

2016-04-05 Thread Mattmann, Chris A (3980)
FYI: http://www.forbes.com/sites/thomasbrewster/2016/04/05/panama-papers-amazon-encryption-epic-leak/?utm_campaign=ForbesTech_source=TWITTER_medium=social_channel=Technology=23087770#709893771df5 BTW I know Thomas and am in touch..he wrote an article about MEMEX last year.

Re: dependency upgrades, release 1.13?

2016-04-01 Thread Mattmann, Chris A (3980)
+1 happy to RM it :) I’ll cut 1.13 this week or early next week. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office:

Re: Who's going to Apache: Big Data in May?

2016-03-30 Thread Mattmann, Chris A (3980)
I may be attending briefly :) Just need to get my ducks in a row :) ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office:

Re: GSOC2016 Sentiment Analysis

2016-03-29 Thread Mattmann, Chris A (3980)
ur thoughts. >> > >> > >> >Mondher >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >On Tue, Mar 29, 2016 at 1:51 PM, Madhawa Kasun Gunasekara >> ><madhaw...@gmail.com> wrote: >> >

Re: GSOC2016 Sentiment Analysis

2016-03-29 Thread Mattmann, Chris A (3980)
n add this feature on >OpenNLP project, and also I would like to suggest > that we should able to detect the target object of the opinions from >this feature as well. > > >WDYT ?? > > > >Thanks, > >Madhawa > > >Madhawa > > >

Re: GSOC2016 Sentiment Analysis

2016-03-28 Thread Mattmann, Chris A (3980)
Dear Anthony, Great! These both sound like fantastic proposals and I’m happy to be a mentor. Madhawa, would you like to join in on these efforts? Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and

Re: GSOC2016 Sentiment Analysis

2016-03-27 Thread Mattmann, Chris A (3980)
ow that you have a >positive response? In my humble opinion, that would prevent others not >involved in your discussion from getting email about the topic. > >Good luck! > >Best Regards, >Nishant > > >On Sun, Mar 27, 2016 at 6:37 AM, Mattmann, Chris A (3980) ><ch

Re: GSOC2016 Sentiment Analysis

2016-03-27 Thread Mattmann, Chris A (3980)
into that my username is "Madhawa Gunasekara" [1] https://wiki.apache.org/tika/GSoC2016 I have created a jira issue on https://issues.apache.org/jira/browse/TIKA-1911 Thanks, Madhawa Madhawa On Sat, Mar 26, 2016 at 3:21 AM, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa

Re: GSOC2016 Sentiment Analysis

2016-03-25 Thread Mattmann, Chris A (3980)
> > >I have completed an Applied NLP course @ USC. > > >I have done a Literature Review of Papers & Open Source Tools on the same >recently. > > >Regards, >Harsha > > >On Fri, Mar 25, 2016 at 2:07 PM, Mattmann, Chris A (3980) ><chris.a.matt

Re: Change to NER ParserTest re https://builds.apache.org/job/tika-2.x/57

2016-03-25 Thread Mattmann, Chris A (3980)
Hey Tim, I’ll take a look. Would be good to add the @AfterClass for sure though. Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory

Re: GSOC2016 Sentiment Analysis

2016-03-25 Thread Mattmann, Chris A (3980)
Hi Madhawa, So, how about a project that develops and contributes an Apache Tika and OpenNLP based SentimentAnalysisParser? I have some students currently doing work using the Fisher Callhome Corpus and you can build off that. I am CC’ing my USC IRDS team and my student Indhu who is working on

Re: Need suggestion on file type .HFA to be added Tika.

2016-03-02 Thread Mattmann, Chris A (3980)
I agree with Nick’s replies here ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: trunk build failing in bundle --, cxf class not found for GrobidRESTParser?

2016-03-02 Thread Mattmann, Chris A (3980)
g> Subject: RE: trunk build failing in bundle --, cxf class not found for GrobidRESTParser? >There's a chance you hadn't merged my breaking commit? > >-Original Message- >From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov] >Sent: Wednesday, March 02,

Re: trunk build failing in bundle --, cxf class not found for GrobidRESTParser?

2016-03-02 Thread Mattmann, Chris A (3980)
Wow, this is super odd. The last thing I committed was NLTK, and it built fine locally; I tested before committing. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion

Re: parallel dev on trunk and 2.x?

2016-02-25 Thread Mattmann, Chris A (3980)
+1 I haven’t fully moved over to 2.x yet b/c I haven’t honestly had time to catch up. I suppose after my class in May I will have time to catch up then and I can focus more on 2.x then. But for me I am doing all my work in 1.x now so keeping up to date would be great.

Re: Integrating Tika with MITLL Text.jl library for language detection

2016-02-23 Thread Mattmann, Chris A (3980)
eWriter, etc) from the specific implementation sub-project into >core > > >Once that's done you should be able to try out directly adding your >integration with Text.jl > > >-- Ken > > >________ >From: Trevor Claude Lewis >Sent:

Miredot built for 1.10, 1.12 and linked in main nav

2016-02-19 Thread Mattmann, Chris A (3980)
...thanks to Lewis for getting Miredot into the build and release process. I had forgotten to build it for 1.10 and 1.12, so it’s done and published now. I also updated the tree nav to link to Miredot too. Now to start filling out the REST docs there from the wiki.. Cheers, Chris

Website

2016-02-19 Thread Mattmann, Chris A (3980)
Hey Nick, Sorry it took me so long. I spent a bunch of time writing a script on Github to make the release process easier by automatically extracting and building the /index.apt file during the release process. https://git.io/v2Ubm Anyways the site is updated with 1.12. I’m also building

[RESULT] [VOTE] Apache Tika 1.12 Release Candidate #1

2016-02-15 Thread Mattmann, Chris A (3980)
Team, Sorry for the long delay. This VOTE has PASSED with the following tallies: +1 Chris Mattmann* Markus Jelsma Oleg Tikhonov* Ken Krugler* Tim Allison* Konstantin Gribov* David Meikle* Lewis John McGibbney* Tyler Palsulich* * - Tika PMC I’ll go update the website and update the mirrors and

Re: scm info in pom.xml

2016-02-11 Thread Mattmann, Chris A (3980)
I’ve already fixed this in trunk / master :-) Needs fixing in 2.x but you can borrow from what I did there.. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion

Re: Use of interface vs. abstract class

2016-02-09 Thread Mattmann, Chris A (3980)
nterface vs. abstract class >Hi Chris, > >> From: Mattmann, Chris A (3980) >> Sent: February 9, 2016 8:40:06am PST >> To: dev@tika.apache.org >> Cc: Trevor Claude Lewis; Ramirez, Paul M (398M) >> Subject: Re: Use of interface vs. abstract class >> >

Re: Tika 2.0 and language detection

2016-02-04 Thread Mattmann, Chris A (3980)
Hey Ken, This is fine. I wanted to get going with our Julia/MIT-LL Text.jl based detector and turning LanguageIdentifier into an interface. Trevor (CC’ed) and I are working on it, but I’m not sure where we’re at, and it shouldn’t be a blocker to moving forward. Cheers, Chris

Re: [VOTE] Apache Tika 1.12 Release Candidate #1

2016-01-29 Thread Mattmann, Chris A (3980)
e Tika 1.12 Release Candidate #1 > >Built & installed on Mac OS X 10.8. > >Switched Bixo to use 1.12, all tests pass. > >+1. > >-- Ken > >> From: Mattmann, Chris A (3980) >> Sent: January 25, 2016 11:58:04am PST >> To: u...@tika.apache.org; dev

[VOTE] Apache Tika 1.12 Release Candidate #1

2016-01-25 Thread Mattmann, Chris A (3980)
Hi Folks, A first candidate for the Tika 1.12 release is available at: https://dist.apache.org/repos/dist/dev/tika/ The release candidate is a zip archive of the sources in: https://git-wip-us.apache.org/repos/asf?p=tika.git;a=tag;h=203a26ba5e65db2427f9e84bc4ff31e569ae661c The SHA1

Sorry 1.12-rc1 not done yet

2016-01-25 Thread Mattmann, Chris A (3980)
...ran into: http://goo.gl/ggfF50 Just fixed it in 2eb671574 -> 809370ecc and moving release:prepare forward again. Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398)

Re: [DISCUSS] Tika 1.12-rc1 (was Re: New Tika release)

2016-01-25 Thread Mattmann, Chris A (3980)
? > >> 'https://git-wip-us.apache.org/repos/asf/tika.git/tika/' not found > > >Isn't the URL https://git-wip-us.apache.org/repos/asf/tika.git > >-- Ken > >> From: Mattmann, Chris A (3980) >> Sent: January 25, 2016 7:07:03am PST >> To: u...@tika.apache.org

Re: Are we on git?

2016-01-22 Thread Mattmann, Chris A (3980)
++ -Original Message- From: Nick Burch <apa...@gagravarr.org> Reply-To: "dev@tika.apache.org" <dev@tika.apache.org> Date: Friday, January 22, 2016 at 1:37 AM To: "dev@tika.apache.org" <dev@tika.apache.org> Subject: Re: Are we on git? >On Fri, 22

Re: Are we on git?

2016-01-21 Thread Mattmann, Chris A (3980)
Hi Nick, We are officially on Git. SVN remains, but it’s R/O. Our new ASF git repo is: https://git-wip-us.apache.org/repos/asf/tika.git Here’s an email I sent to the OODT-dev list about how to convert from your existing SVN checkout to Git. http://s.apache.org/UNr Can we file a ticket to

Writeable Git repo migration is underway

2016-01-18 Thread Mattmann, Chris A (3980)
You can track progress here: https://issues.apache.org/jira/browse/INFRA-11092 Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory

[RESULT] [VOTE] Moving SCM to Git

2016-01-18 Thread Mattmann, Chris A (3980)
This VOTE has passed with the following tallies: +1 Chris Mattmann* Tyler Palsulich* Bob Paulin* Hong-Thai Nguyen* Oleg Tikhonov* David Meikle* Ken Krugler* Lewis John McGibbney* Nick Burch* Konstantin Gribov* Julien Nioche* Tim Allison* I’ll file an INFRA ticket to begin the process. They

Re: [DISCUSS] Apache Joshua Incubator Proposal - Machine Translation Toolkit

2016-01-18 Thread Mattmann, Chris A (3980)
Great Hen, we’d love to have you on board as a mentor! Please add yourself to the proposal on the wiki. Anyone else have interest in Machine Translation? Any OpenNLP folks, Hadoop folks, Tika, or Lucene folks? CC’ing the dev lists for visibility please feel free to reply to general@i.a.o. I’ll

Re: New moderators needed

2016-01-16 Thread Mattmann, Chris A (3980)
Hey Jukka, Am I that single moderator? :) Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519,

Re: Tika questions on StackOverflow

2016-01-13 Thread Mattmann, Chris A (3980)
Great post Nick ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email:

Re: Tika 2.0 Modules first pass.

2016-01-05 Thread Mattmann, Chris A (3980)
Thanks Bob took care of 6 for ya: https://wiki.apache.org/tika/ContributorsGroup I should be able to review this, but not going to be complete review for a few weeks.. thanks for your great work ++ Chris Mattmann, Ph.D. Chief

Re: [VOTE] Moving SCM to Git

2016-01-02 Thread Mattmann, Chris A (3980)
ed, and is much easier for provenance (there may be edge cases I'm >> missing offhand, but we know the ICLA/grant associated with each change >> leading up to the tagged release). > > Did it wind up as "projects can experiment with using git for official > releases"?

Re: [VOTE] Moving SCM to Git

2016-01-02 Thread Mattmann, Chris A (3980)
:31 PM, Mattmann, Chris A (3980) > <chris.a.mattm...@jpl.nasa.gov> wrote: > > Hey Ken, > > Projects have been using writeable git repos at the ASF since 2009-2010. The > recent conversation at the foundation level was - should we allow GitHub as a > canonical

[VOTE] Moving SCM to Git

2016-01-01 Thread Mattmann, Chris A (3980)
Hi Everyone, DISCUSS thread here: http://s.apache.org/wVE Time to officially VOTE on moving Tika to Git. I’ve made a wiki page for our SCM explaining how to use Git at Apache, and how to use it with Github, and how to use it even in a traditional SVN sense. The page is here:

Re: Looking to contribute

2015-12-20 Thread Mattmann, Chris A (3980)
Pavan awesome glad to have your interest and to have you in the community! Check out our JIRA: https://issues.apache.org/jira/browse/TIKA My own personal recent interests in Tika are related to Named Entity Recognition (Stanford NER, CoreNLP and OpenNLP), and in Automated IR-based

FW: [opensource] Open Source workshop at GSAW March 2

2015-12-18 Thread Mattmann, Chris A (3980)
FYI.. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW:

Re: looking to contribute

2015-12-17 Thread Mattmann, Chris A (3980)
What Tim and Nick said. :) Joey is at Caltech and interested in working with me, so I said jump on the Tika lists and let’s see if there is something we can pin down. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and

Re: more modular parser bundles

2015-11-30 Thread Mattmann, Chris A (3980)
Tim, Fully agreed. One solution that presents itself to me is to finish up the Git discuss (which was overwhelmingly positive, and I need to write a wiki page for Nick), get that VOTE out of the way, move to Git, then basically have two main branches of development. I’d like 1.x to continue

Re: more modular parser bundles

2015-11-30 Thread Mattmann, Chris A (3980)
Sure that’s fine Bob - we don’t need it to be gated on Git. Create a 2.x branch and go to town, +1 from me :) ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion

Re: NER Parser tests behind proxy?

2015-11-24 Thread Mattmann, Chris A (3980)
ls? And, y, my >suggestion was to build a very small model and push it to source control >in the resources directory. > >All this said, 1) again, this could be user error and 2) the addition of >Stanford NER is fantastic...Thank you for this addition! > > >-Original Messa

Re: NER Parser tests behind proxy?

2015-11-23 Thread Mattmann, Chris A (3980)
ut http connectivity outside of >the usual maven stuff. > >-Original Message- >From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov] >Sent: Monday, November 23, 2015 10:52 AM >To: dev@tika.apache.org >Cc: ThammeGowda Narayanaswamy <thammegowd...@usc.edu>

Re: NER Parser tests behind proxy?

2015-11-23 Thread Mattmann, Chris A (3980)
with Tika, we wouldn't have to worry about http connectivity outside of >>the >> usual maven stuff. >> >> -Original Message- >> From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov] >> Sent: Monday, November 23, 2015 10:52 AM >> T

Re: NER Parser tests behind proxy?

2015-11-23 Thread Mattmann, Chris A (3980)
>org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSi >teNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:229) > at >org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMetho >dSite.java:52) > at >org.codehaus.groovy.runtime.callsit

Re: [DISCUSS] Moving to Git

2015-11-19 Thread Mattmann, Chris A (3980)
pache.org" <dev@tika.apache.org> Date: Thursday, November 19, 2015 at 4:33 AM To: "dev@tika.apache.org" <dev@tika.apache.org> Subject: Re: [DISCUSS] Moving to Git >On Wed, 18 Nov 2015, Mattmann, Chris A (3980) wrote: >> Git has something similar to svn:externa

Re: [DISCUSS] Moving to Git

2015-11-18 Thread Mattmann, Chris A (3980)
" <dev@tika.apache.org> Subject: Re: [DISCUSS] Moving to Git >On Wed, 18 Nov 2015, Mattmann, Chris A (3980) wrote: >> I propose we move to writeable git repos for Tika for our repository. I >> mostly interact with Git & Github nowadays even with Tika using the >>

Re: [DISCUSS] Moving to Git

2015-11-18 Thread Mattmann, Chris A (3980)
ving from an SVN workflow patch -> trunk to git workflow PR -> >master. \ >- Bob > >On Wed, Nov 18, 2015 at 8:48 AM, Tyler Palsulich <tpalsul...@gmail.com> >wrote: > >> +1 from me. >> >> Tyler >> On Nov 18, 2015 6:46 AM, "Mattmann

Named Entity Recognition support in trunk

2015-11-18 Thread Mattmann, Chris A (3980)
Hey Folks, With the commit of TIKA-1787/GH-61 in trunk we now have full integration of Named Entity Recognition with Stanford NER/NLP and Apache OpenNLP. Will also look to see if we can integrate NLTK too. This is a *big deal* since NER is something we’ve always wanted to pull into Tika. Woot!

[DISCUSS] Moving to Git

2015-11-18 Thread Mattmann, Chris A (3980)
Hey Team, I propose we move to writeable git repos for Tika for our repository. I mostly interact with Git & Github nowadays even with Tika using the mirroring and PR interaction support. Thoughts? Cheers, Chris ++ Chris Mattmann,

Github mirroring / commit notifications lagging behind or not coming

2015-11-17 Thread Mattmann, Chris A (3980)
Hey I just committed in r1714835, but didn’t see a commit log notification nor has it sync’ed to Github. I’ve already notified infra@ will let you know what I hear Cheers, Chris ++ Chris Mattmann, Ph.D. Chief Architect Instrument

FW: [jira] [Commented] (TIKA-1787) Include Stanford Name Entity Recognition in Tika

2015-11-17 Thread Mattmann, Chris A (3980)
Thamme, can you have a look here: https://builds.apache.org/job/tika-trunk-jdk1.7/887/org.apache.tika$tika-parsers/testReport/junit/org.apache.tika.parser.ner/NamedEntityParserTest/testParse/ Tests seem to be failing (worked for me locally maybe b/c I had already downloaded the models?)

Re: Github mirroring / commit notifications lagging behind or not coming

2015-11-17 Thread Mattmann, Chris A (3980)
fixed ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm...@nasa.gov WWW:

FW: [jira] [Commented] (TIKA-1787) Include Stanford Name Entity Recognition in Tika

2015-11-17 Thread Mattmann, Chris A (3980)
Build back to normal after Thamme and I fixed this. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop:
