Hi, it might be worth waiting until POI 3.11-FINAL is released so that the TIKA release does not depend on a beta version. It's due on Sunday, corrects a lot of old office parsing, and just needs the patch in TIKA-1469 to work properly.
Regards
Thomas

2014-12-18 21:54 GMT+01:00 Tyler Palsulich <[email protected]>:

> Hi All,
>
> It's been a few months, so I just want to follow up on this thread. We've
> resolved/closed 51 issues for v1.7 [0]. There are two on JIRA marked as 1.7
> (TIKA-1465 and TIKA-894). Do we still want to aim for 1.7 with TIKA-1445?
> Has anyone tried their hand at the suggested (significant) fix?
>
> Are there any other issues someone would like to fit in?
>
> Cheers,
> Tyler
>
> [0] -
> https://issues.apache.org/jira/browse/TIKA/fixforversion/12327096/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel
>
> On Tue, Oct 28, 2014 at 1:46 AM, Mattmann, Chris A (3980) <
> [email protected]> wrote:
>
> > Thanks Tim saw your patch and am looking now.
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Chief Architect
> > Instrument Software and Science Data Systems Section (398)
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 168-519, Mailstop: 168-527
> > Email: [email protected]
> > WWW: http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Adjunct Associate Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > -----Original Message-----
> > From: <Allison>, "Timothy B." <[email protected]>
> > Reply-To: "[email protected]" <[email protected]>
> > Date: Monday, October 27, 2014 at 12:30 PM
> > To: "[email protected]" <[email protected]>
> > Subject: RE: 1.7 release?
> >
> > >Sounds good. As long as the default behavior remains the same, I'm
> > >happy. I'm going to play with a combination of your patch and Tyler's
> > >and see what the ramifications are for embedded docs.
> > >
> > >To confirm, the OCR integration is fantastic. Thank you and Tyler!
> > >
> > >Best,
> > >
> > > Tim
> > >
> > >-----Original Message-----
> > >From: Mattmann, Chris A (3980) [mailto:[email protected]]
> > >Sent: Friday, October 24, 2014 5:36 PM
> > >To: [email protected]
> > >Subject: Re: 1.7 release?
> > >
> > >Hey Tim,
> > >
> > >What do you think about my existing patch for 1445? For example to
> > >just call all the parsers? I thought I was seeing behavior that was
> > >slow because of that, but it turned out to be Tesseract and my machine
> > >at the time?
> > >
> > >I think my patch for 1445 may be enough, and we should get the metadata
> > >I think? Thoughts?
> > >
> > >I honestly think we need to deliver Tesseract in 1.7. We're close. I'll
> > >even take it upon myself to try and experiment with the idea of multiple
> > >parsers being called. I think a simple solution to the metadata key
> > >conflict issue is simply to have a policy to add values (by default) and
> > >replace if a property is set in ParseContext. Some simple updates to
> > >CompositeParser would allow this.
> > >
> > >Thoughts?
> > >
> > >Cheers,
> > >Chris
> > >
> > >-----Original Message-----
> > >From: <Allison>, "Timothy B." <[email protected]>
> > >Reply-To: "[email protected]" <[email protected]>
> > >Date: Friday, October 24, 2014 at 2:24 PM
> > >To: "[email protected]" <[email protected]>
> > >Subject: RE: 1.7 release?
> > >
> > >>Sorry for coming late to the game on the implications of TIKA-1445. I
> > >>don't want to hold up the release of 1.7.
> > >>
> > >>However, would it be possible to return to the legacy default behavior of
> > >>extracting metadata from images?
> > >>
> > >>We can then document on the OCR parser page on the wiki that you need to
> > >>install Tesseract _and_ make a change in the parser/mime config file. If
> > >>you want this new capability, it will take a small bit of work until we
> > >>solve TIKA-1445.
> > >>
> > >>I worry that the current behavior of 1.7 would be surprising to most
> > >>non-dev users (well, even to at least one dev :) ).
> > >>
> > >>Cheers,
> > >>
> > >> Tim
> > >>
> > >>________________________________________
> > >>From: Oleg Tikhonov [[email protected]]
> > >>Sent: Friday, October 24, 2014 2:24 PM
> > >>To: [email protected]
> > >>Subject: Re: 1.7 release?
> > >>
> > >>Hi Tyler,
> > >>don't mention.
> > >>
> > >>Cheers,
> > >>Oleg
> > >>On Oct 24, 2014 8:02 PM, "Tyler Palsulich" <[email protected]> wrote:
> > >>
> > >>> Thank you for the help, Oleg! I just resolved TIKA-1422. So, are there
> > >>> any other issues anyone would like to resolve before a new release?
> > >>>
> > >>> Thanks,
> > >>> Tyler
> > >>>
> > >>> On Tue, Oct 21, 2014 at 2:42 AM, Oleg Tikhonov <[email protected]>
> > >>> wrote:
> > >>>
> > >>> > Sorry!!!
> > >>> >
> > >>> > On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980) <
> > >>> > [email protected]> wrote:
> > >>> >
> > >>> > > Thanks Oleg, will try tomorrow for me Los angeles time!
> > >>> > >
> > >>> > > -----Original Message-----
> > >>> > > From: Oleg Tikhonov <[email protected]>
> > >>> > > Reply-To: "[email protected]" <[email protected]>
> > >>> > > Date: Monday, October 20, 2014 at 11:20 PM
> > >>> > > To: "[email protected]" <[email protected]>
> > >>> > > Subject: Re: 1.7 release?
> > >>> > >
> > >>> > > >Please take a try with newest patch.
> > >>> > > >Cheers,
> > >>> > > >Oleg
> > >>> > > >
> > >>> > > >On Tue, Oct 21, 2014 at 9:08 AM, Oleg Tikhonov <[email protected]>
> > >>> > > >wrote:
> > >>> > > >
> > >>> > > >> Taken. Thanks. in progress ...
> > >>> > > >>
> > >>> > > >> On Tue, Oct 21, 2014 at 8:54 AM, Mattmann, Chris A (3980) <
> > >>> > > >> [email protected]> wrote:
> > >>> > > >>
> > >>> > > >>> Trunk is the current checkout/branch:
> > >>> > > >>>
> > >>> > > >>> http://svn.apache.org/repos/asf/tika/trunk
> > >>> > > >>>
> > >>> > > >>> -----Original Message-----
> > >>> > > >>> From: Oleg Tikhonov <[email protected]>
> > >>> > > >>> Reply-To: "[email protected]" <[email protected]>
> > >>> > > >>> Date: Monday, October 20, 2014 at 10:16 PM
> > >>> > > >>> To: "[email protected]" <[email protected]>
> > >>> > > >>> Subject: Re: 1.7 release?
> > >>> > > >>>
> > >>> > > >>> >Hi, I can try this on.
> > >>> > > >>> >What is a trunk?
> > >>> > > >>> >
> > >>> > > >>> >Thanks,
> > >>> > > >>> >Oleg
> > >>> > > >>> >
> > >>> > > >>> >On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) <
> > >>> > > >>> >[email protected]> wrote:
> > >>> > > >>> >
> > >>> > > >>> >> Hmm any idea why this is failing on Windows? Tyler P. and
> > >>> > > >>> >> I were talking the other day - maybe we shouldn't run the
> > >>> > > >>> >> tests from TIKA-1422 unless Tesseract is installed? Thoughts?
> > >>> > > >>> >>
> > >>> > > >>> >> -----Original Message-----
> > >>> > > >>> >> From: Hong-Thai Nguyen <[email protected]>
> > >>> > > >>> >> Reply-To: "[email protected]" <[email protected]>
> > >>> > > >>> >> Date: Thursday, October 16, 2014 at 2:03 AM
> > >>> > > >>> >> To: "[email protected]" <[email protected]>
> > >>> > > >>> >> Subject: Re: 1.7 release?
> > >>> > > >>> >>
> > >>> > > >>> >> >Hi Andrzej,
> > >>> > > >>> >> >
> > >>> > > >>> >> >We are impatient for 1.7 release too.
> > >>> > > >>> >> >I'm having compiling problem of TIKA-1422 on me. If anyone can build
> > >>> > > >>> >> >successfully on Windows, I have no objection to release 1.7
> > >>> > > >>> >> >
> > >>> > > >>> >> >Thanks,
> > >>> > > >>> >> >
> > >>> > > >>> >> >On Thu, Oct 16, 2014 at 10:51 AM, Andrzej Białecki <[email protected]>
> > >>> > > >>> >> >wrote:
> > >>> > > >>> >> >
> > >>> > > >>> >> >> Hi,
> > >>> > > >>> >> >>
> > >>> > > >>> >> >> Any news on the 1.7 release? or at least a 1.6.1 release that includes
> > >>> > > >>> >> >> the fix for broken ODF parsing...
> > >>> > > >>> >> >>
> > >>> > > >>> >> >> ---
> > >>> > > >>> >> >> Best regards,
> > >>> > > >>> >> >>
> > >>> > > >>> >> >> Andrzej Bialecki
> > >>> > > >>> >> >
> > >>> > > >>> >> >--
> > >>> > > >>> >> >--------------
> > >>> > > >>> >> >Hong-Thai
