I’m not so keen on fundamentally changing the organization of
Tika until 2.x. This seems like a major change to me in the
way people expect to consume Tika.

Can we:

1. Release a 1.11 that doesn’t include these types of changes
2. After 1.11, change trunk to be 2.0-SNAPSHOT and work those
types of issues there?

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Yaniv Kunda <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, September 23, 2015 at 9:30 AM
To: "[email protected]" <[email protected]>
Subject: Re: [DISCUSS] Release Tika 1.11?

>+1 for the uber jar!
>
>Regarding jdk7 issues, I have a few more I will create and patch later
>tonight - I'll post a list of issues as well.
>On Sep 23, 2015 5:26 PM, "Konstantin Gribov" <[email protected]> wrote:
>
>> Tim, was your check for File#getName done manually, or is it covered by
>> the tests somehow? If it is in the tests, we can run it on the major
>> platforms (I can test on Linux, Win XP and maybe Mac OS X) with
>> different JDKs.
>>
>> In case commons-io doesn't support ':' as a file separator, we can have
>> a simple utility class in Tika or send them a patch for it.
>>
>> I think we can rethink Tika packaging in 1.11/1.12 and produce these
>> artifacts:
>> - tika-core w/ dependency on commons-io (and deprecate most of
>>   o.a.tika.io, forwarding calls to jdk or commons-io),
>> - tika-core-uber w/ shaded commons-io (rename and drop all things
>>   unnecessary for o.a.tika.io),
>> - sliced tika-parsers-* as Bob suggested earlier,
>> - tika-parsers jar w/ all tika-parsers-* parts (for compatibility),
>> - other tika-* artifacts (like tika-server, tika-app etc).
>>
>> Anyone who needs tika-core without dependencies would use tika-core-uber
>> instead; everyone else, who prefers using maven/ivy/gradle/sbt/lein,
>> will depend on tika-core.
>> And we can drop o.a.tika.io in 2.0.
>>
>> Also, I'll take a look at unresolved jdk7 issues/patches today.
>>
>> On Tue, Sep 22, 2015 at 15:41, Allison, Timothy B. <[email protected]> wrote:
>>
>> > Thank _you_ for all of your work in modernizing us.  With your efforts,
>> > we'll be able to deprecate TikaInputStream#get(PunchCard pc) soon. :)
>> >
>> > >>Regarding FilenameUtils.getName() - I believe that its functionality
>> > can be replaced by Path.getFileName() - and in a platform-aware manner,
>> > as each JVM distribution comes with a specific provider implementation
>> > for the OS it's for.
>> >
>> > I agree that we should use that anytime we're interacting with the file
>> > system.
>> >
>> > However, that's actually the problem for paths that are stored within
>> > the document (say, an embedded resource).  Let's say a user creates a
>> > file on Windows: the file path information for the embedded file
>> > (depending on the parser and the file format) may be in Windows-ese,
>> > which is a problem if you try to use Path.getFileName() (I think... I
>> > haven't actually tested this) on a Linux machine.  I have actually
>> > tested this with the old File getName(), and it did not work
>> > cross-platform IIRC.
>> >
>> > In short, Tika needs to have the ability to extract the file name from
>> > a path that was created on any platform (including old Mac and its ":"
>> > separator) while Tika is running on any platform.
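
To make the requirement above concrete, here is a minimal sketch of a platform-independent file name extractor. It is not the code in Tika or commons-io: the class name `EmbeddedNames` and its `getName` method are hypothetical, and the sketch simply assumes that '/', '\' and ':' should all count as separators regardless of the OS the JVM runs on.

```java
/** Hypothetical helper; illustrates the idea only, not Tika's actual code. */
public final class EmbeddedNames {

    private EmbeddedNames() {
    }

    /**
     * Returns the part of the path after the last '/', '\' or ':'.
     * Works on plain strings, so the result does not depend on the
     * platform's file system provider. Returns "" for null or empty input.
     */
    public static String getName(String path) {
        if (path == null || path.isEmpty()) {
            return "";
        }
        int unix = path.lastIndexOf('/');      // Unix-style separator
        int windows = path.lastIndexOf('\\');  // Windows-style separator
        int oldMac = path.lastIndexOf(':');    // classic Mac OS separator
        int last = Math.max(unix, Math.max(windows, oldMac));
        // If no separator is present, last is -1 and the whole string is kept.
        return path.substring(last + 1);
    }
}
```

Because it never touches java.io.File or java.nio.file.Path, the same input gives the same answer on Linux, Windows and Mac. The trade-off is that a ':' inside a legitimate Unix file name is also treated as a separator.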
>> >
>> > -----Original Message-----
>> > From: Yaniv Kunda [mailto:[email protected]]
>> > Sent: Monday, September 21, 2015 11:31 AM
>> > To: [email protected]
>> > Subject: RE: [DISCUSS] Release Tika 1.11?
>> >
>> > Thanks for the positive spirit!
>> >
>> > Regarding FilenameUtils.getName() - I believe that its functionality
>> > can be replaced by Path.getFileName() - and in a platform-aware manner,
>> > as each JVM distribution comes with a specific provider implementation
>> > for the OS it's for.
>> >
>> > -----Original Message-----
>> > From: Allison, Timothy B. [mailto:[email protected]]
>> > Sent: Monday, September 21, 2015 14:27
>> > To: [email protected]
>> > Subject: RE: [DISCUSS] Release Tika 1.11?
>> >
>> > +1, it would be great to move a bit more into EOL'd Java 7 asap.
>> >
>> > I'll take TIKA-1734 by tomorrow EDT.
>> >
>> > As for the other 2, I'm personally ok waiting for 1.12, but I defer to
>> > the dev community.
>> >
>> > Chris, Nick, Ray, Ken, Konstantin, if you have a chance to chime in on
>> > TIKA-1726, that might help move things forward.
>> >
>> > On TIKA-1706, I share Nick's and Jukka's caution, and I also share
>> > Yaniv's point about duplication of code, bloat within Tika and missing
>> > out on updates.   Aside from one small bit of code I'd like to keep or
>> > perhaps try to move into commons-io (?)[1], I think I'm now +1 to going
>> > forward with TIKA-1706 in core...unless there is a -1 from the
>> > community.
>> >
>> > Best,
>> >
>> >              Tim
>> >
>> >
>> > [1] I added some customizations for old Mac OS behavior (treat ":" as
>> > file separator) in FilenameUtils.getName() that I don't want to lose.
>> >
>> >
>> > -----Original Message-----
>> > From: Yaniv Kunda [mailto:[email protected]]
>> > Sent: Sunday, September 20, 2015 7:15 AM
>> > To: [email protected]
>> > Subject: RE: [DISCUSS] Release Tika 1.11?
>> >
>> > I would really like to push the following:
>> >
>> > https://issues.apache.org/jira/browse/TIKA-1706 - Bring back commons-io
>> > to tika-core
>> > This requires a decision to re-include commons-io as a dependency of
>> > tika-core.
>> > All the pros and cons have been already debated, but no decision has
>> > been made.
>> >
>> > https://issues.apache.org/jira/browse/TIKA-1726 - Augment public methods
>> > that use a java.io.File with methods that use a java.nio.file.Path
>> > Since this adds new methods to the public API, I requested the group to
>> > make a decision about the new names - but have not received anything
>> > definite.
>> > However, I did create a subtask -
>> > https://issues.apache.org/jira/browse/TIKA-1734 Use java.nio.file.Path
>> > in TemporaryResources - using [~tallison]'s suggestion, which has not
>> > been committed yet.
>> >
>> > If decisions are made on the above issues, I can quickly create patches
>> > for them.
>> >
>> > -----Original Message-----
>> > From: Mattmann, Chris A (3980) [mailto:[email protected]]
>> > Sent: Saturday, September 19, 2015 08:10
>> > To: [email protected]
>> > Subject: [DISCUSS] Release Tika 1.11?
>> >
>> > Hey Guys and Gals,
>> >
>> > I’d like to roll a 1.11 release. There is TIKA-1716 which in particular
>> > allows some neat functionality in tika-python:
>> > https://github.com/chrismattmann/tika-python/pull/67
>> >
>> >
>> > Anything else to try and get into the release?
>> >
>> > If not, I’ll produce an RC #1 by end of weekend.
>> >
>> > Cheers,
>> > Chris
>> >
>> > --
>> >
>> >
>> > This email communication (including any attachments) contains
>> > information from Answers Corporation or its affiliates that is
>> > confidential and may be privileged. The information contained herein is
>> > intended only for the use of the addressee(s) named above. If you are
>> > not the intended recipient (or the agent responsible to deliver it to
>> > the intended recipient), you are hereby notified that any
>> > dissemination, distribution, use, or copying of this communication is
>> > strictly prohibited. If you have received this email in error, please
>> > immediately reply to sender, delete the message and destroy all copies
>> > of it. If you have questions, please email [email protected].
>> >
>> > If you wish to unsubscribe to commercial emails from Answers and its
>> > affiliates, please go to the Answers Subscription Center
>> > http://campaigns.answers.com/subscriptions to opt out.  Thank you.
>> >
>> --
>> Best regards,
>> Konstantin Gribov
>>
>
