Re: [DISCUSS] upgrading log4j to log4j2 in Tika's 1.x branch

2021-12-15 Thread Luís Filipe Nassif
Great, thank you, Tim! On Wed, Dec 15, 2021 at 16:50, Tim Allison wrote: > I've merged Lewis's edits to the README and added the EOL. Let's do > what both Konstantin and Nick recommend: README, notifications to > user/dev lists x months out and include EOL in all release messages? >

[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

2021-12-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460262#comment-17460262 ] Hudson commented on TIKA-3446: -- UNSTABLE: Integrated in Jenkins build Tika » tika-branch1x-jdk8 #153 (See

[jira] [Commented] (TIKA-3619) Augment README with build prerequisites

2021-12-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460229#comment-17460229 ] Hudson commented on TIKA-3619: -- FAILURE: Integrated in Jenkins build Tika » tika-main-jdk8 #390 (See

[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

2021-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460222#comment-17460222 ] ASF GitHub Bot commented on TIKA-3446: -- nddipiazza merged pull request #465: URL:

[GitHub] [tika] nddipiazza merged pull request #465: TIKA-3446 - 1.x port of TIKA-3446 - Support for parsing OneNote when Alternative Encoding Using the File Synchronization via SOAP Over HTTP Protoco

2021-12-15 Thread GitBox
nddipiazza merged pull request #465: URL: https://github.com/apache/tika/pull/465 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

Re: [DISCUSS] upgrading log4j to log4j2 in Tika's 1.x branch

2021-12-15 Thread Tim Allison
I've merged Lewis's edits to the README and added the EOL. Let's do what both Konstantin and Nick recommend: README, notifications to user/dev lists x months out and include EOL in all release messages? Please let me know/edit the README if there are other improvements we should make. Thank

[jira] [Commented] (TIKA-3619) Augment README with build prerequisites

2021-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460210#comment-17460210 ] ASF GitHub Bot commented on TIKA-3619: -- tballison merged pull request #464: URL:

[GitHub] [tika] tballison merged pull request #464: TIKA-3619 Augment README with build prerequisites

2021-12-15 Thread GitBox
tballison merged pull request #464: URL: https://github.com/apache/tika/pull/464

[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

2021-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460209#comment-17460209 ] ASF GitHub Bot commented on TIKA-3446: -- tballison commented on pull request #465: URL:

[GitHub] [tika] tballison commented on pull request #465: TIKA-3446 - 1.x port of TIKA-3446 - Support for parsing OneNote when Alternative Encoding Using the File Synchronization via SOAP Over HTTP Pr

2021-12-15 Thread GitBox
tballison commented on pull request #465: URL: https://github.com/apache/tika/pull/465#issuecomment-995141124 Looks good to me. Y, go for it.

[jira] [Comment Edited] (TIKA-491) Add language identification support for Norwegian Bokmål and Norwegian Nynorsk

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460203#comment-17460203 ] Tim Allison edited comment on TIKA-491 at 12/15/21, 7:31 PM: - This module in

[jira] [Commented] (TIKA-491) Add language identification support for Norwegian Bokmål and Norwegian Nynorsk

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460203#comment-17460203 ] Tim Allison commented on TIKA-491: -- This module in Tika 2.x does distinguish between those two languages:

[jira] [Commented] (TIKA-3616) Upgrade log4j2

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460180#comment-17460180 ] Tim Allison commented on TIKA-3616: --- Thank you for looking at this carefully, [~grossws]! > Upgrade

[jira] [Commented] (TIKA-3616) Upgrade log4j2

2021-12-15 Thread Konstantin Gribov (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460168#comment-17460168 ] Konstantin Gribov commented on TIKA-3616: - I looked a bit at how Tika and its upstream dependencies

Re: [DISCUSS] upgrading log4j to log4j2 in Tika's 1.x branch

2021-12-15 Thread Konstantin Gribov
My +1 to EOL on September 30, 2022 with effective backport submission freeze 3 months before that. I think it would be better if we mention the EOL timeline at least in 3 places: in each release announcement, in README and on the site (on the main page or in release news articles). Different

[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

2021-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460152#comment-17460152 ] ASF GitHub Bot commented on TIKA-3446: -- nddipiazza commented on pull request #465: URL:

[GitHub] [tika] nddipiazza commented on pull request #465: TIKA-3446 - 1.x port of TIKA-3446 - Support for parsing OneNote when Alternative Encoding Using the File Synchronization via SOAP Over HTTP P

2021-12-15 Thread GitBox
nddipiazza commented on pull request #465: URL: https://github.com/apache/tika/pull/465#issuecomment-995047236 OK @tballison got a moment for a quick review on this one? Is this OK to add into the upcoming 1.x release?

[jira] [Commented] (TIKA-3616) Upgrade log4j2

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460150#comment-17460150 ] Tim Allison commented on TIKA-3616: --- 2.15's vulnerability seemed to require extra complexity

Re: [DISCUSS] upgrading log4j to log4j2 in Tika's 1.x branch

2021-12-15 Thread Nick Burch
On Wed, 15 Dec 2021, Tim Allison wrote: Sounds good, Nick. Unless there are objections, I'll add an EOL September 30, 2022 for the 1.x branch on our github README and maybe our site somewhere? Maybe just mention it in the news section at the end any 1.x fix releases? Nick

[jira] [Commented] (TIKA-3616) Upgrade log4j2

2021-12-15 Thread Dan Switzer (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460140#comment-17460140 ] Dan Switzer commented on TIKA-3616: --- Is Tika being upgraded to Log4j v2.16, since 2.15 still has a

Re: [DISCUSS] upgrading log4j to log4j2 in Tika's 1.x branch

2021-12-15 Thread Tim Allison
Sounds good, Nick. Unless there are objections, I'll add an EOL September 30, 2022 for the 1.x branch on our github README and maybe our site somewhere? >I'm not keen on adding new features to 1.x, as that'll only encourage people to stick on the old one, but I wouldn't go as far as -1'ing other

[jira] [Created] (TIKA-3620) Language detection documentation needs attention

2021-12-15 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created TIKA-3620: -- Summary: Language detection documentation needs attention Key: TIKA-3620 URL: https://issues.apache.org/jira/browse/TIKA-3620 Project: Tika

[jira] [Assigned] (TIKA-3620) Language detection documentation needs attention

2021-12-15 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned TIKA-3620: -- Assignee: Lewis John McGibbney > Language detection documentation needs

[jira] [Resolved] (TIKA-3241) Clarify parser module structure in 2.0.0

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3241. --- Fix Version/s: 2.0.0-ALPHA Resolution: Fixed Y, thanks [~lewismc] > Clarify parser module

[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

2021-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460068#comment-17460068 ] ASF GitHub Bot commented on TIKA-3446: -- tballison commented on pull request #465: URL:

[GitHub] [tika] tballison commented on pull request #465: TIKA-3446 - 1.x port of TIKA-3446 - Support for parsing OneNote when Alternative Encoding Using the File Synchronization via SOAP Over HTTP Pr

2021-12-15 Thread GitBox
tballison commented on pull request #465: URL: https://github.com/apache/tika/pull/465#issuecomment-994938829 @nddipiazza I think the goal is to keep tika-core as slim as possible, but you can put whatever you need in tika-parsers, as long as we don't have any conflicts.

[jira] [Commented] (TIKA-3241) Clarify parser module structure in 2.0.0

2021-12-15 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460066#comment-17460066 ] Lewis John McGibbney commented on TIKA-3241: Hi [~tallison] can this ticket be closed? >

[jira] [Commented] (TIKA-3229) mvn clean install failure - tika-1.24 on windows

2021-12-15 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460065#comment-17460065 ] Lewis John McGibbney commented on TIKA-3229: [~Simmo] are you able to reproduce this/

[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

2021-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460064#comment-17460064 ] ASF GitHub Bot commented on TIKA-3446: -- nddipiazza commented on pull request #465: URL:

[jira] [Resolved] (TIKA-491) Add language identification support for Norwegian Bokmål and Norwegian Nynorsk

2021-12-15 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-491. --- Resolution: Won't Fix Cleanup [~kkrugler] [~pandermusubi] > Add language

[GitHub] [tika] nddipiazza commented on pull request #465: TIKA-3446 - 1.x port of TIKA-3446 - Support for parsing OneNote when Alternative Encoding Using the File Synchronization via SOAP Over HTTP P

2021-12-15 Thread GitBox
nddipiazza commented on pull request #465: URL: https://github.com/apache/tika/pull/465#issuecomment-994933278 @tballison am I OK to use stringutils3 in the 1.x branch?

[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

2021-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17460062#comment-17460062 ] ASF GitHub Bot commented on TIKA-3446: -- nddipiazza opened a new pull request #465: URL:

[GitHub] [tika] nddipiazza opened a new pull request #465: TIKA-3446 - 1.x port of TIKA-3446 - Support for parsing OneNote when Alternative Encoding Using the File Synchronization via SOAP Over HTTP P

2021-12-15 Thread GitBox
nddipiazza opened a new pull request #465: URL: https://github.com/apache/tika/pull/465 Porting https://github.com/apache/tika/pull/461 to Tika 1.x

[jira] [Resolved] (TIKA-369) Improve accuracy of language detection

2021-12-15 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved TIKA-369. --- Resolution: Fixed Cleaning this one up [~kkrugler] > Improve accuracy of language

Re: [DISCUSS] upgrading log4j to log4j2 in Tika's 1.x branch

2021-12-15 Thread Nick Burch
On Wed, 15 Dec 2021, Tim Allison wrote: I think we should keep the 1.x branch open for security upgrades for a bit...middle of next year? I have _not_ been adding new features or even some bug fixes to 1.x, and I encourage people to migrate to 2.x. We've seen quite a few queries from people

[jira] [Commented] (TIKA-3610) Emit errors to a specific emitter

2021-12-15 Thread David Pilato (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1746#comment-1746 ] David Pilato commented on TIKA-3610: That's very good. So I believe we are all set and we can close

[jira] [Comment Edited] (TIKA-3610) Emit errors to a specific emitter

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459998#comment-17459998 ] Tim Allison edited comment on TIKA-3610 at 12/15/21, 2:37 PM: -- Hi

[jira] [Commented] (TIKA-3610) Emit errors to a specific emitter

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459998#comment-17459998 ] Tim Allison commented on TIKA-3610: --- Hi [~dadoonet], at the meetup, I forgot that I already added a

[jira] [Commented] (TIKA-3618) Upgrade to log4j2 in 1.x branch

2021-12-15 Thread Subhajit Das (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459931#comment-17459931 ] Subhajit Das commented on TIKA-3618: Yes, I will check and push. > Upgrade to log4j2 in 1.x branch >

[jira] [Commented] (TIKA-3618) Upgrade to log4j2 in 1.x branch

2021-12-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459917#comment-17459917 ] Tim Allison commented on TIKA-3618: --- [~subhajitdas298] once you’ve had a chance to review branch_1x, we

Re: [DISCUSS] upgrading log4j to log4j2 in Tika's 1.x branch

2021-12-15 Thread Tim Allison
It didn't take too long, and as long as the original author of the metrics stuff in tika-server isn't too concerned about breaking changes, let's hope for the best. Log4j 1.x is so far beyond its EOL, it is embarrassing. I think we should keep the 1.x branch open for security upgrades for a

Re: [VOTE] Release Apache Tika 2.2.0 Candidate #1

2021-12-15 Thread Tamás Cservenák
Howdy, There were some Maven Central issues in the past few days, hopefully fixed. https://status.maven.org/#week Thanks Tamas On Mon, Dec 13, 2021 at 11:18 PM Lewis John McGibbney wrote: > I performed another build of the tika-2.2.0-src.zip artifact which failed. > I've captured the failure

[jira] [Updated] (TIKA-3615) Missing class file while upgrade to TIka 2.1.0

2021-12-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3615: -- Issue Type: Bug (was: Test) > Missing class file while upgrade to TIka 2.1.0 >

[jira] [Closed] (TIKA-3615) Missing class file while upgrade to TIka 2.1.0

2021-12-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed TIKA-3615. - Resolution: Not A Bug > Missing class file while upgrade to TIka 2.1.0 >

[jira] [Reopened] (TIKA-3615) Missing class file while upgrade to TIka 2.1.0

2021-12-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr reopened TIKA-3615: --- > Missing class file while upgrade to TIka 2.1.0 > --

[jira] [Closed] (TIKA-3614) Trying to upgrade from 1.27 to 2.1.0

2021-12-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed TIKA-3614. - Resolution: Not A Bug I'm closing these as "not a bug" because we don't want these to appear in a

[jira] [Reopened] (TIKA-3614) Trying to upgrade from 1.27 to 2.1.0

2021-12-15 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr reopened TIKA-3614: --- > Trying to upgrade from 1.27 to 2.1.0 > > >