[GitHub] [tika] Raahul3010 commented on pull request #1341: [TIKA-4130] Conflict with duplicate org/w3c and org/xml packages in tika-app jar
Raahul3010 commented on PR #1341:
URL: https://github.com/apache/tika/pull/1341#issuecomment-1724869055

Hi team,
An update on this would be helpful.
Thanks in advance. #1341

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (TIKA-4129) Upgrade dependencies requiring > Java 8
[ https://issues.apache.org/jira/browse/TIKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766641#comment-17766641 ]

Hudson commented on TIKA-4129:
--

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1253 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1253/])
TIKA-4129: update h2, plexus (tilman: [https://github.com/apache/tika/commit/162f0cbbcae1c9b72a91eb0c79505e726367ce7e])
* (edit) tika-parent/pom.xml

> Upgrade dependencies requiring > Java 8
> ---
>
> Key: TIKA-4129
> URL: https://issues.apache.org/jira/browse/TIKA-4129
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Fix For: 3.0.0-BETA
>
>
> On TIKA-3735, we documented several dependencies that required > Java 8. Now that we're working with Java 11 on main, let's make those upgrades.
> There was already a separate ticket open for Lucene (TIKA-3641), so let's upgrade these:
> * Apache OpenNLP 2.0.0 requires Java 11.
> * DL4J 1.0.0-M2.1 - datavec-data-image-1.0.0-M2.1.jar requires Java 11
> * Fakeload
> * [checkstyle|https://mail.google.com/mail/u/0/#label/lists%2Ftika/WhctKKXXHvjnJRRdBSwLbKkDkXQtRnWGDhblVMQQZhjsDGrFpRMRQJJrZSdskrNCqcmTtjL]
> * errorprone requires Java 11 for the build (doesn't mean we can't target 8)

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [tika] solomax commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
solomax commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1724783427

I'm using properties like this: `2.0.9`

To get the list of updated dependencies, I use:
`mvn versions:display-property-updates -DincludeParent`
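For readers unfamiliar with the pattern described above, the following is a minimal sketch of a version held in a POM property and referenced from dependencyManagement. The comment shows only the value `2.0.9`, so the property name `example-lib.version` and the `com.example` coordinates are hypothetical placeholders, not taken from the Tika build.

```xml
<project>
  <!-- Hypothetical fragment: the property name and the com.example
       coordinates are placeholders for illustration only. -->
  <properties>
    <example-lib.version>2.0.9</example-lib.version>
  </properties>

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>com.example</groupId>
        <artifactId>example-lib</artifactId>
        <!-- The version is resolved from the property above. -->
        <version>${example-lib.version}</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
```

With versions kept in properties like this, the `display-property-updates` goal of the Versions Maven Plugin can report which properties have newer values available.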
[jira] [Commented] (TIKA-1351) Parser implementations should accept null content handlers
[ https://issues.apache.org/jira/browse/TIKA-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766606#comment-17766606 ]

Alexey Pismenskiy commented on TIKA-1351:
-

Any update on this ticket? A previous comment mentions that there is a dummy ContentHandler. What is the name of this class? In some cases, though, it would be nice to extract ONLY the metadata and not waste resources parsing the content.

> Parser implementations should accept null content handlers
> --
>
> Key: TIKA-1351
> URL: https://issues.apache.org/jira/browse/TIKA-1351
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Reporter: Sergey Beryozkin
> Priority: Minor
>
> Applications which want to let users search documents based only on their metadata do not need to get the content parsed.
> The only workaround I've found so far is to pass a no-op content handler which can ignore the content events, but it does not stop a parser such as PDFParser from parsing the content.
> Proposal: update the parser API docs to let implementers know ContentHandler can be null, and update the shipped implementations to parse the metadata only if ContentHandler is null
[jira] [Comment Edited] (TIKA-2245) Standardise logging
[ https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596 ]

Luís Filipe Nassif edited comment on TIKA-2245 at 9/18/23 10:23 PM:

Hi [~tallison], sorry for my long delay, I didn't see your question... Yes, we need to replace commons-logging with jcl-over-slf4j in the tika-parser-microsoft-module pom.xml. It is still there on the main branch.

was (Author: lfcnassif):
Hi [~tallison], sorry for my long delay, I didn't see your question... Yes, we need to replace commons-logging with jcl-over-slf4j in the tika-parser-microsoft-module pom.xml (and add an exclusion for commons-logging from jackcess). It is still there on the main branch.

> Standardise logging
> ---
>
> Key: TIKA-2245
> URL: https://issues.apache.org/jira/browse/TIKA-2245
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 1.14, 1.15
> Reporter: Matthew Caruana Galizia
> Assignee: Konstantin Gribov
> Priority: Major
> Labels: logging
> Fix For: 1.15
>
>
> Tika parsers sometimes use Log4j's Logger, sometimes the JUL (java.util.logging) Logger and sometimes SLF4j.
> It would be better to standardise on a single facade, for the sake of not having to configure multiple loggers.
[jira] [Comment Edited] (TIKA-2245) Standardise logging
[ https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596 ]

Luís Filipe Nassif edited comment on TIKA-2245 at 9/18/23 10:23 PM:

Hi [~tallison], sorry for my long delay, I didn't see your question... Yes, we need to replace commons-logging with jcl-over-slf4j in the tika-parser-microsoft-module pom.xml (and add an exclusion for commons-logging from jackcess). It is still there on the main branch.

was (Author: lfcnassif):
Hi [~tallison], sorry for my long delay, I didn't see your question... Yes, we need to replace commons-logging with jcl-over-slf4j in the tika-parser-microsoft-module pom.xml. It is still there on the main branch.
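As an illustration only, the change described in this comment might look roughly like the fragment below in a module's pom.xml. It is a sketch under assumptions: it presumes dependency versions are managed in a parent POM, and it has not been checked against the actual tika-parser-microsoft-module build.

```xml
<dependencies>
  <!-- Routes Commons Logging API calls to SLF4J in place of the
       commons-logging artifact. Version assumed to be managed by the parent. -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jcl-over-slf4j</artifactId>
  </dependency>

  <!-- Keeps jackcess from pulling commons-logging back in transitively. -->
  <dependency>
    <groupId>com.healthmarketscience.jackcess</groupId>
    <artifactId>jackcess</artifactId>
    <exclusions>
      <exclusion>
        <groupId>commons-logging</groupId>
        <artifactId>commons-logging</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

Running `mvn dependency:tree` on the module is one way to confirm whether commons-logging still arrives through another transitive path.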
[jira] [Commented] (TIKA-2245) Standardise logging
[ https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766598#comment-17766598 ]

Luís Filipe Nassif commented on TIKA-2245:
--

Wait, please see [~grossws]'s answer above for a better one.
[jira] [Comment Edited] (TIKA-2245) Standardise logging
[ https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596 ]

Luís Filipe Nassif edited comment on TIKA-2245 at 9/18/23 10:17 PM:

Hi [~tallison], sorry for my long delay, I didn't see your question... Yes, we need to replace commons-logging with jcl-over-slf4j in the tika-parser-microsoft-module pom.xml. It is still there on the main branch.

was (Author: lfcnassif):
Hi [~tallison], sorry for my long delay, I didn't see your question... Yes, we need to replace commons-logging with jcl-over-slf4j in the tika-parser-microsoft-module pom.xml
[jira] [Commented] (TIKA-2245) Standardise logging
[ https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596 ]

Luís Filipe Nassif commented on TIKA-2245:
--

Hi [~tallison], sorry for my long delay, I didn't see your question... Yes, we need to replace commons-logging with jcl-over-slf4j in the tika-parser-microsoft-module pom.xml
[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
[ https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766560#comment-17766560 ]

Tim Allison commented on TIKA-4134:
--

I think that'd be great.

As I think about this more...unless there are objections, we can add *.tar.gz along with our current fat jars later. We don't need to swap the fat jars for *.tar.gz at a major version change. We could stop building fat jars at say 4.x, but we could have parallel release artifacts in 3.x. WDYT?

> Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
> --
>
> Key: TIKA-4134
> URL: https://issues.apache.org/jira/browse/TIKA-4134
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
>
> On https://github.com/apache/tika/pull/1345#issuecomment-1723321327, [~desruisseaux] pointed out that uber jars might not be the best idea with jpms in the future.
> I'm opening this issue to discuss if we want to change the packaging structure in 3.x.
> If we wanted to "go small" we can keep things as they are in 3.x and warn users that we might move away from an uber jar in 4.x.
> If we wanted to "go big", we could use the maven dependency plugin to create a "lib/" directory with all of the dependencies and then have a small tika-app.jar that includes those dependencies in its classpath.
> Are there other, better ways that we should think about packaging tika-app and tika-server in 3.x and beyond?
[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
[ https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766392#comment-17766392 ]

Thorsten Heit commented on TIKA-4134:
-

I don't know of any projects offhand that use them. I have used them several times internally at my company to create a tar.gz file containing everything I needed:
- the ./bin directory holding the startup shell script (created by the appassembler plugin) and a few configuration files
- a lib folder with the project jar and all its dependencies
- a folder for the program's log files

No big magic. If you like, I can try to create a PR for that.
[GitHub] [tika] theit commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
theit commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723437367

> Separately, we're doing a better job of moving most dependencies into dependencyManagement and using boms. Should we get rid of the version numbers in the elements and just control the versions via the boms where possible?

+1

IMHO it would also be nice to get rid of properties whose only purpose is to define a version that is then later used only in the dependency management section. Example: `jackson.version`

Another example is `log4j2.version`, defined in `tika-parent/pom.xml`. It's used in a couple of other pom.xml files, but unnecessary there because the corresponding dependencies are already referenced/defined via the import BOM.
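The point about BOM-managed versions can be sketched as follows. This is an illustrative pair of fragments, not the actual Tika POMs: the first assumes a parent POM that imports the Log4j 2 BOM, the second shows a child module declaring a Log4j artifact with no version of its own. Whether `log4j2.version` is still needed for the import itself is left open here.

```xml
<!-- Fragment 1 (parent POM, assumed layout): import the Log4j 2 BOM once. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-bom</artifactId>
      <version>${log4j2.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Fragment 2 (child module): no version element and no property reference,
     because the imported BOM already pins the version. -->
<dependencies>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
  </dependency>
</dependencies>
```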
[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
[ https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766380#comment-17766380 ]

Tim Allison commented on TIKA-4134:
--

Nice! I wasn't aware of: Application Assembler Maven Plugin...

Any model projects you'd recommend that we can, erm, borrow from?

Thank you!
[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
[ https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766379#comment-17766379 ]

Thorsten Heit commented on TIKA-4134:
-

If you consider tika-app and tika-server to be standalone tools/programs executed from the command line, what about using a combination of m-assembly-p and appassembler-maven-plugin to create a distribution containing everything that's needed?
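A rough sketch of what the appassembler half of that suggestion could look like is shown below. It is not a tested configuration for Tika: the program id `tika` is a hypothetical choice, the main class is assumed to be tika-app's CLI entry point, and the plugin version is deliberately left to be pinned by the build.

```xml
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>appassembler-maven-plugin</artifactId>
  <!-- Version omitted in this sketch; pin it in pluginManagement. -->
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>assemble</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- Put all jars side by side in a lib/ directory. -->
    <repositoryLayout>flat</repositoryLayout>
    <repositoryName>lib</repositoryName>
    <programs>
      <program>
        <!-- Assumed entry point and a hypothetical script name. -->
        <mainClass>org.apache.tika.cli.TikaCLI</mainClass>
        <id>tika</id>
      </program>
    </programs>
  </configuration>
</plugin>
```

The plugin writes a `bin/` directory with start scripts next to the jar directory; the maven-assembly-plugin could then package that output as a tar.gz.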
[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723398386

Separately, we're doing a better job of moving most dependencies into dependencyManagement and using boms. Should we get rid of the version numbers in the elements and just control the versions via the boms where possible?
[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723395686

> 4. Running `mvn verify` fails in Tika server core:
>
> ```
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.054 s <<< FAILURE! -- in org.apache.tika.server.core.StackTraceTest
> [ERROR] org.apache.tika.server.core.StackTraceTest.testEmptyParser -- Time elapsed: 0.006 s <<< FAILURE!
> org.opentest4j.AssertionFailedError: bad type: /tika ==> expected: <200> but was: <500>
>     at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> (...)
> ```

No idea. Let me see if I can reproduce this locally and figure out what's going on. Thank you.
[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723394374

> I was working locally for myself on the migration of Tika from javax to jakarta before I found this PR, and now have a few comments/hints, especially WRT `tika-parent/pom.xml`:
>
> 1. The different CXF dependencies could be replaced by the import BOM.
>
> 2. The same mechanism could be used for SLF4J, which also offers a BOM; no need to manage three dependencies.
>
> 3. In my tests I had to add an exclusion for `xml-apis:xml-apis` in the dependency management section for `xerces:xercesImpl`. Otherwise I get tons of errors in my IDE (Eclipse) because the package `org.xml.sax` is accessible from more than one module: ``, `java.xml`. Haven't you seen them too?
>
> 4. Running `mvn verify` fails in Tika server core:
>
> ```
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.054 s <<< FAILURE! -- in org.apache.tika.server.core.StackTraceTest
> [ERROR] org.apache.tika.server.core.StackTraceTest.testEmptyParser -- Time elapsed: 0.006 s <<< FAILURE!
> org.opentest4j.AssertionFailedError: bad type: /tika ==> expected: <200> but was: <500>
>     at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> (...)
> ```

Thank you! Adding the above shortly.
[jira] [Created] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
Tim Allison created TIKA-4134:
-

Summary: Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
Key: TIKA-4134
URL: https://issues.apache.org/jira/browse/TIKA-4134
Project: Tika
Issue Type: Task
Reporter: Tim Allison

On https://github.com/apache/tika/pull/1345#issuecomment-1723321327, [~desruisseaux] pointed out that uber jars might not be the best idea with jpms in the future.

I'm opening this issue to discuss if we want to change the packaging structure in 3.x.

If we wanted to "go small" we can keep things as they are in 3.x and warn users that we might move away from an uber jar in 4.x.

If we wanted to "go big", we could use the maven dependency plugin to create a "lib/" directory with all of the dependencies and then have a small tika-app.jar that includes those dependencies in its classpath.

Are there other, better ways that we should think about packaging tika-app and tika-server in 3.x and beyond?
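The "go big" option in the issue above could be sketched roughly as below. This is a minimal illustration, not a proposed Tika configuration: the execution id `copy-libs` is hypothetical, and the main class is assumed to be tika-app's CLI entry point.

```xml
<build>
  <plugins>
    <!-- Copy all runtime dependencies into target/lib. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <id>copy-libs</id>
          <phase>package</phase>
          <goals>
            <goal>copy-dependencies</goal>
          </goals>
          <configuration>
            <outputDirectory>${project.build.directory}/lib</outputDirectory>
            <includeScope>runtime</includeScope>
          </configuration>
        </execution>
      </executions>
    </plugin>

    <!-- Build a small jar whose manifest Class-Path points at lib/. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <addClasspath>true</addClasspath>
            <classpathPrefix>lib/</classpathPrefix>
            <!-- Assumed entry point for illustration. -->
            <mainClass>org.apache.tika.cli.TikaCLI</mainClass>
          </manifest>
        </archive>
      </configuration>
    </plugin>
  </plugins>
</build>
```

With this layout the small jar only runs via `java -jar` when the `lib/` directory sits next to it, which is the trade-off against a single uber jar.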
[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723300367

> Note that uber JAR files created by the Maven Shade Plugin may not work anymore in a JPMS world. Currently, the plugin seems to work around this by removing all `module-info.class` files. It works if the dependencies duplicate all their service declarations into `META-INF/services/` for compatibility with the old world. Apache SIS and Derby, for instance, do that. But any library could decide to stop doing this duplication in some future version, because it makes it more difficult to use some features that are unique to `module-info`. It may be safer to progressively move away from uber JARs if possible.

Do you have recommendations for first steps in this direction? We have some flexibility with 3.x. How do we start?
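For context on the mechanism discussed in the quoted comment, the fragment below sketches a Maven Shade Plugin setup that merges `META-INF/services/` files and drops `module-info.class`. It is a generic illustration and makes no claim about how Tika's own shade configuration is written.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Concatenates service declarations from all dependencies so
               classpath-based ServiceLoader lookups still find every provider. -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <!-- Module descriptors from individual jars are not valid
                   for the merged jar, so they are left out. -->
              <exclude>module-info.class</exclude>
            </excludes>
          </filter>
        </filters>
      </configuration>
    </execution>
  </executions>
</plugin>
```

A provider that is declared only in a dependency's `module-info` has no `META-INF/services/` entry to merge, which is the failure mode the quoted comment warns about.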
[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766256#comment-17766256 ]

ASF GitHub Bot commented on TIKA-3948:
--

theit commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1722932587

I was working locally for myself on the migration of Tika from javax to jakarta before I found this PR, and now have a few comments/hints, especially WRT `tika-parent/pom.xml`:

1. The different CXF dependencies could be replaced by the import BOM.
2. The same mechanism could be used for SLF4J, which also offers a BOM; no need to manage three dependencies.
3. In my tests I had to add an exclusion for `xml-apis:xml-apis` in the dependency management section for `xerces:xercesImpl`. Otherwise I get tons of errors in my IDE (Eclipse) because the package `org.xml.sax` is accessible from more than one module: ``, `java.xml`. Haven't you seen them too?
4. Running `mvn verify` fails in Tika server core:

```
[ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.054 s <<< FAILURE!
```

> Migrate to jakarta in Tika 3.x
> --
>
> Key: TIKA-3948
> URL: https://issues.apache.org/jira/browse/TIKA-3948
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Labels: tika-3x
>
[GitHub] [tika] theit commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
theit commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1722932587

I was working locally for myself on the migration of Tika from javax to jakarta before I found this PR, and now have a few comments/hints, especially WRT `tika-parent/pom.xml`:

1. The different CXF dependencies could be replaced by the import BOM.
2. The same mechanism could be used for SLF4J, which also offers a BOM; no need to manage three dependencies.
3. In my tests I had to add an exclusion for `xml-apis:xml-apis` in the dependency management section for `xerces:xercesImpl`. Otherwise I get tons of errors in my IDE (Eclipse) because the package `org.xml.sax` is accessible from more than one module: ``, `java.xml`. Haven't you seen them too?
4. Running `mvn verify` fails in Tika server core:

```
[ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.054 s <<< FAILURE! -- in org.apache.tika.server.core.StackTraceTest
[ERROR] org.apache.tika.server.core.StackTraceTest.testEmptyParser -- Time elapsed: 0.006 s <<< FAILURE!
org.opentest4j.AssertionFailedError: bad type: /tika ==> expected: <200> but was: <500>
    at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
(...)
```

However, when I execute the (single) test in Eclipse, it works. It also works when I let Eclipse run all JUnit tests in that module. Do you have an idea what is causing this?
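Points 2 and 3 of the comment above could be sketched as the following dependencyManagement entries. This is an illustration under assumptions: the `slf4j.version` and `xerces.version` properties are assumed to be defined elsewhere in the POM, and the SLF4J BOM is only published for recent SLF4J 2.0.x releases.

```xml
<dependencyManagement>
  <dependencies>
    <!-- Point 2: import the SLF4J BOM so individual SLF4J artifacts
         no longer need their own managed entries. -->
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-bom</artifactId>
      <version>${slf4j.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>

    <!-- Point 3: manage xercesImpl without its xml-apis dependency, whose
         org.xml.sax classes overlap with the JDK's java.xml module. -->
    <dependency>
      <groupId>xerces</groupId>
      <artifactId>xercesImpl</artifactId>
      <version>${xerces.version}</version>
      <exclusions>
        <exclusion>
          <groupId>xml-apis</groupId>
          <artifactId>xml-apis</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</dependencyManagement>
```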
[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766246#comment-17766246 ]

ASF GitHub Bot commented on TIKA-3948:
--

theit commented on code in PR #1345:
URL: https://github.com/apache/tika/pull/1345#discussion_r1328345983

## tika-parent/pom.xml:
##
@@ -730,6 +723,16 @@
         <artifactId>commons-math3</artifactId>
         <version>${commons.math3.version}</version>
       </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-rs-client</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-frontend-jaxrs</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>

Review Comment:
This could be simplified:

```xml
<dependency>
  <groupId>org.apache.cxf</groupId>
  <artifactId>cxf-bom</artifactId>
  <version>${cxf.version}</version>
  <type>pom</type>
  <scope>import</scope>
</dependency>
```
[GitHub] [tika] theit commented on a diff in pull request #1345: TIKA-3948 -- migrate from javax -> jakarta
theit commented on code in PR #1345:
URL: https://github.com/apache/tika/pull/1345#discussion_r1328345983

## tika-parent/pom.xml:
##
@@ -730,6 +723,16 @@
         <artifactId>commons-math3</artifactId>
         <version>${commons.math3.version}</version>
       </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-rs-client</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-frontend-jaxrs</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>

Review Comment:
This could be simplified:

```xml
<dependency>
  <groupId>org.apache.cxf</groupId>
  <artifactId>cxf-bom</artifactId>
  <version>${cxf.version}</version>
  <type>pom</type>
  <scope>import</scope>
</dependency>
```