[GitHub] [tika] Raahul3010 commented on pull request #1341: [TIKA-4130] Conflict with duplicate org/w3c and org/xml packages in tika-app jar

2023-09-18 Thread via GitHub


Raahul3010 commented on PR #1341:
URL: https://github.com/apache/tika/pull/1341#issuecomment-1724869055

   Hi team,
   An update on this would be helpful
   Thanks in advance
   #1341 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (TIKA-4129) Upgrade dependencies requiring > Java 8

2023-09-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766641#comment-17766641
 ] 

Hudson commented on TIKA-4129:
--

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1253 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1253/])
TIKA-4129: update h2, plexus (tilman: 
[https://github.com/apache/tika/commit/162f0cbbcae1c9b72a91eb0c79505e726367ce7e])
* (edit) tika-parent/pom.xml


> Upgrade dependencies requiring > Java 8
> ---
>
> Key: TIKA-4129
> URL: https://issues.apache.org/jira/browse/TIKA-4129
> Project: Tika
>  Issue Type: Task
>Reporter: Tim Allison
>Priority: Major
> Fix For: 3.0.0-BETA
>
>
> On TIKA-3735, we documented several dependencies that required > Java 8.  Now 
> that we're working with Java 11 on main, let's make those upgrades.
> There was already a separate ticket open for Lucene (TIKA-3641), so let's 
> upgrade these:
>  * Apache OpenNLP 2.0.0 requires Java 11.
>  * DL4J 1.0.0-M2.1 - datavec-data-image-1.0.0-M2.1.jar requires Java 11
>  * Fakeload
>  * 
> [checkstyle|https://mail.google.com/mail/u/0/#label/lists%2Ftika/WhctKKXXHvjnJRRdBSwLbKkDkXQtRnWGDhblVMQQZhjsDGrFpRMRQJJrZSdskrNCqcmTtjL]
>  * errorprone requires Java 11 for the build (doesn't mean we can't target 8)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [tika] solomax commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


solomax commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1724783427

   I'm using properties like this: `2.0.9`
   To get the list of available dependency updates, I run: 
   `mvn versions:display-property-updates -DincludeParent`
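
As a rough illustration of that workflow: a version held in a property and
referenced from dependency management is what the versions plugin can link to
an artifact and check for newer releases. The property name and the
`com.example` coordinates below are hypothetical placeholders, not taken from
the Tika POMs.

```xml
<!-- Sketch only: example.lib.version and com.example:example-lib are
     hypothetical placeholders, not names from the Tika build. -->
<project>
  <properties>
    <!-- The version is defined once, as a property... -->
    <example.lib.version>2.0.9</example.lib.version>
  </properties>

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>com.example</groupId>
        <artifactId>example-lib</artifactId>
        <!-- ...and referenced here, which ties the property to an artifact
             so that display-property-updates can report newer versions. -->
        <version>${example.lib.version}</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
```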





[jira] [Commented] (TIKA-1351) Parser implementations should accept null content handlers

2023-09-18 Thread Alexey Pismenskiy (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766606#comment-17766606
 ] 

Alexey Pismenskiy commented on TIKA-1351:
-

Any update on this ticket? 

A previous comment mentions that there is a dummy ContentHandler. What is the 
name of this class? 

Still, in some cases it would be nice to extract ONLY the metadata and not 
waste resources parsing the content. 

> Parser implementations should accept null content handlers
> --
>
> Key: TIKA-1351
> URL: https://issues.apache.org/jira/browse/TIKA-1351
> Project: Tika
>  Issue Type: Improvement
>  Components: parser
>Reporter: Sergey Beryozkin
>Priority: Minor
>
> Applications which want to let users search documents based only on their 
> metadata do not need to get the content parsed. 
> The only workaround I've found so far is to pass a no op content handler 
> which can ignore the content events but it does not stop the parser such as 
> PDFParser from parsing the content.
> Proposal: update parser API docs to let implementers know ContentHandler can 
> be null and update the shipped implementations to parse the metadata only if 
> ContentHandler is null





[jira] [Comment Edited] (TIKA-2245) Standardise logging

2023-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596
 ] 

Luís Filipe Nassif edited comment on TIKA-2245 at 9/18/23 10:23 PM:


Hi [~tallison], sorry for the long delay, I didn't see your question... Yes, we 
need to replace commons-logging with jcl-over-slf4j in the 
tika-parser-microsoft-module pom.xml. It is still there in the main branch.


was (Author: lfcnassif):
Hi [~tallison], sorry for the long delay, I didn't see your question... Yes, we 
need to replace commons-logging with jcl-over-slf4j in the 
tika-parser-microsoft-module pom.xml (and add an exclusion for commons-logging 
from jackcess). It is still there in the main branch.

> Standardise logging
> ---
>
> Key: TIKA-2245
> URL: https://issues.apache.org/jira/browse/TIKA-2245
> Project: Tika
>  Issue Type: Improvement
>  Components: parser
>Affects Versions: 1.14, 1.15
>Reporter: Matthew Caruana Galizia
>Assignee: Konstantin Gribov
>Priority: Major
>  Labels: logging
> Fix For: 1.15
>
>
> Tika parsers sometimes use Log4j's Logger, sometimes the JUL 
> (java.util.logging) Logger and sometimes SLF4j.
> It would be better to standardise on a single facade, for the sake of not 
> having to configure multiple loggers.





[jira] [Comment Edited] (TIKA-2245) Standardise logging

2023-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596
 ] 

Luís Filipe Nassif edited comment on TIKA-2245 at 9/18/23 10:23 PM:


Hi [~tallison], sorry for the long delay, I didn't see your question... Yes, we 
need to replace commons-logging with jcl-over-slf4j in the 
tika-parser-microsoft-module pom.xml (and add an exclusion for commons-logging 
from jackcess). It is still there in the main branch.
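
A minimal sketch of that change, offered as an illustration only. It assumes
the jackcess coordinates shown below and that both versions are managed in the
parent POM; neither assumption is confirmed in this thread.

```xml
<!-- Sketch for tika-parser-microsoft-module/pom.xml. Assumption: versions
     for both artifacts are inherited from dependencyManagement. -->
<dependencies>
  <dependency>
    <groupId>com.healthmarketscience.jackcess</groupId>
    <artifactId>jackcess</artifactId>
    <exclusions>
      <!-- Stop jackcess from pulling in commons-logging transitively. -->
      <exclusion>
        <groupId>commons-logging</groupId>
        <artifactId>commons-logging</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- The bridge provides the commons-logging API and routes calls to SLF4J. -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jcl-over-slf4j</artifactId>
  </dependency>
</dependencies>
```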


was (Author: lfcnassif):
Hi [~tallison], sorry for the long delay, I didn't see your question... Yes, we 
need to replace commons-logging with jcl-over-slf4j in the 
tika-parser-microsoft-module pom.xml. It is still there in the main branch.



[jira] [Commented] (TIKA-2245) Standardise logging

2023-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766598#comment-17766598
 ] 

Luís Filipe Nassif commented on TIKA-2245:
--

Wait, please see [~grossws]'s answer above for a better one.



[jira] [Comment Edited] (TIKA-2245) Standardise logging

2023-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596
 ] 

Luís Filipe Nassif edited comment on TIKA-2245 at 9/18/23 10:17 PM:


Hi [~tallison], sorry for the long delay, I didn't see your question... Yes, we 
need to replace commons-logging with jcl-over-slf4j in the 
tika-parser-microsoft-module pom.xml. It is still there in the main branch.


was (Author: lfcnassif):
Hi [~tallison], sorry for the long delay, I didn't see your question... Yes, we 
need to replace commons-logging with jcl-over-slf4j in the 
tika-parser-microsoft-module pom.xml



[jira] [Commented] (TIKA-2245) Standardise logging

2023-09-18 Thread Jira


[ 
https://issues.apache.org/jira/browse/TIKA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766596#comment-17766596
 ] 

Luís Filipe Nassif commented on TIKA-2245:
--

Hi [~tallison], sorry for the long delay, I didn't see your question... Yes, we 
need to replace commons-logging with jcl-over-slf4j in the 
tika-parser-microsoft-module pom.xml



[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?

2023-09-18 Thread Tim Allison (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766560#comment-17766560
 ] 

Tim Allison commented on TIKA-4134:
---

I think that'd be great.

As I think about this more...unless there are objections, we can add *.tar.gz 
along with our current fat jars later.  We don't need to swap the fat jars for 
*.tar.gz at a major version change. We could stop building fat jars at say 4.x, 
but we could have parallel release artifacts in 3.x.

WDYT?



> Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?
> --
>
> Key: TIKA-4134
> URL: https://issues.apache.org/jira/browse/TIKA-4134
> Project: Tika
>  Issue Type: Task
>Reporter: Tim Allison
>Priority: Major
>
> On https://github.com/apache/tika/pull/1345#issuecomment-1723321327, 
> [~desruisseaux] pointed out that uber jars might not be the best idea with 
> jpms in the future.
> I'm opening this issue to discuss if we want to change the packaging 
> structure in 3.x.
> If we wanted to "go small" we can keep things as they are in 3.x and warn 
> users that we might move away from an uber jar in 4.x.
> If we wanted to "go big", we could use the maven dependency plugin to create 
> a "lib/" directory with all of the dependencies and then have a small 
> tika-app.jar that includes those dependencies in its classpath.
> Are there other, better ways that we should think about packaging tika-app 
> and tika-server in 3.x and beyond?





[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?

2023-09-18 Thread Thorsten Heit (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766392#comment-17766392
 ] 

Thorsten Heit commented on TIKA-4134:
-

I don't know offhand of any projects that use them. I have used them 
internally at my company several times to create a tar.gz file containing 
everything I needed:
 - the ./bin directory holding the startup shell script (created by the 
appassembler plugin) and a few configuration files
 - a lib folder with the project jar and all its dependencies
 - a folder for the program's log files

No big magic.

If you like I can try and create a PR for that.
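
The layout described above could be produced along these lines. This is a
sketch under stated assumptions, not a tested Tika configuration: the program
id and the main class are assumptions, and the plugin version is left out and
would need to be pinned.

```xml
<!-- Sketch only: the program id and main class are assumptions. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>appassembler-maven-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>assemble</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- Collect the project jar and its dependencies in one flat lib/
         folder rather than a Maven-style repository tree. -->
    <repositoryLayout>flat</repositoryLayout>
    <repositoryName>lib</repositoryName>
    <programs>
      <program>
        <!-- A start script with this id is generated under bin/. -->
        <mainClass>org.apache.tika.cli.TikaCLI</mainClass>
        <id>tika-app</id>
      </program>
    </programs>
  </configuration>
</plugin>
```

By default the assembled tree lands under `target/appassembler`; a
maven-assembly-plugin descriptor could then pack that directory into the
tar.gz.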



[GitHub] [tika] theit commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


theit commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723437367

   > Separately, we're doing a better job of moving most dependencies into 
dependencyManagement and using boms. Should we get rid of the version numbers 
in the elements and just control the versions via the boms where possible?
   
   +1
   IMHO it would also be nice to get rid of properties whose only purpose is to 
define a version that is then later used only in the dependency management 
section. Example: `jackson.version`
   
   Another example is `log4j2.version`, defined in `tika-parent/pom.xml`. It's 
used in a couple of other pom.xml files, but unnecessary there because the 
corresponding dependencies are already referenced/defined via the import BOM.
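
A minimal sketch of the pattern being discussed, assuming the Log4j BOM is the
one imported in `tika-parent/pom.xml`; the choice of `log4j-core` in the child
module is only an example.

```xml
<!-- tika-parent/pom.xml (sketch): import the BOM once. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-bom</artifactId>
      <version>${log4j2.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- A child module's pom.xml (sketch): no <version> element is needed,
     because the imported BOM already manages it. -->
<dependencies>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
  </dependency>
</dependencies>
```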





[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?

2023-09-18 Thread Tim Allison (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766380#comment-17766380
 ] 

Tim Allison commented on TIKA-4134:
---

Nice!  I wasn't aware of: Application Assembler Maven Plugin... Any model 
projects you'd recommend that we can, erm, borrow from? Thank you!



[jira] [Commented] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?

2023-09-18 Thread Thorsten Heit (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766379#comment-17766379
 ] 

Thorsten Heit commented on TIKA-4134:
-

If you consider tika-app and tika-server to be standalone tools/programs that 
are executed from the command line, what about using a combination of 
maven-assembly-plugin and appassembler-maven-plugin to create a distribution 
containing everything that's needed?



[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723398386

   Separately, we're doing a better job of moving most dependencies into 
dependencyManagement and using boms.  Should we get rid of the version numbers 
in the  elements and just control the versions via the boms where 
possible?





[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723395686

   > 
   > 4. Running `mvn verify` fails in Tika server core:
   > 
   > 
   > ```
   > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
0.054 s <<< FAILURE! -- in org.apache.tika.server.core.StackTraceTest
   > [ERROR] org.apache.tika.server.core.StackTraceTest.testEmptyParser -- Time 
elapsed: 0.006 s <<< FAILURE!
   > org.opentest4j.AssertionFailedError: bad type: /tika ==> expected: <200> 
but was: <500>
   >at 
org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
   > (...)
   > ```
   > 
   No idea. Let me see if I can reproduce this locally and figure out what's 
going on. Thank you.
   





[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723394374

   > I was working locally on migrating Tika from javax to jakarta before I 
found this PR, and I now have a few comments/hints, especially WRT 
`tika-parent/pom.xml`:
   > 
   > 1. The different CXF dependencies could be replaced by the import BOM.
   > 
   > 2. The same mechanism could be used for SLF4J, which also offers a BOM; 
no need to manage three dependencies.
   > 
   > 3. In my tests I had to add an exclusion for `xml-apis:xml-apis` in 
the dependency management section for `xerces:xercesImpl`. Otherwise I get 
tons of errors in my IDE (Eclipse) because the package `org.xml.sax` is 
accessible from more than one module: ``, `java.xml`. Have you seen 
them too?
   > 
   > 4. Running `mvn verify` fails in Tika server core:
   > 
   > 
   > ```
   > [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
0.054 s <<< FAILURE! -- in org.apache.tika.server.core.StackTraceTest
   > [ERROR] org.apache.tika.server.core.StackTraceTest.testEmptyParser -- Time 
elapsed: 0.006 s <<< FAILURE!
   > org.opentest4j.AssertionFailedError: bad type: /tika ==> expected: <200> 
but was: <500>
   >at 
org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
   > (...)
   > ```
   > 
   Thank you!  Adding the above shortly.
   
   





[jira] [Created] (TIKA-4134) Maybe move away from an uber jar for tika-app and tika-server-standard in 3.x?

2023-09-18 Thread Tim Allison (Jira)
Tim Allison created TIKA-4134:
-

 Summary: Maybe move away from an uber jar for tika-app and 
tika-server-standard in 3.x?
 Key: TIKA-4134
 URL: https://issues.apache.org/jira/browse/TIKA-4134
 Project: Tika
  Issue Type: Task
Reporter: Tim Allison


On https://github.com/apache/tika/pull/1345#issuecomment-1723321327, 
[~desruisseaux] pointed out that uber jars might not be the best idea with jpms 
in the future.

I'm opening this issue to discuss if we want to change the packaging structure 
in 3.x.

If we wanted to "go small" we can keep things as they are in 3.x and warn users 
that we might move away from an uber jar in 4.x.

If we wanted to "go big", we could use the maven dependency plugin to create a 
"lib/" directory with all of the dependencies and then have a small 
tika-app.jar that includes those dependencies in its classpath.

Are there other, better ways that we should think about packaging tika-app and 
tika-server in 3.x and beyond?
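
As a rough sketch of the "go big" option: the dependency plugin copies the
runtime jars into `lib/`, and the jar plugin writes a matching `Class-Path`
into the manifest of the small jar. The execution id and the main class are
assumptions, not settings taken from the current Tika build.

```xml
<!-- Sketch only: execution id and main class are assumptions. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <id>copy-runtime-libs</id>
          <phase>package</phase>
          <goals>
            <goal>copy-dependencies</goal>
          </goals>
          <configuration>
            <!-- All runtime dependencies end up in target/lib/. -->
            <outputDirectory>${project.build.directory}/lib</outputDirectory>
            <includeScope>runtime</includeScope>
          </configuration>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <!-- Adds a Class-Path entry of the form lib/<jar name> for
                 each dependency, so "java -jar" finds the jars next to it. -->
            <addClasspath>true</addClasspath>
            <classpathPrefix>lib/</classpathPrefix>
            <mainClass>org.apache.tika.cli.TikaCLI</mainClass>
          </manifest>
        </archive>
      </configuration>
    </plugin>
  </plugins>
</build>
```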





[GitHub] [tika] tballison commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


tballison commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1723300367

   > Note that uber JAR files created by the Maven Shade Plugin may no longer 
work in a JPMS world. Currently, the plugin seems to work around this by 
removing all `module-info.class` files. That works if the dependencies 
duplicate all their service declarations into `META-INF/services/` for 
compatibility with the old world. Apache SIS and Derby, for instance, do that. 
But any library could decide to stop doing this duplication in some future 
version, because it makes it more difficult to use some features that are 
unique to `module-info`. It may be safer to progressively move away from uber 
JARs if possible.
   
   Do you have recommendations for first steps in this direction?  We have some 
flexibility with 3.x.  How do we start?
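
For context, the behaviour described in the quoted comment depends on the
`META-INF/services/` entries surviving the merge into one jar. A minimal
sketch of a Shade configuration that merges those entries, given as an
illustration rather than as Tika's actual build configuration:

```xml
<!-- Sketch only: not copied from the Tika build. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Merges same-named META-INF/services files from all
               dependencies so ServiceLoader lookups still work inside
               the single jar. -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```

A service that a library declares only through `provides ... with` in its
`module-info` has no such file to merge, which is the risk the comment
describes.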





[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766256#comment-17766256
 ] 

ASF GitHub Bot commented on TIKA-3948:
--

theit commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1722932587

   I was working locally on migrating Tika from javax to jakarta before I 
found this PR, and I now have a few comments/hints, especially WRT 
`tika-parent/pom.xml`:
   
   1. The different CXF dependencies could be replaced by the import BOM.
   2. The same mechanism could be used for SLF4J, which also offers a BOM; no 
need to manage three dependencies.
   3. In my tests I had to add an exclusion for `xml-apis:xml-apis` in the 
dependency management section for `xerces:xercesImpl`. Otherwise I get tons of 
errors in my IDE (Eclipse) because the package `org.xml.sax` is accessible from 
more than one module: ``, `java.xml`. Have you seen them too?
   4. Running `mvn verify` fails in Tika server core:
   ```
   [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
0.054 s <<< FAILURE! 

> Migrate to jakarta in Tika 3.x
> --
>
> Key: TIKA-3948
> URL: https://issues.apache.org/jira/browse/TIKA-3948
> Project: Tika
>  Issue Type: Task
>Reporter: Tim Allison
>Priority: Major
>  Labels: tika-3x
>






[GitHub] [tika] theit commented on pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


theit commented on PR #1345:
URL: https://github.com/apache/tika/pull/1345#issuecomment-1722932587

   I was working locally on migrating Tika from javax to jakarta before I 
found this PR, and I now have a few comments/hints, especially WRT 
`tika-parent/pom.xml`:
   
   1. The different CXF dependencies could be replaced by the import BOM.
   2. The same mechanism could be used for SLF4J, which also offers a BOM; no 
need to manage three dependencies.
   3. In my tests I had to add an exclusion for `xml-apis:xml-apis` in the 
dependency management section for `xerces:xercesImpl`. Otherwise I get tons of 
errors in my IDE (Eclipse) because the package `org.xml.sax` is accessible from 
more than one module: ``, `java.xml`. Have you seen them too?
   4. Running `mvn verify` fails in Tika server core:
   ```
   [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
0.054 s <<< FAILURE! -- in org.apache.tika.server.core.StackTraceTest
   [ERROR] org.apache.tika.server.core.StackTraceTest.testEmptyParser -- Time 
elapsed: 0.006 s <<< FAILURE!
   org.opentest4j.AssertionFailedError: bad type: /tika ==> expected: <200> but 
was: <500>
at 
org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
   (...)
   ```
   However, when I execute the (single) test in Eclipse, it passes. It also 
passes when I let Eclipse run all JUnit tests in that module. Do you have any 
idea what is causing this?
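
Points 2 and 3 above can be sketched as follows. The property names
`slf4j.version` and `xerces.version` are assumptions, and the BOM import
presumes an SLF4J release that publishes `slf4j-bom`.

```xml
<!-- Sketch for the dependencyManagement section of tika-parent/pom.xml.
     Property names are assumptions. -->
<dependencyManagement>
  <dependencies>
    <!-- Point 2: one BOM import instead of managing each SLF4J artifact. -->
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-bom</artifactId>
      <version>${slf4j.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
    <!-- Point 3: keep xml-apis off the classpath so that org.xml.sax is
         provided only by the JDK's java.xml module. -->
    <dependency>
      <groupId>xerces</groupId>
      <artifactId>xercesImpl</artifactId>
      <version>${xerces.version}</version>
      <exclusions>
        <exclusion>
          <groupId>xml-apis</groupId>
          <artifactId>xml-apis</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</dependencyManagement>
```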





[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766246#comment-17766246
 ] 

ASF GitHub Bot commented on TIKA-3948:
--

theit commented on code in PR #1345:
URL: https://github.com/apache/tika/pull/1345#discussion_r1328345983


##
tika-parent/pom.xml:
##
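@@ -730,6 +723,16 @@
         <artifactId>commons-math3</artifactId>
         <version>${commons.math3.version}</version>
       </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-rs-client</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-frontend-jaxrs</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>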

Review Comment:
   This could be simplified:
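   ```xml
     <dependency>
       <groupId>org.apache.cxf</groupId>
       <artifactId>cxf-bom</artifactId>
       <version>${cxf.version}</version>
       <type>pom</type>
       <scope>import</scope>
     </dependency>
   ```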







[GitHub] [tika] theit commented on a diff in pull request #1345: TIKA-3948 -- migrate from javax -> jakarta

2023-09-18 Thread via GitHub


theit commented on code in PR #1345:
URL: https://github.com/apache/tika/pull/1345#discussion_r1328345983


##
tika-parent/pom.xml:
##
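@@ -730,6 +723,16 @@
         <artifactId>commons-math3</artifactId>
         <version>${commons.math3.version}</version>
       </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-rs-client</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>
+      <dependency>
+        <groupId>org.apache.cxf</groupId>
+        <artifactId>cxf-rt-frontend-jaxrs</artifactId>
+        <version>${cxf.version}</version>
+      </dependency>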

Review Comment:
   This could be simplified:
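   ```xml
     <dependency>
       <groupId>org.apache.cxf</groupId>
       <artifactId>cxf-bom</artifactId>
       <version>${cxf.version}</version>
       <type>pom</type>
       <scope>import</scope>
     </dependency>
   ```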


