[ 
https://issues.apache.org/jira/browse/PDFBOX-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768774#comment-17768774
 ] 

Marcelo Modesto edited comment on PDFBOX-5667 at 9/27/23 8:35 PM:
------------------------------------------------------------------

Thank you very much for your reply!

Let me quote some parts of your answer in a different order to give better 
context.
{quote}If you want to test text extraction you might use the class 
PDFTextStripper directly to avoid that issue.
{quote}
My intention was to write some tests for a change I suggested through 
PDFBOX-5670.
As I described there, my main goal was to look for a way to avoid the overhead 
of calling the JVM multiple times to process multiple PDF files.
But I think I can explain it better in that ticket (to be honest, I think 
something like PDFMerge would be better: it accepts an array of Files. I have a 
patch for this modification too).
{quote}It is not an issue with the test framework.
{quote}
{quote}All static instances should be made non static to avoid such surprises
{quote}
I don't have in-depth knowledge of Java to confirm whether there was a problem 
with the testing framework.
So I asked for help to understand this strange behavior.
Furthermore, my intention with this ticket was to share this strange behavior 
with you.
You may already know about this, but I haven't found anything similar in Jira.
I did some testing. What made me suspect the testing framework was that if I 
change the SYSOUT declaration back to "static final", the modified ExtractText 
still works as a repeatable subcommand (see picocli_execution.txt), but all the 
tests I wrote fail.
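
A minimal sketch of one way such failures can arise, assuming the tests capture output by redirecting System.out with System.setOut (the ticket does not say how the tests capture output). CapturingTool, LookupTool and capture are hypothetical names for illustration only; they are not PDFBox classes. A static final field stores whatever System.out refers to when the class is initialized, so later redirections are not seen by the tool.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class StaticCaptureDemo {

    // Hypothetical stand-in for a tool that stores System.out in a static final field.
    // The field is assigned once, when this class is first used.
    static class CapturingTool {
        private static final PrintStream SYSOUT = System.out;

        static void run(String text) {
            SYSOUT.print(text);
            SYSOUT.flush();
        }
    }

    // Hypothetical variant that looks up System.out on every call.
    static class LookupTool {
        static void run(String text) {
            System.out.print(text);
            System.out.flush();
        }
    }

    // Redirects System.out to a fresh buffer while the action runs,
    // the way a unit test might, and returns what was captured.
    static String capture(Runnable action) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer, true));
        try {
            action.run();
        } finally {
            System.setOut(original);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // CapturingTool is initialized inside the first capture, so its field
        // keeps pointing at the first test's stream from then on.
        String first = capture(() -> CapturingTool.run("one"));
        String second = capture(() -> CapturingTool.run("two"));
        System.out.println("static final field: [" + first + "] [" + second + "]");

        String third = capture(() -> LookupTool.run("one"));
        String fourth = capture(() -> LookupTool.run("two"));
        System.out.println("lookup per call:    [" + third + "] [" + fourth + "]");
    }
}
```

In this sketch the second capture of CapturingTool comes back empty because the text still goes to the stream seen at class initialization, which would match the "standard output is empty after running the first test" symptom. A command-line run is unaffected because System.out is never replaced there.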
{quote}If the value of System.out is changed during subsequent calls of 
ExtractText those changes don't matter as the static value in ExtractText isn't 
changed.
{quote}
One thing I needed to change in PDFBOX-5670 was to avoid closing the 
OutputStreamWriter created from System.out. Nothing works if I do not change 
this. 
I suspect it's because SYSOUT is a "static final" reference to System.out, not 
a copy of it. But again, it's just a guess. 
I don't know much about Java and I don't have enough skills and tools to 
investigate this behavior further.
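
A minimal sketch of why closing the writer matters, using a PrintStream over an in-memory buffer as a stand-in for System.out so the real console stream is left alone. CloseWrapperDemo, writeThrough and runTwice are hypothetical names, not PDFBox code. Closing an OutputStreamWriter also closes the stream it wraps, and a closed PrintStream silently discards later writes instead of throwing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.io.Writer;

class CloseWrapperDemo {

    // Writes text through a new writer wrapped around the shared stream,
    // then either closes the writer or only flushes it.
    static void writeThrough(PrintStream shared, String text, boolean closeWriter) {
        try {
            Writer writer = new OutputStreamWriter(shared);
            writer.write(text);
            if (closeWriter) {
                writer.close(); // flushes, then closes 'shared' as well
            } else {
                writer.flush(); // pushes the text out but leaves 'shared' open
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates two consecutive runs of a tool that share one output stream.
    static String runTwice(boolean closeWriter) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream shared = new PrintStream(sink, true); // stand-in for System.out
        writeThrough(shared, "first;", closeWriter);
        writeThrough(shared, "second;", closeWriter);
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println("closing the writer:  " + runTwice(true));
        System.out.println("flushing the writer: " + runTwice(false));
    }
}
```

When the writer is closed, only the first run's text reaches the buffer; when it is only flushed, both runs arrive. The same would apply to the real System.out, since there is a single shared stream per JVM and every reference to it (static final or not) sees it closed.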

Feel free to close this ticket if you feel it is appropriate.

Thank you!


> Can't create test for ExtractText command line tool
> ---------------------------------------------------
>
>                 Key: PDFBOX-5667
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5667
>             Project: PDFBox
>          Issue Type: Test
>          Components: Text extraction
>    Affects Versions: 3.0.0 PDFBox
>         Environment: openjdk 11.0.20 2023-07-18
> OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu122.04)
> OpenJDK 64-Bit Server VM (build 11.0.20+8-post-Ubuntu-1ubuntu122.04, mixed 
> mode, sharing)
>            Reporter: Marcelo Modesto
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>         Attachments: TestExtractText fails.txt, picocli_execution.txt
>
>
> I think it's an issue with the testing framework, not PDFBox.
> When I try to create a new test for the ExtractText command line tool, it 
> fails because the standard output is empty after running the first test.
> If I change the declaration of the ExtractText variable that references 
> System.out (removing "static final"), the test works.
> If you can confirm it's an issue with the test framework I would appreciate 
> it!
> Thank you! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
