RE: TextToPDF function removes the first char since 2.0.28

2023-08-01 Thread michael.a...@universa.de
Many thanks, Andreas! Hmm, this could be a problem for others, too. But OK, I will 
use a temporary file.
Kind regards, Michael 


-
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org

uniVersa security notice: the message was neither content-encrypted nor 
digitally signed.

For information on data security and the confidentiality of e-mails, see: 
https://www.universa.de/e-mail-kommunikation

Information on data protection and the rights of data subjects is available at: 
https://www.universa.de/datenschutz



Re: TextToPDF function removes the first char since 2.0.28

2023-07-27 Thread Andreas Lehmkühler
I ran your shell script and got the same result: the first char is 
missing in the PDF.


It seems to be related to the way you are calling TextToPDF. You are 
simply printing the text to the console and redirecting it to TextToPDF.


I've changed that and echoed the text to a file and used that file as 
input for TextToPDF. Voila, everything works fine.
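
The workaround described above could be sketched roughly as follows. This is
only an illustration, not something posted in the thread: run_with_tempfile is
a hypothetical helper name, and the jar path in the usage comment is the one
from the posted test script. The helper writes the text to a regular temporary
file and appends that file's path as the last argument of the given command.

```shell
#!/bin/bash

# Hypothetical helper (not part of PDFBox): store the text in a regular
# temporary file, run the given command with that file as its last
# argument, then remove the file and pass on the command's exit status.
run_with_tempfile() {
    local text=$1
    shift
    local tmp status
    tmp=$(mktemp) || return 1
    printf '%s\n' "$text" > "$tmp"
    "$@" "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage with the command line from the test script (adjust the jar path):
#   run_with_tempfile 'hello' java -jar /usr/share/java/pdfbox-app.jar TextToPDF test.pdf
```

Because the input is now a regular file, a tool that opens it more than once
sees the complete content each time.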


PDFBOX-5554 added support for a charset parameter, and a leading UTF-8 
BOM is now removed automatically. I assume the latter is the issue here. It 
reads the input twice, and somehow this doesn't work with redirected 
input on Linux.
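
The suspected effect can be illustrated without PDFBox. The sketch below is an
assumption-laden illustration, not a reproduction of PDFBox internals:
input_kind and read_twice are hypothetical helper names, and the "first pass"
simply consumes one byte the way a BOM probe might. A path produced by
<(...) normally refers to a pipe, and bytes read from a pipe are gone for any
later reader, whereas a regular file can be opened again from the start.

```shell
#!/bin/bash

# Report what kind of object a path refers to.
input_kind() {
    if [ -p "$1" ]; then
        echo 'pipe'
    elif [ -f "$1" ]; then
        echo 'regular file'
    else
        echo 'other'
    fi
}

# Hypothetical two-pass reader: the first pass consumes exactly one byte,
# the second pass opens the same path again and prints what it can read.
read_twice() {
    dd if="$1" bs=1 count=1 of=/dev/null 2>/dev/null
    cat "$1"
}
```

In bash on Linux, input_kind <(echo hello) should report a pipe, and
read_twice <(echo hello) would then be expected to lose the leading byte,
while read_twice on a regular file returns the full content because the second
open starts again at offset 0.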


Andreas




Re: TextToPDF function removes the first char since 2.0.28

2023-07-26 Thread Gilad Denneboom
I ran the same commands as you did. The first PDF file was created just
fine, but when the text was extracted in the second command a line-break
was added to the end of it, which is probably why your comparison of the
two text files failed. Did you actually view the first file and see that it was
missing text?



Re: TextToPDF function removes the first char since 2.0.28

2023-07-25 Thread michael.a...@universa.de
The text is in the provided user acceptance test (bash script). The PDF is 
created with it. From my point of view, all the information is there.
The creation of a Jira account was denied with reference to the mailing list.

I'll give it one last chance and try again to open a Jira account with the text 
from my posted UAT.



Re: TextToPDF function removes the first char since 2.0.28

2023-07-25 Thread Tilman Hausherr
Please link to a site with the text and the created PDF. Or open an 
issue in JIRA. If you don't have an account, enter a meaningful description 
at https://selfserve.apache.org/jira-account.html


Tilman




Re: TextToPDF function removes the first char since 2.0.28

2023-07-25 Thread michael.a...@universa.de
>the question is, where did the char get lost: when creating the PDF or when 
>extracting the text?

Sorry if I was not precise enough. The created PDF is missing the first char. So 
the TextToPDF function has a problem.

>Did you check the created pdf? Does it contain the whole text?

I tested/viewed it. The first char is missing.




Re: TextToPDF function removes the first char since 2.0.28

2023-07-25 Thread Andreas Lehmkühler

Hi,

the question is, where did the char get lost: when creating the PDF or 
when extracting the text?


Did you check the created pdf? Does it contain the whole text?

Andreas




TextToPDF function removes the first char since 2.0.28

2023-07-24 Thread michael.a...@universa.de
Hi,

the TextToPDF function worked without problems from 2.0.24 (the first version 
I used) to 2.0.27.
I use the command line only.

Here is a test:

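#!/bin/bash

jar=/usr/share/java/pdfbox-app.jar # adjust

text_in='hello'

# Create the PDF; the text is passed via process substitution, not a regular file
java -jar "$jar" TextToPDF test.pdf <(echo "$text_in") 2>/dev/null
# Extract the text from the PDF again and capture it for the comparison
text_out=$(java -jar "$jar" ExtractText test.pdf >(cat) 2>/dev/null)

echo -e "text_in : $text_in\ntext_out: $text_out"

# The test passes only if the extracted text equals the input text
if [ "$text_in" != "$text_out" ]; then
  echo 'uat failed'
  exit 1
fi

echo 'uat passed'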

Kind regards
Michael
