I know all that, I wrote the Arabic and bidi support in iText.
RUN_DIRECTION_DEFAULT won't work and that's why I didn't mention it. The run
direction can be deduced from the text and there are even some rules about that
based on the directionality strength of the characters. However, this is
something that should be explicitly coded in the CSS as Balder suggested, as it
also involves specific fonts and encoding.
Paulo
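
A minimal sketch of the "deduce it from the text" idea mentioned above, assuming a first-strong-character heuristic. The class and method names are hypothetical and not part of iText, and the code-point ranges are only a rough stand-in for the Unicode bidi classes (they also cover some digits and marks that are not strongly directional):

```csharp
using System;

public static class DirectionSketch
{
    // Rough approximation of "strong RTL": Hebrew, Arabic and neighbouring
    // RTL blocks, plus the Hebrew/Arabic presentation forms.
    static bool IsStrongRtl(char c)
    {
        return (c >= '\u0590' && c <= '\u08FF')
            || (c >= '\uFB1D' && c <= '\uFDFF')
            || (c >= '\uFE70' && c <= '\uFEFC');
    }

    // Returns true for RTL, false for LTR, and null when the text
    // contains no strongly directional character at all.
    public static bool? GuessRtl(string text)
    {
        if (string.IsNullOrEmpty(text)) return null;
        foreach (char c in text)
        {
            if (IsStrongRtl(c)) return true;
            // Assumption: any other letter counts as strong LTR.
            if (char.IsLetter(c)) return false;
        }
        return null;
    }
}
```

With this heuristic, "abc سلام" is guessed as LTR and "123 سلام" as RTL, because digits and spaces are skipped. It only guesses a base direction; it says nothing about fonts or encoding, which is the reason given above for declaring the direction explicitly in the CSS.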
----- Original Message -----
From: Vahid Nasiri
To: Post all your questions about iText here
Sent: Monday, December 12, 2011 8:30 PM
Subject: Re: [iText-questions] Fwd: Re: XMLWorker & RTL
Hello,
Yes, but the problem is
public const int RUN_DIRECTION_DEFAULT = 0;
This default value is set to zero here and it's different from
public const int RUN_DIRECTION_LTR = 2;
public const int RUN_DIRECTION_RTL = 3;
So by default, Arabic and Persian texts will not be displayed correctly by
using iTextSharp.
Suppose we have:
string text ="سلام";
Which means "hello" in English.
If I don't specify BaseFont.IDENTITY_H, nothing will be displayed on the page
with iTextSharp.
If I specify BaseFont.IDENTITY_H, "م ا ل س" will be printed, which is wrong
("س ل ا م" is the correct run direction and not its inverse form).
If I wrap this text in a container that has a run_direction property and then
set it to RUN_DIRECTION_RTL or RUN_DIRECTION_LTR, "سلام" will be displayed
correctly.
So both of these settings are necessary to display Arabic and Persian texts
correctly.
Here are some tests if you want to see that in action
Test-1 (using default encoding)
    using (var pdfDoc = new Document(PageSize.A4))
    {
        PdfWriter.GetInstance(pdfDoc, new FileStream("Test.pdf", FileMode.Create));
        pdfDoc.Open();
        var chunk = new Chunk("آزمايش");
        pdfDoc.Add(chunk);
    }
Its result is nothing. An empty page.
Test-2 (using BaseFont.IDENTITY_H)
    using (var pdfDoc = new Document(PageSize.A4))
    {
        PdfWriter.GetInstance(pdfDoc, new FileStream("Test.pdf", FileMode.Create));
        pdfDoc.Open();
        var fontPath = Environment.GetEnvironmentVariable("SystemRoot") + "\\fonts\\tahoma.ttf";
        var baseFont = BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        var tahomaFont = new Font(baseFont, 10, Font.NORMAL, BaseColor.BLACK);
        var chunk = new Chunk("سلام", tahomaFont);
        pdfDoc.Add(chunk);
    }
Now it prints something like "م ا ل س", which is completely wrong. The
characters should be reversed, or the correct BIDI processing should be
applied here.
Test-3 (using BaseFont.IDENTITY_H and PdfWriter.RUN_DIRECTION_RTL or LTR,
not RUN_DIRECTION_DEFAULT)
    using (var pdfDoc = new Document(PageSize.A4))
    {
        var pdfWriter = PdfWriter.GetInstance(pdfDoc, new FileStream("Test.pdf", FileMode.Create));
        pdfDoc.Open();
        var fontPath = Environment.GetEnvironmentVariable("SystemRoot") + "\\fonts\\tahoma.ttf";
        var baseFont = BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        var tahomaFont = new Font(baseFont, 10, Font.NORMAL, BaseColor.BLACK);
        PdfPTable table = new PdfPTable(numColumns: 1);
        PdfPCell pdfCell = new PdfPCell(new Phrase("آزمايش", tahomaFont));
        // It should not be RUN_DIRECTION_DEFAULT.
        pdfCell.RunDirection = PdfWriter.RUN_DIRECTION_RTL;
        table.AddCell(pdfCell);
        pdfDoc.Add(table);
    }
Now that the phrase is wrapped in an element with a run_direction property
set to RUN_DIRECTION_RTL, the text is displayed correctly.
So displaying Arabic and Persian texts correctly is impossible in iTextSharp
without using elements that have a run_direction property. "سلام" (the
correct form) is not equal to "مالس" (the wrong form, the result of
RUN_DIRECTION_DEFAULT).
------------------------------------------------------------------------------
From: Paulo Soares <psoa...@glintt.com>
To: Post all your questions about iText here
<itext-questions@lists.sourceforge.net>
Sent: Monday, December 12, 2011 9:06 PM
Subject: Re: [iText-questions] Fwd: Re: XMLWorker & RTL
There's a confusion here between run direction and bidi processing with Arabic
shaping. RUN_DIRECTION_RTL and RUN_DIRECTION_LTR will both show Arabic (and
Latin) correctly, but the former will start the text from the right and the
latter from the left. The text will be correct in both cases; which one to use
is only a preference that depends on the audience.
Paulo
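
A small sketch of the point above, reusing only the iTextSharp calls that already appear in Test-3 in this thread. The output file name and class name are arbitrary, and the font path assumes a Windows machine with Tahoma installed. The same phrase goes into two cells, one RTL and one LTR; according to the explanation above, both should show the word shaped correctly and differ only in the side where the text starts:

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class RunDirectionDemo
{
    public static void Main()
    {
        using (var pdfDoc = new Document(PageSize.A4))
        {
            PdfWriter.GetInstance(pdfDoc, new FileStream("RunDirection.pdf", FileMode.Create));
            pdfDoc.Open();

            // Same font setup as Test-3: an embedded Unicode font with IDENTITY_H.
            var fontPath = Environment.GetEnvironmentVariable("SystemRoot") + "\\fonts\\tahoma.ttf";
            var baseFont = BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
            var tahomaFont = new Font(baseFont, 10, Font.NORMAL, BaseColor.BLACK);

            var table = new PdfPTable(1);
            // One cell per explicit run direction; RUN_DIRECTION_DEFAULT is deliberately left out.
            foreach (var direction in new[] { PdfWriter.RUN_DIRECTION_RTL, PdfWriter.RUN_DIRECTION_LTR })
            {
                var cell = new PdfPCell(new Phrase("سلام", tahomaFont));
                cell.RunDirection = direction;
                table.AddCell(cell);
            }
            pdfDoc.Add(table);
        }
    }
}
```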
----------------------------------------------------------------------------
From: Vahid Nasiri [mailto:vahid_nas...@yahoo.com]
Sent: Sunday, December 11, 2011 5:37 AM
To: Post all your questions about iText here
Subject: Re: [iText-questions] Fwd: Re: XMLWorker & RTL
Hello,
Thanks for your attention, but encoding = BaseFont.IDENTITY_H is mandatory
for RTL strings; otherwise nothing will be displayed on the screen (just an
empty space). PdfWriter.RUN_DIRECTION_RTL should be applied too, to put the
characters in the right order. Without PdfWriter.RUN_DIRECTION_RTL you will
see "tac" instead of "cat". So there is no choice here. It doesn't matter
whether the CSS direction:rtl is set or not. Without BaseFont.IDENTITY_H and
PdfWriter.RUN_DIRECTION_RTL, the result will be nothing or some garbage for
RTL data.
Best regards,
Vahid
----------------------------------------------------------------------------
From: Balder VC <li...@redlab.be>
To: Post all your questions about iText here
<itext-questions@lists.sourceforge.net>
Sent: Saturday, December 10, 2011 5:20 PM
Subject: [iText-questions] Fwd: Re: XMLWorker & RTL
Seems my previous mail did not get through
-------- Original Message --------
Subject: Re: [iText-questions] XMLWorker & RTL
Date: Sat, 10 Dec 2011 14:30:35 +0100
From: Balder VC <li...@redlab.be>
Organisation: redlab.be
To: itext-questions@lists.sourceforge.net
Hi
Thanks for bringing that to attention. At the moment rtl is not supported.
I would not rely on a regex to determine the text direction; after all,
perhaps someone intends to display the text in the other run direction.
I would opt for using CSS to set the run direction. Just like in html
{
direction:rtl;
}
The encoding should be settable in the same way; we can add an
XMLWorker-specific CSS property for that. Forcing CP1252 is not a good idea,
that's right.
Thanks for the idea
ps: please don't hijack other threads; write a new message for a new topic.
I almost missed this mail.
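
A rough sketch of how a resolved CSS `direction` declaration could be mapped to a run direction. This is not XMLWorker code: the helper and class names are hypothetical, the CSS is assumed to arrive as a plain property dictionary, and the constants are local copies of the PdfWriter values quoted earlier in this thread so that the snippet stands alone:

```csharp
using System;
using System.Collections.Generic;

public static class CssDirectionSketch
{
    // Local copies of the PdfWriter values quoted in this thread.
    public const int RUN_DIRECTION_DEFAULT = 0;
    public const int RUN_DIRECTION_LTR = 2;
    public const int RUN_DIRECTION_RTL = 3;

    // Hypothetical helper: picks a run direction from the "direction"
    // property, falling back to the default when it is absent or unknown.
    public static int RunDirectionFromCss(IDictionary<string, string> css)
    {
        string value;
        if (css == null || !css.TryGetValue("direction", out value) || value == null)
            return RUN_DIRECTION_DEFAULT;

        switch (value.Trim().ToLowerInvariant())
        {
            case "rtl": return RUN_DIRECTION_RTL;
            case "ltr": return RUN_DIRECTION_LTR;
            default: return RUN_DIRECTION_DEFAULT;
        }
    }
}
```

The result could then be assigned to the RunDirection property of a PdfPCell or a similar element, so the author's stylesheet decides the direction rather than a guess based on the content.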
On 8/12/2011 22:34, Vahid Nasiri wrote:
Hello,
In the iTextSharp.tool.xml.css.apply.ChunkCssApplier class, the encoding
string is hardcoded to BaseFont.CP1252.
It's easy to detect right-to-left text:
    // Hebrew (U+0590-U+05FF) and Arabic (U+0600-U+06FF) blocks. No separator
    // is needed inside a character class; a comma there would match a
    // literal comma.
    static readonly Regex MatchArabicHebrew = new Regex(
        @"[\u0590-\u05FF\u0600-\u06FF]", RegexOptions.Compiled);

    public static bool IsRtl(string data)
    {
        if (string.IsNullOrEmpty(data)) return false;
        return MatchArabicHebrew.IsMatch(data);
    }
And then we can improve the Apply method of the ChunkCssApplier class, for
instance:
    public Chunk Apply(Chunk c, Tag t)
    {
        String fontName = null;
        String encoding = BaseFont.CP1252;
        if (IsRtl(c.Content)) encoding = BaseFont.IDENTITY_H;
Also, run_direction should be set to RTL for PdfPCell and other similar
elements.
Ex. iTextSharp.tool.xml.html.table.TableData class
    public override IList<IElement> End(IWorkerContext ctx, Tag tag,
        IList<IElement> currentContent) {
        HtmlCell cell = new HtmlCell();
        IList<IElement> l = new List<IElement>(1);
        foreach (IElement e in currentContent) {
            if (e is Chunk)
                if (IsRtl(((Chunk)e).Content))
                {
                    cell.RunDirection = PdfWriter.RUN_DIRECTION_RTL;
                }
------------------------------------------------------------------------------
Learn Windows Azure Live! Tuesday, Dec 13, 2011
Microsoft is holding a special Learn Windows Azure training event for
developers. It will provide a great way to learn Windows Azure and what it
provides. You can attend the event by watching it streamed LIVE online.
Learn more at http://p.sf.net/sfu/ms-windowsazure
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions
iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a
reference to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples:
http://itextpdf.com/themes/keywords.php