Thanks a lot, sir, for all the information. The characters that may be present
in an equation in a research paper are Greek letters such as pi, sigma,
epsilon, etc. They can be created in a Microsoft Word document, as Word
provides options to insert such characters. My question is how I can retrieve
those characters from the .doc file using HWPF. Even if I succeed in
retrieving them, I should also be able to write them to a PDF file using
iText. Once again, thank you.
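
One possible angle on the Greek-letter part, offered as a sketch rather than a confirmed answer: letters inserted through Word's symbol dialog are often stored as characters in the Symbol font, sometimes shifted into the private-use range 0xF000-0xF0FF, so the raw text looks wrong even when it is read correctly. If that is what is happening here (an assumption, nothing in this thread confirms it), the raw code can be mapped to the real Unicode letter before it is handed to iText. The class name SymbolFontMapper and the method toUnicode below are hypothetical, the table covers only a handful of letters, and the mapping should be applied only to runs whose font is known to be Symbol. The HWPF javadoc for CharacterRun is worth checking for symbol-related accessors in your POI version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps a few Symbol-font character codes to Unicode Greek letters.
final class SymbolFontMapper {

    // Partial table based on the standard Symbol font layout (Latin key -> Greek letter).
    private static final Map<Character, Character> SYMBOL_TO_UNICODE = new HashMap<>();
    static {
        SYMBOL_TO_UNICODE.put('a', '\u03B1'); // alpha
        SYMBOL_TO_UNICODE.put('b', '\u03B2'); // beta
        SYMBOL_TO_UNICODE.put('d', '\u03B4'); // delta
        SYMBOL_TO_UNICODE.put('e', '\u03B5'); // epsilon
        SYMBOL_TO_UNICODE.put('p', '\u03C0'); // pi
        SYMBOL_TO_UNICODE.put('s', '\u03C3'); // sigma
        SYMBOL_TO_UNICODE.put('S', '\u03A3'); // capital sigma
        SYMBOL_TO_UNICODE.put('D', '\u0394'); // capital delta
    }

    // Call this only for characters that come from a run using the Symbol font;
    // otherwise an ordinary 'p' in body text would be turned into pi.
    static char toUnicode(char raw) {
        // Assumption: symbol characters may arrive offset by 0xF000 (private-use range).
        char code = (raw >= 0xF000 && raw <= 0xF0FF) ? (char) (raw - 0xF000) : raw;
        Character mapped = SYMBOL_TO_UNICODE.get(code);
        // Characters missing from the partial table are returned unchanged.
        return mapped != null ? mapped : raw;
    }

    public static void main(String[] args) {
        // Print code points rather than glyphs so the console encoding does not matter.
        System.out.println(String.format("U+%04X", (int) toUnicode('\uF070')));
        System.out.println(String.format("U+%04X", (int) toUnicode('s')));
    }
}
```

The mapped character is ordinary Unicode text, so writing it to the PDF then comes down to choosing a font in iText that actually contains Greek glyphs.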

On Wed, Apr 8, 2009 at 9:01 PM, MSB <[email protected]> wrote:

>
> Thanks for the reply, I understand what you are after a little better now.
>
> As far as I am aware, formatting information is not exposed by the
> Paragraph class but by the CharacterRun class
> (org.apache.poi.hwpf.usermodel.CharacterRun). By no means am I an expert,
> but I think that as the Word document is parsed by HWPF, whenever the
> formatting applied to a piece of text changes, that text is encapsulated
> within a new instance of the CharacterRun class. That class provides
> methods that allow you to get at the colour of the text, the name and size
> of the font used, and so on. To get at the CharacterRun(s) in the document
> you would do something like this:
>
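> // Needs java.io.File, java.io.FileInputStream, org.apache.poi.hwpf.HWPFDocument,
> // org.apache.poi.hwpf.usermodel.CharacterRun and org.apache.poi.hwpf.usermodel.Range.
> HWPFDocument doc = new HWPFDocument(new FileInputStream(new File("C:\\temp\\test.doc")));
> Range range = doc.getRange();
> // The character runs are reached through the Range, not the document itself.
> int numCharRuns = range.numCharacterRuns();
> for (int i = 0; i < numCharRuns; i++) {
>   CharacterRun charRun = range.getCharacterRun(i);
>   // Interrogate charRun here, e.g. charRun.text() or charRun.getFontName().
> }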
>
> Then once you have the CharacterRun, you should be able to interrogate that
> object for lots of information - have a look at the javadoc to see all of
> the available methods. After obtaining the info, you ought to be able to
> use iText to create the PDF file for you. My only concern is whether working
> through the document in this manner will allow you to accurately re-create
> it using iText; I guess that only a test will tell us this.
>
> The reason I asked about the nature of the research paper was that I wanted
> to get some idea of the sort of characters that are included. Forgive me
> please as I am 'mathematically challenged' and do not know the terms to
> describe the sort of operators found in mathematical expressions, but I
> feared that we may be dealing with those - knowing that the research paper
> is plain text removes that fear.
>
> Have a run with this and see how it works for you - I hope it may be able
> to return some of the characters you were not seeing before. If not, we may
> need to look at other options. Should this fail again, is it possible for
> you to let me have a copy - assuming there is no proprietary information
> contained within it that should not be seen by anyone outside of your
> institution - of the sort of document you are working with? That way, I can
> experiment with it myself; for example, I have OpenOffice on my PC and
> NetBeans configured so that I can create and run applications that use
> Universal Network Objects (OpenOffice's API).
>
>
> nikhil n-2 wrote:
> >
> > hii,
> >
> > I am new to HWPF. I am working on a project where I am supposed to read a
> > research paper in IEEE format from a .doc file and convert it into a PDF
> > file in a customized format.
> > To do that I need to know the font size variations in the text. I am
> > unable to read characters like pi, sigma, etc. present in equations.
> >
> > Thank you.
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/font-styles-and-equations-in-word-doc-tp22927872p22953001.html
> Sent from the POI - User mailing list archive at Nabble.com.
>
>
>
>
