Hi Ryan,
I have started working with both textmining and your sample code.
Both work fine, but I have a few questions:
1. Is it possible to show the header text at the beginning of
the document page? As of now it appears at the end of the
extracted text.
2. Is it possible to show text box content in the place where
it occurs? As of now the content of all text boxes appears at
the end of the extracted text.
3. Is it possible to format the extracted text of table content?
As of now it is extracted with some control characters. (I have
identified that '\007' is used as the delimiter to separate cell
contents and row contents.)
4. How can I extract the picture content from a document?
5. As these are open source APIs, do I have permission to change
parts of the code in POI or textmining?
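On question 3, a minimal post-processing sketch in plain Java, assuming only what is stated above: that '\007' terminates cell contents in the extracted text. TableTextCleaner, splitCells and toTabSeparated are hypothetical names, not part of POI or textmining. The sketch does not try to tell row ends from empty cells, since the same character marks both.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for cleaning extracted table text; not a POI class.
final class TableTextCleaner {
    // The control character observed between cells ('\007' in octal).
    static final char CELL_MARK = '\007';

    /** Splits raw extracted text into trimmed cell strings at each cell mark. */
    static List<String> splitCells(String raw) {
        List<String> cells = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == CELL_MARK) {
                // A mark closes the current cell, even if the cell is empty.
                cells.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // Keep trailing text that was not terminated by a cell mark.
        if (current.length() > 0) {
            cells.add(current.toString().trim());
        }
        return cells;
    }

    /** Replaces each cell mark with a tab so the text becomes printable. */
    static String toTabSeparated(String raw) {
        return raw.replace(CELL_MARK, '\t');
    }

    public static void main(String[] args) {
        // Assumed sample input: two cells, each followed by a cell mark.
        String raw = "Name" + CELL_MARK + "Age" + CELL_MARK;
        System.out.println(splitCells(raw));
        System.out.println(toTabSeparated(raw));
    }
}
```

The input would be the string returned by the text extraction call; splitting on the mark keeps empty cells as empty strings, so column positions are not lost.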
--- Ryan Ackley <[EMAIL PROTECTED]> wrote:
> What do you want to do? Here is how to get the text:
>
> FileInputStream in = new FileInputStream("C:\\test.doc");
> HWPFDocument doc = new HWPFDocument(in);
>
> String text = doc.getRange().text();
>
> ----- Original Message -----
> From: "Koundinya (Sudhakar Chavali)" <[EMAIL PROTECTED]>
> To: "POI Users List" <[EMAIL PROTECTED]>; "Ryan Ackley" <[EMAIL PROTECTED]>
> Sent: Thursday, March 25, 2004 9:02 AM
> Subject: Re: What are the known bugs in WordDocument class
>
>
> > Yes, that helps me get the base work of my project started.
> >
> > Thanks,
> > Sudhakar
> >
> > --- Ryan Ackley <[EMAIL PROTECTED]> wrote:
> > > The HWPFDocument class cannot handle complex Word files. Do
> > > you still want an example?
> > >
> > > -Ryan
> > >
> > > ----- Original Message -----
> > > From: "Koundinya (Sudhakar Chavali)" <[EMAIL PROTECTED]>
> > > To: "Ryan Ackley" <[EMAIL PROTECTED]>
> > > Cc: <[EMAIL PROTECTED]>
> > > Sent: Thursday, March 25, 2004 2:39 AM
> > > Subject: Re: What are the known bugs in WordDocument class
> > >
> > >
> > > > OK
> > > >
> > > > Can I have a sample to help me understand HWPFDocument? My
> > > > work is entirely about parsing Word documents to text.
> > > >
> > > > Also, please let me know the maximum file length that it can
> > > > handle.
> > > >
> > > >
> > > >
> > > > thanks,
> > > > sudhakar
> > > >
> > > >
> > > > --- Ryan Ackley <[EMAIL PROTECTED]> wrote:
> > > > > Textmining.org does not support fast saved files and neither
> > > > > does POI.
> > > > >
> > > > > -Ryan
> > > > >
> > > > > ----- Original Message -----
> > > > > From: "Koundinya (Sudhakar Chavali)" <[EMAIL PROTECTED]>
> > > > > To: "Ryan Ackley" <[EMAIL PROTECTED]>; "POI Users List" <[EMAIL PROTECTED]>
> > > > > Sent: Thursday, March 25, 2004 12:16 AM
> > > > > Subject: Re: What are the known bugs in WordDocument class
> > > > >
> > > > >
> > > > > > Hi Ryan,
> > > > > >
> > > > > > I am talking here about fast saved files.
> > > > > >
> > > > > >
> > > > > >
> > > > > > thanks,
> > > > > > Sudhakar
> > > > > >
> > > > > > --- Ryan Ackley <[EMAIL PROTECTED]> wrote:
> > > > > > > I'm not sure of your definition of "complex". For Word files,
> > > > > > > there is "complex" used as a simple adjective and "complex"
> > > > > > > used to describe a special format that Word uses.
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > From: "Koundinya (Sudhakar Chavali)" <[EMAIL PROTECTED]>
> > > > > > > To: "POI Users List" <[EMAIL PROTECTED]>; "Ryan Ackley" <[EMAIL PROTECTED]>
> > > > > > > Sent: Wednesday, March 24, 2004 9:32 PM
> > > > > > > Subject: Re: What are the known bugs in WordDocument class
> > > > > > >
> > > > > > >
> > > > > > > > Will the new class (HWPFDocument) handle complex file
> > > > > > > > structures?
> > > > > > > >
> > > > > > > > When I did an initial test with the WordDocument class, I
> > > > > > > > found that it raises exceptions when parsing complex files.
> > > > > > > >
> > > > > > > > thanks,
> > > > > > > > Sudhakar
> > > > > > > >
> > > > > > > >
> > > > > > > > --- Ryan Ackley <[EMAIL PROTECTED]> wrote:
> > > > > > > > > The WordDocument class is being deprecated. Use the
> > > > > > > > > HWPFDocument class instead.
> > > > > > > > >
> > > > > > > > > -Ryan
> > > > > > > > >
> > > > > > > > > ----- Original Message -----
> > > > > > > > > From: "Koundinya (Sudhakar Chavali)" <[EMAIL PROTECTED]>
> > > > > > > > > To: "POI Users List" <[EMAIL PROTECTED]>; "Ryan Ackley" <[EMAIL PROTECTED]>
> > > > > > > > > Sent: Wednesday, March 24, 2004 7:47 AM
> > > > > > > > > Subject: What are the known bugs in WordDocument class
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Hello World,
> > > > > > > > > >
> > > > > > > > > > I would like to know the known bugs of the
> > > > > > > > > > org.apache.poi.hdf.extractor.WordDocument class and what
> > > > > > > > > > its limitations are. I am asking because I want to use
> > > > > > > > > > POI in my project.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Sudhakar
=====
"No one can earn a million dollars honestly."- William Jennings Bryan (1860-1925)
"Make everything as simple as possible, but not simpler."- Albert Einstein (1879-1955)
"It is dangerous to be sincere unless you are also stupid."- George Bernard Shaw
(1856-1950)
__________________________________
Do you Yahoo!?
Yahoo! Finance Tax Center - File online. File on time.
http://taxes.yahoo.com/filing.html
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]