Right, I managed to make some progress this evening. Have a look at the class
below; at the moment it assembles an ArrayList of the tables in a Word document,
but you could easily change that if you wanted to. I have added lots of
comments, probably too many, but if you have any questions, just drop a line
to the forum.
By the way, you will need to change this line:
inputFile = new File("C:\\temp\\table.doc");
so that it points to the file you wish to process.
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Table;
import org.apache.poi.hwpf.usermodel.Range;
import java.io.*;
import java.util.ArrayList;
/**
*
* TEST/DEMONSTRATION CODE ONLY.
*
* @author Mark B.
* @version 1.00 8th April 2009.
*/
public class Main {
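/**
* Opens the Word document, steps through its paragraphs and collects
* every table found into an ArrayList.
*
* @param args Not used.
*/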
public static void main(String[] args) {
BufferedInputStream bufIStream = null;
FileInputStream fileIStream = null;
File inputFile = null;
HWPFDocument doc = null;
Range range = null;
Table table = null;
ArrayList<Table> tables = null;
Paragraph para = null;
boolean inTable = false;
int numParas = 0;
try {
tables = new ArrayList<Table>();
inputFile = new File("C:\\temp\\table.doc");
fileIStream = new FileInputStream(inputFile);
bufIStream = new BufferedInputStream(fileIStream);
//
// Open a Word document.
//
doc = new HWPFDocument(bufIStream);
//
// Get the highest level Range object that represents the
// contents of the document.
//
range = doc.getRange();
//
// Get the number of paragraphs
//
numParas = range.numParagraphs();
//
// Step through each Paragraph.
//
for(int i = 0; i < numParas; i++) {
para = range.getParagraph(i);
//
// Is the Paragraph within a table?
//
if(para.isInTable()) {
//
// The inTable flag is used to ensure that a call is made
// to the getTable() method of the Range class once only
// when the first Paragraph that is within a table is
// recovered. So......
//
if(!inTable) {
//
// Get the table and add it to an ArrayList for later
// processing. You do not have to do this, it would
// be possible to process the table here. There are
// methods defined on the Table class that allow you
// to get at the number of rows in the table (numRows())
// and to recover a reference to each row in turn
// (getRow(int)). Once you have a row, it is possible
// then to get at each cell in turn (numCells() and
// getCell(int) on TableRow). Look at the Table, TableRow
// and TableCell classes.
//
table = range.getTable(para);
tables.add(table);
inTable = true;
}
}
else {
//
// Set the flag false to indicate that all of the paragraphs
// in the table have been processed. A single blank line is
// sufficient to indicate the end of the table within the
// Word document.
//
// This is also the place to deal with any non-table paragraphs.
//
inTable = false;
}
}
//
// This line simply prints out the number of tables found in the
// document - used for testing purposes here.
//
System.out.println("Found " + tables.size() + " tables in the document.");
}
catch(Exception ex) {
System.out.println("Caught an: " + ex.getClass().getName());
System.out.println("Message: " + ex.getMessage());
System.out.println("Stacktrace follows:..............");
ex.printStackTrace(System.out);
}
finally {
if(bufIStream != null) {
try {
bufIStream.close();
bufIStream = null;
fileIStream = null;
}
catch(Exception ex) {
// I G N O R E //
}
}
}
}
}
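To see the inTable flag on its own, here is a minimal stand-alone sketch of the same loop. It does not use POI at all: each boolean in the array is an assumption standing in for what Paragraph.isInTable() would return for one paragraph, and TableRunDemo and tableStarts are names made up purely for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Stand-alone model of the inTable flag logic. Each boolean stands in for
 * the result of Paragraph.isInTable() on one paragraph (an assumption made
 * for illustration; no POI classes are used here).
 */
public class TableRunDemo {

    /**
     * Returns the index of the first paragraph of every run of
     * consecutive in-table paragraphs.
     */
    public static List<Integer> tableStarts(boolean[] paraInTable) {
        List<Integer> starts = new ArrayList<>();
        boolean inTable = false;
        for (int i = 0; i < paraInTable.length; i++) {
            if (paraInTable[i]) {
                if (!inTable) {
                    // First paragraph of a new table: record it once only.
                    starts.add(i);
                    inTable = true;
                }
            } else {
                // A non-table paragraph marks the end of the current table.
                inTable = false;
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // text, table (3 paragraphs), text, table (2 paragraphs), text
        boolean[] doc = {false, true, true, true, false, true, true, false};
        System.out.println(tableStarts(doc)); // prints [1, 5]
    }
}
```

In the real class above, the point where this sketch records an index is the point where range.getTable(para) is called, so each table is fetched once no matter how many paragraphs it contains.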
[email protected] wrote:
>
> Hi
> Could anyone post the minimal code to get the list of all tables
> in a Word document? In particular, I want a class from which I can
> extract the "Table" object for each table in the document.
> Greetings
>
> Enrico
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>
>
--
View this message in context:
http://www.nabble.com/reading-Tables-tp22911444p22957480.html
Sent from the POI - User mailing list archive at Nabble.com.