Right, I managed to make some progress this evening. Have a look at the class
below. It currently assembles an ArrayList of the tables in a Word document,
but you could easily change that if you wanted to. I have added lots of
comments, probably too many, but if you have any questions just drop a line
to the forum.

You will need to change this line

inputFile = new File("C:\\temp\\table.doc");

so that it points to the file you wish to process.

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Table;
import org.apache.poi.hwpf.usermodel.Range;

import java.io.*;
import java.util.ArrayList;

/**
 *
 * TEST/DEMONSTRATION CODE ONLY.
 * 
 * @author Mark B.
 * @version 1.00 8th April 2009.
 */
public class Main {
    
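    /**
     * Opens a Word document and collects every table it contains into an
     * ArrayList.
     *
     * @param args Not used.
     */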
    public static void main(String[] args) {
        
        BufferedInputStream bufIStream = null;
        FileInputStream fileIStream = null;
        File inputFile = null;
        HWPFDocument doc = null;
        Range range = null;
        Table table = null;
        ArrayList<Table> tables = null;
        Paragraph para = null;
        boolean inTable = false;
        int numParas = 0;
        
        try {
            tables = new ArrayList<Table>();
            inputFile = new File("C:\\temp\\table.doc");
            fileIStream = new FileInputStream(inputFile);
            bufIStream = new BufferedInputStream(fileIStream);
            //
            // Open a Word document.
            //
            doc = new HWPFDocument(bufIStream);
            //
            // Get the highest level Range object that represents the
            // contents of the document.
            //
            range = doc.getRange();
            //
            // Get the number of paragraphs
            //
            numParas = range.numParagraphs();
            //
            // Step through each Paragraph.
            //
            for(int i = 0; i < numParas; i++) {
                para = range.getParagraph(i);
                //
                // Is the Paragraph within a table?
                //
                if(para.isInTable()) {
                    //
                    // The inTable flag is used to ensure that a call is made
                    // to the getTable() method of the Range class once only
                    // when the first Paragraph that is within a table is
                    // recovered. So......
                    //
                    if(!inTable) {
                        //
                        // Get the table and add it to an ArrayList for later
                        // processing. You do not have to do this; it would
                        // be possible to process the table here. There are
                        // methods defined on the Table class that allow you
                        // to get at the number of rows in the table and to
                        // recover a reference to each row in turn. Once you
                        // have a row, it is possible then to get at each cell
                        // in turn. Look at the Table, TableRow and TableCell
                        // classes.
                        //
                        table = range.getTable(para);
                        tables.add(table);
                        inTable = true;
                    }
                }
                else {
                    //
                    // Set the flag false to indicate that all of the
                    // paragraphs in the table have been processed. A single
                    // blank line is sufficient to indicate the end of the
                    // table within the Word document.
                    //
                    // This is also the place to deal with any non-table
                    // paragraphs.
                    //
                    inTable = false;
                }
            }
            //
            // This line simply prints out the number of tables found in the
            // document - used for testing purposes here.
            //
            System.out.println("Found " + tables.size() + " tables in the document.");
        }
        catch(Exception ex) {
            System.out.println("Caught an: " + ex.getClass().getName());
            System.out.println("Message: " + ex.getMessage());
            System.out.println("Stacktrace follows:..............");
            ex.printStackTrace(System.out);
        }
        finally {
            if(bufIStream != null) {
                try {
                    bufIStream.close();
                    bufIStream = null;
                    fileIStream = null;
                }
                catch(Exception ex) {
                    // I G N O R E //
                }
            }
        }
        
    }
}
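
The comments in the class above mention that a table can be processed row by
row and cell by cell. A minimal sketch of that is below, assuming the same
HWPF classes as the listing above. The class name TableWalker, the printTable
helper and taking the file path from args[0] are illustrative choices only,
and it needs the POI jars on the classpath just like the class above.

```java
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.hwpf.usermodel.Table;
import org.apache.poi.hwpf.usermodel.TableCell;
import org.apache.poi.hwpf.usermodel.TableRow;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative class name, not part of POI.
class TableWalker {

    // Prints the text of every cell in the table, one cell per line.
    static void printTable(Table table) {
        for (int r = 0; r < table.numRows(); r++) {
            TableRow row = table.getRow(r);
            for (int c = 0; c < row.numCells(); c++) {
                TableCell cell = row.getCell(c);
                // TableCell is a Range, so text() returns its contents.
                // trim() strips the control character Word puts at the
                // end of each cell.
                System.out.println("Row " + r + ", cell " + c + ": "
                        + cell.text().trim());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the path to the .doc file is the first argument.
        try (InputStream in = new FileInputStream(args[0])) {
            HWPFDocument doc = new HWPFDocument(in);
            Range range = doc.getRange();
            boolean inTable = false;
            for (int i = 0; i < range.numParagraphs(); i++) {
                Paragraph para = range.getParagraph(i);
                if (para.isInTable()) {
                    if (!inTable) {
                        // First paragraph of a table: process it here
                        // rather than storing it for later.
                        printTable(range.getTable(para));
                        inTable = true;
                    }
                } else {
                    inTable = false;
                }
            }
        }
    }
}
```

A cell can hold more than one paragraph; because TableCell is a Range, you
can call numParagraphs() and getParagraph() on it if you need them
individually.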

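If the role of the inTable flag in the class above is not obvious, the
stand-alone sketch below shows the same logic with no POI involved. Each
boolean stands in for the result of para.isInTable(), and the method records
the index of the paragraph at which getTable() would be called. The names
TableRunDemo and tableStarts are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative class, not part of POI.
class TableRunDemo {

    // Returns the index of the first paragraph of each table. Each entry
    // in inTableFlags stands in for Paragraph.isInTable().
    public static List<Integer> tableStarts(boolean[] inTableFlags) {
        List<Integer> starts = new ArrayList<Integer>();
        boolean inTable = false;
        for (int i = 0; i < inTableFlags.length; i++) {
            if (inTableFlags[i]) {
                if (!inTable) {
                    // First paragraph of a new table; this is the only
                    // point at which getTable() would be called.
                    starts.add(i);
                    inTable = true;
                }
            } else {
                // A non-table paragraph ends the current table.
                inTable = false;
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        boolean[] flags = { false, true, true, true, false, true, true, false };
        // prints "Tables start at paragraphs: [1, 5]"
        System.out.println("Tables start at paragraphs: " + tableStarts(flags));
    }
}
```

Without the flag, getTable() would be called once for every paragraph inside
a table rather than once per table.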


[email protected] wrote:
> 
> Hi,
> Could anyone post the minimal code to get the list of all tables
> in a Word document? In particular, I want a class from which I can
> extract the "Table" object for each table in the document.
> Greetings
> 
> Enrico 

-- 
View this message in context: 
http://www.nabble.com/reading-Tables-tp22911444p22957480.html
Sent from the POI - User mailing list archive at Nabble.com.

