Hello,

From time to time, someone sends me a PDF containing tables of product
price updates. I found xpdf and its pdftotext shell tool to export the
text from the PDF file. I want to extract the interesting lines from
this text file in order to import the relevant data into a database. I
found out how to use grep to do that.
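
For illustration, the grep step could look something like the sketch
below. It is only an assumption about one way to do it: the function
name price_lines and the file name prices.pdf are placeholders, and the
pattern assumes every price row ends with four amounts written with a
decimal comma.

```shell
# Hypothetical helper: keep only the lines that end with four
# decimal-comma amounts, e.g. "0,28   5,60   0,30   5,90".
price_lines() {
    grep -E '([0-9]+,[0-9]{2}[[:space:]]+){3}[0-9]+,[0-9]{2}[[:space:]]*$'
}

# Assumed usage (prices.pdf is a placeholder file name):
#   pdftotext -layout prices.pdf - | price_lines
```

This only selects the price rows; it does nothing about labels that are
split over several lines.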

Most of the lines look like this one:

  26390 PRODUCT
0,28              5,60           0,30              5,90

But sometimes the label is too long for the PDF table cell, and
pdftotext exports this instead:

        START OF A VERY VERY VERY LONG LABEL WITH
 
14604
0,30            14,90            0,30          15,00
        SOME OTHER INFORMATION AT THE END

Can sed help me rebuild the full product label on a single line?
Can you help me write that program?

Many thanks in advance, best regards,

--
Pierre Y.
[EMAIL PROTECTED] (please get rid of the underscores)
