Thanks, I will give it a try.
I have Ubuntu 16.04.2 running in VirtualBox on my Mac.
I have about 900 PDF files to convert.


On Friday, April 21, 2017 at 1:37:14 PM UTC-4, Bob Weber wrote:
>
> I use Debian Linux.  It was easy to convert the PDF file to text with the
> command "pdftotext -raw file.pdf".  It produced a file like this:
>
> WeatherCat Daily Report For Jan 1, 2015 
> Hour TempHiTempLo HeatHi HeatLo ChillHi ChillLo DewPHi DewPLo HumHi HumLo 
> PresHi PresLo R/hHi R/hLo Rain AvWsHi AvWsLo GustHi GustLo WDir WRun SolHi 
> SolLo UVHi UVLo 
> 0 25.3 24.2 25.3 24.2 25.3 24.2 15.7 12.9 69 60 30.20 30.18 0.00 0.00 0.00 
> 0 0 3 0 275 0.0 0 0 0.0 0.0 
> 1 28.4 25.3 28.4 25.3 28.4 25.3 14.3 11.6 62 49 30.19 30.17 0.00 0.00 0.00 
> 2 0 9 0 282 0.6 0 0 0.0 0.0 
> 2 28.0 27.4 28.0 27.4 28.0 27.4 12.5 11.3 52 49 30.17 30.16 0.00 0.00 0.00 
> 1 0 5 0 183 0.1 0 0 0.0 0.0 
> 3 27.8 27.1 27.8 27.1 27.8 27.1 11.7 10.6 51 48 30.17 30.16 0.00 0.00 0.00 
> 1 0 4 0 252 0.1 0 0 0.0 0.0 
> 4 27.5 25.5 27.5 25.5 27.5 25.5 11.2 10.3 54 49 30.16 30.15 0.00 0.00 0.00 
> 0 0 2 0 251 0.0 0 0 0.0 0.0 
> 5 25.5 24.7 25.5 24.7 25.5 24.7 12.9 11.0 59 54 30.16 30.15 0.00 0.00 0.00 
> 0 0 2 0 275 0.0 0 0 0.0 0.0 
> 6 24.7 23.6 24.7 23.6 24.7 23.6 15.4 12.1 69 58 30.16 30.15 0.00 0.00 0.00 
> 0 0 1 0 275 0.0 0 0 0.0 0.0 
> 7 24.0 23.5 24.0 23.5 24.0 23.5 14.0 12.8 65 63 30.16 30.15 0.00 0.00 0.00 
> 0 0 3 0 279 0.0 0 0 0.0 0.0 
> 8 28.2 24.0 28.2 24.0 28.2 24.0 18.7 14.0 68 64 30.19 30.16 0.00 0.00 0.00 
> 0 0 3 0 276 0.0 0 0 0.0 0.0 
> 9 34.2 28.2 34.2 28.2 34.2 28.2 19.5 16.6 68 51 30.19 30.19 0.00 0.00 0.00 
> 0 0 3 0 186 0.0 0 0 0.0 0.0 
> 10 37.4 34.3 37.4 34.3 37.4 34.3 20.4 18.1 53 47 30.19 30.15 0.00 0.00 
> 0.00 1 0 7 0 181 0.6 0 0 0.0 0.0 
> 11 40.5 37.3 40.5 37.3 40.5 37.3 21.0 18.5 49 43 30.15 30.10 0.00 0.00 
> 0.00 2 1 6 0 209 1.0 0 0 0.0 0.0 
> 12 42.4 40.5 42.4 40.5 42.4 39.0 20.9 18.2 44 40 30.10 30.06 0.00 0.00 
> 0.00 3 0 11 1 217 1.8 0 0 0.0 0.0 
> 13 42.4 41.8 42.4 41.8 42.4 41.7 19.8 18.1 40 38 30.06 30.03 0.00 0.00 
> 0.00 2 1 8 1 208 1.3 0 0 0.0 0.0 
> 14 42.6 41.6 42.6 41.6 42.6 41.5 20.0 18.0 41 37 30.03 30.01 0.00 0.00 
> 0.00 2 1 13 0 233 1.5 0 0 0.0 0.0 
> 15 41.8 40.8 41.8 40.8 41.8 40.8 21.3 18.7 45 39 30.02 30.01 0.00 0.00 
> 0.00 2 1 7 0 217 1.1 0 0 0.0 0.0 
> 16 40.8 38.5 40.8 38.5 40.8 38.5 20.7 19.5 47 42 30.02 30.01 0.00 0.00 
> 0.00 1 0 5 0 218 0.5 0 0 0.0 0.0 
> 17 38.6 36.9 38.6 36.9 38.6 36.9 20.5 19.5 49 47 30.02 30.00 0.00 0.00 
> 0.00 1 0 6 0 229 0.1 0 0 0.0 0.0 
> 18 36.9 34.6 36.9 34.6 36.9 34.6 23.0 19.4 62 49 30.01 30.00 0.00 0.00 
> 0.00 0 0 0 0 229 0.0 0 0 0.0 0.0 
> 19 34.6 32.8 34.6 32.8 34.6 32.8 23.5 21.0 65 61 30.01 30.00 0.00 0.00 
> 0.00 0 0 1 0 229 0.0 0 0 0.0 0.0 
> 20 32.8 30.7 32.8 30.7 32.8 30.7 23.1 22.3 71 66 30.01 30.00 0.00 0.00 
> 0.00 0 0 1 0 229 0.0 0 0 0.0 0.0 
> 21 30.7 29.4 30.7 29.4 30.7 29.4 22.5 21.2 72 70 30.03 30.01 0.00 0.00 
> 0.00 0 0 0 0 229 0.0 0 0 0.0 0.0 
> 22 29.4 29.2 29.4 29.2 29.4 29.2 23.3 21.1 78 71 30.05 30.03 0.00 0.00 
> 0.00 0 0 0 0 229 0.0 0 0 0.0 0.0 
> 23 29.5 28.3 29.5 28.3 29.5 28.3 23.4 21.8 78 76 30.05 30.04 0.00 0.00 
> 0.00 0 0 2 0 273 0.0 0 0 0.0 0.0 
> Daily High 42.6 41.8 42.6 41.8 42.6 41.7 23.5 22.3 78 76 30.20 30.19 0.00 
> 0.00 0.00 3 1 13 1 - 1.8 0 0 0.0 0.0 
> Daily Low 24.0 23.5 24.0 23.5 24.0 23.5 11.2 10.3 40 37 30.01 30.00 0.00 
> 0.00 0.00 0 0 0 0 - 0.0 0 0 0.0 0.0 
> Daily Average 33.1 31.3 33.1 31.3 33.1 31.2 18.7 16.6 59 53 30.10 30.09 
> 0.00 0.00 0.00 1 0 4 0 236 0.4 0 0 0.0 0.0 
> Daily Total 0.00 8.9
>
> --------------
>
> So with a little bash programming:
> for f in *.pdf; do pdftotext -raw "$f"; done
> all the files will be converted in one command line.  Note that you need
> the double quotes around $f since the file names have spaces in them.
>
> Now for a little Python programming: take the date from the first line,
> combine it with the hour of each row, and convert the result to the time
> format you need (like epoch).  The lines of interest appear to start with
> a number (the hour), with each field separated by white space.  Just use
> one of the neat csv libraries to write the rows back out in CSV format.
>
> I would just ignore the daily high and low lines, since once you have the
> data in csv/sqlite you can find these values easily.
>
> I have just been playing with my own station data by downloading the WU
> data for my station (back to 2008) and converting it to sqlite/PostgreSQL in
> Python as a way to learn Python (I'm an old C programmer).
>
> If you don't use Debian or a similar Linux, then try a live Debian CD in a VM
> (like VirtualBox) to do the conversions.  You will probably need to
> install pdftotext with "apt-get install poppler-utils".
>
> ...Bob
>
> On Friday, April 21, 2017 at 10:42:22 AM UTC-4, MRL wrote:
>>
>> I have WeatherCat daily data in PDF files.
>> I thought I had a program to convert the PDF files to text files, but the
>> conversion is a mess and unusable.
>> Any help? I have 3+ years of daily data.
>> I still have WeatherCat but have not been able to find an export
>> capability.
>>
>>
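
The Python step Bob describes above (read the date from the first line, add the hour from each row, write CSV) might look roughly like the sketch below. It rests on some assumptions: that each hourly row sits on a single line in the real pdftotext output (the rows quoted above look wrapped by the mailer), that the title uses an abbreviated English month name such as "Jan", and that local time is what you want for the epoch value. The function names parse_report and write_csv and the dateTime column name are made up for illustration.

```python
import csv
import re
from datetime import datetime, timedelta

# Column names from the report header (26 values per hourly row).
COLUMNS = [
    "Hour", "TempHi", "TempLo", "HeatHi", "HeatLo", "ChillHi", "ChillLo",
    "DewPHi", "DewPLo", "HumHi", "HumLo", "PresHi", "PresLo", "R/hHi",
    "R/hLo", "Rain", "AvWsHi", "AvWsLo", "GustHi", "GustLo", "WDir",
    "WRun", "SolHi", "SolLo", "UVHi", "UVLo",
]

# Matches e.g. "WeatherCat Daily Report For Jan 1, 2015".
TITLE_RE = re.compile(r"Daily Report For (\w+ \d{1,2}, \d{4})")


def parse_report(text):
    """Yield one dict per hourly row, with an added epoch 'dateTime' field."""
    lines = text.splitlines()
    match = TITLE_RE.search(lines[0]) if lines else None
    if match is None:
        raise ValueError("first line does not contain a report date")
    # Assumption: abbreviated English month name ("Jan", "Feb", ...).
    day = datetime.strptime(match.group(1), "%b %d, %Y")
    for line in lines[1:]:
        fields = line.split()
        # Hourly rows start with the hour and carry one value per column;
        # the header and the "Daily ..." summary lines are skipped.
        if len(fields) != len(COLUMNS) or not fields[0].isdigit():
            continue
        row = dict(zip(COLUMNS, fields))
        stamp = day + timedelta(hours=int(fields[0]))
        # Naive datetime: timestamp() interprets it in the local time zone.
        row["dateTime"] = int(stamp.timestamp())
        yield row


def write_csv(rows, out):
    """Write rows to the open text file 'out', epoch time first, Hour dropped."""
    writer = csv.DictWriter(
        out, fieldnames=["dateTime"] + COLUMNS[1:], extrasaction="ignore"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

To try it on one converted file, read the .txt file that pdftotext produced into a string, pass it to parse_report, and hand the result to write_csv together with an open output file (or sys.stdout). If the rows in your files really are split across two lines, they would need to be joined first, since this sketch silently skips any line that does not have all 26 values.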

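On ignoring the Daily High and Daily Low lines: once the hourly rows are in SQLite, the same figures come out of a MAX/MIN query. A minimal sketch, using a hypothetical in-memory table named hourly with only three of the columns, and the first three hourly rows of the report quoted above as sample data:

```python
import sqlite3


def daily_extremes(rows):
    """Return (highest TempHi, lowest TempLo) for (hour, temp_hi, temp_lo) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema for illustration; a real table would hold all columns.
    conn.execute("CREATE TABLE hourly (hour INTEGER, temp_hi REAL, temp_lo REAL)")
    conn.executemany("INSERT INTO hourly VALUES (?, ?, ?)", rows)
    high, low = conn.execute(
        "SELECT MAX(temp_hi), MIN(temp_lo) FROM hourly"
    ).fetchone()
    conn.close()
    return high, low


# Hours 0 to 2 of the Jan 1, 2015 report: (hour, TempHi, TempLo).
first_hours = [(0, 25.3, 24.2), (1, 28.4, 25.3), (2, 28.0, 27.4)]
print(daily_extremes(first_hours))  # prints (28.4, 24.2)
```

With all 24 hourly rows loaded, the same query should reproduce the 42.6 and 23.5 shown on the Daily High and Daily Low lines of the report.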