[email protected] wrote:
I would like to fetch date/time from html file, and use date comparison
and make an ics/vcal file eventually

the date comes as so:

<tr><td colspan="1" rowspan="1">Start Date Time: </td><td colspan="1"
rowspan="1">20/03/2014 1400 Thursday</td></tr>

is 'grep -o' the way to go ? what regex do I need where I put ???? ?

grep -o '<tr><td colspan="1" rowspan="1">Start Date Time: </td><td
colspan="1" rowspan="1">????</td></tr>'
I would recommend egrep (the same as grep -E) with the following extended
regular expression:

   egrep -o '[0-9]{2}/[0-9]{2}/[0-9]{4}[[:space:]][0-9]{4}'

which gives you output like the following:

20/03/2014 1400
19/02/2014 1553
03/01/2013 0114
03/11/2012 1514

for a contrived input containing four dates formatted like the one in your
example.
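
Note that the pattern above matches every date/time in the file. If you only
want the one that follows the "Start Date Time:" label, something like the
following rough sketch might do. The function name start_datetimes is made up
for illustration, and it assumes each table row sits on a single line of the
HTML file (the line break in your sample may just be mail wrapping):

```shell
#!/bin/bash
# start_datetimes is a hypothetical helper name, used only for illustration.
# Assumption: each <tr>...</tr> row sits on one line of the HTML input.
# The first grep keeps the label plus the date/time that follows it,
# the second grep strips everything except the date/time itself.
start_datetimes() {
    grep -E -o 'Start Date Time: *</td><td[^>]*>[0-9]{2}/[0-9]{2}/[0-9]{4}[[:space:]][0-9]{4}' |
        grep -E -o '[0-9]{2}/[0-9]{2}/[0-9]{4}[[:space:]][0-9]{4}'
}

# Sample row taken from the question; a real run would read the HTML file.
printf '%s\n' '<tr><td colspan="1" rowspan="1">Start Date Time: </td><td colspan="1" rowspan="1">20/03/2014 1400 Thursday</td></tr>' |
    start_datetimes
```

For the sample row this prints "20/03/2014 1400".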


what do I need to do with date to be able to compare it to a date range?
If you use the ISO 8601 format for all dates/times in your script, life will
be a lot easier, because such strings sort chronologically when compared as
plain text, e.g.

   convert "20/03/2014 1400" to "20140320T1400", store it in $datetime
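
One possible way to do that conversion, as a minimal sketch: the helper name
to_iso is invented here, and it assumes the input always has the shape
"DD/MM/YYYY HHMM", optionally followed by a weekday as in your HTML:

```shell
#!/bin/bash
# to_iso is a hypothetical helper name, not part of any existing script.
# Rearranges "DD/MM/YYYY HHMM" into the ISO 8601 basic form "YYYYMMDDTHHMM".
# Anything after the time (e.g. " Thursday") is dropped.
to_iso() {
    printf '%s\n' "$1" |
        sed -E 's#^([0-9]{2})/([0-9]{2})/([0-9]{4})[[:space:]]+([0-9]{4}).*$#\3\2\1T\4#'
}

datetime=$(to_iso "20/03/2014 1400")
echo "$datetime"
```

This prints "20140320T1400". Input that does not match the expected shape is
passed through unchanged, so a real script would probably want to check for
that.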

Then you can specify a date/time range on the command line as your lower
and upper bounds, $datelow and $datehigh, for comparison purposes, e.g.

   my_date_script.bash  20100101T0000  20141231T2359

and inside the script only accept date/times within range:

   # $datelow and $datehigh hold the two command-line bounds ($1 and $2)
   if [ "$datelow" \> "$datetime" ] || [ "$datetime" \> "$datehigh" ]; then
       echo "Datetime $datetime is not in range [ $datelow, $datehigh ]"
   else
       echo "Found datetime $datetime"
   fi
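
If you end up with a whole list of converted date/times, the same comparison
can be applied to each line of input. The following is only a sketch under
that assumption: filter_range is a made-up name, the bounds are inclusive, and
every line is expected to already be in the YYYYMMDDTHHMM form:

```shell
#!/bin/bash
# filter_range is a hypothetical helper, not part of the original script.
# Reads ISO 8601 basic date/times (YYYYMMDDTHHMM), one per line, on stdin
# and prints those that fall within [low, high], bounds included.
filter_range() {
    local low=$1 high=$2 datetime
    while IFS= read -r datetime; do
        # Plain string comparison works because the format is fixed-width.
        if [[ "$datetime" < "$low" || "$datetime" > "$high" ]]; then
            continue
        fi
        printf '%s\n' "$datetime"
    done
}

printf '%s\n' 20140320T1400 20091231T2359 20150101T0000 |
    filter_range 20100101T0000 20141231T2359
```

With these three sample values only "20140320T1400" is printed; the other two
fall outside the bounds.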

Of course, all of this would be much easier to code in python, perl or ruby.

HTH!

cheers
rickw


--
------------------------------------
Rick Welykochy || Vitendo Consulting

Gerrold's Fundamental Truth:
It's a good thing money can't buy happiness.
We couldn't stand the commercials.

--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
