From: perl-win32-users-boun...@listserv.activestate.com [mailto:perl-win32-users-boun...@listserv.activestate.com] On Behalf Of Paul Rousseau
Sent: 02 November 2011 16:08
To: perl Win32-users
Subject: How to Extract a Date from a File

> Hello Perl folks,
>
> I would like to know if there is an eloquent way of extracting a date string
> from a file.
>
> My code goes like this:
>
> open (INFILE, "<$sourcedir\\$filename") || die "Can not open $sourcedir\\$filename $!\n";
> @filecontents = <INFILE>;
> close INFILE;
> @filecontents = map {chomp; $_} @filecontents;
>
> #
> # Within the file contents, look for the text, CurrentWeekLabel
> #
> # Here is a text sample.
> #
> # <div style="TEXT-ALIGN: center; min-width: 750px">
> # <div style="OVERFLOW: hidden; HEIGHT: 20px; TEXT-ALIGN: center"><span id="CurrentWeekLabel">Week Of: </span><span id="StartWeekLabel" style="font-weight:bold;">2011/10/29</span><span id="Label6" style="font-weight:bold;"> - </span><span id="EndWeekLabel" style="font-weight:bold;">2011/11/04</span></div>
> # <div style="OVERFLOW: hidden; HEIGHT: 24px; TEXT-ALIGN: center"><a id="PreviousWeekLinkButton" class="LinkButton" href="javascript:OnPreviousWeekLinkButtonClick ()" href="javascript:__doPostBack('PreviousWeekLinkButton','')">Prev</a><span id="Label20"> | </span><a onclick="SelectWeekButtonClick('PopupCalendar1', 'SelectWeekLinkButton'); return false;" id="SelectWeekLinkButton" class="LinkButton" href="javascript:__doPostBack('SelectWeekLinkButton','')">Select Week</a><span id="Label8"> | </span><a id="NextWeekLinkButton" class="LinkButton" href="javascript:OnNextWeekLinkButtonClick ()" href="javascript:__doPostBack('NextWeekLinkButton','')">Next</a></div>
> # <div style="OVERFLOW: hidden; OVERFLOW:visible; TEXT-ALIGN: center"><span id="StatusLabel" class="StatusLabel"></span></div>
> # </div>
> #
> # Obtain the year, month and day following the text, StartWeekLabel
> #
> @ans = grep (/StartWeekLabel.+\>(\d{4})\/(\d{2})\/(\d{2})\<\/span/si, @filecontents);
> #
> # Build the start date from the matches.
> #
> $start_date = $1 . $2 . $3
>
> I was wondering if there was a neat way to avoid using @ans as a temporary
> variable, and extract the "2011/10/29" straight into $start_date so that
> $start_date = "20111029"

Using regular expressions to parse HTML is not usually recommended. Prefer the modules that specialise in parsing it. There are also other ways to extract the date elements, and modules to validate them. For example...

-----------------------------------------------------------
use strict;
use warnings;

use HTML::TreeBuilder;
use Date::Calc qw{check_date};

# Parse the HTML (read here from the __DATA__ section below).
my $root = HTML::TreeBuilder->new_from_file(*DATA);
defined $root or die "Failed to parse\n";

# Find the element whose id attribute is StartWeekLabel.
my $element = $root->look_down("id", "StartWeekLabel");
defined $element or die "Failed to locate id=StartWeekLabel\n";

my $rawdate = $element->as_trimmed_text();
print "Raw date '$rawdate'\n";

# Split into (year, month, day); only call check_date with exactly three parts.
my @date = split "/", $rawdate;
if (@date == 3 and check_date(@date)) {
    # Joining the parts gives the compact form asked for, e.g. 20111029.
    my $start_date = join "", @date;
    print "Date looks OK: '$start_date'\n";
}
else {
    print "That date looks invalid\n";
}

__DATA__
<div style="TEXT-ALIGN: center; min-width: 750px">
<div style="OVERFLOW: hidden; HEIGHT: 20px; TEXT-ALIGN: center"><span id="CurrentWeekLabel">Week Of: </span><span id="StartWeekLabel" style="font-weight:bold;">2011/10/29</span><span id="Label6" style="font-weight:bold;"> - </span><span id="EndWeekLabel" style="font-weight:bold;">2011/11/04</span></div>
<div style="OVERFLOW: hidden; HEIGHT: 24px; TEXT-ALIGN: center"><a id="PreviousWeekLinkButton" class="LinkButton" href="javascript:OnPreviousWeekLinkButtonClick ()" href="javascript:__doPostBack('PreviousWeekLinkButton','')">Prev</a><span id="Label20"> | </span><a onclick="SelectWeekButtonClick('PopupCalendar1', 'SelectWeekLinkButton'); return false;" id="SelectWeekLinkButton" class="LinkButton" href="javascript:__doPostBack('SelectWeekLinkButton','')">Select Week</a><span id="Label8"> | </span><a id="NextWeekLinkButton" class="LinkButton" href="javascript:OnNextWeekLinkButtonClick ()"
href="javascript:__doPostBack('NextWeekLinkButton','')">Next</a></div>
<div style="OVERFLOW: hidden; OVERFLOW:visible; TEXT-ALIGN: center"><span id="StatusLabel" class="StatusLabel"></span></div>
</div>
-----------------------------------------------------------

--
Brian Raven
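
If Date::Calc is not available, the validation step and the original goal of ending up with $start_date = "20111029" can probably be covered with the core Time::Local module alone. The following is only a minimal sketch under that assumption: compact_date is a hypothetical helper name, not something from the original post, and it assumes the text pulled out of the StartWeekLabel element is always in strict YYYY/MM/DD form. The pattern here only checks the shape of that short string; it does not parse the HTML.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper (an assumption for illustration): returns "YYYYMMDD"
# for a valid "YYYY/MM/DD" string, or undef if the shape or the date is bad.
sub compact_date {
    my ($raw) = @_;
    return unless defined $raw;

    # Check the shape of the already-extracted text and capture the parts.
    my ($y, $m, $d) = $raw =~ m{\A(\d{4})/(\d{2})/(\d{2})\z} or return;

    # timegm croaks on an out-of-range month or day, so a failed eval
    # means the date does not exist.
    return unless defined eval { timegm(0, 0, 0, $d, $m - 1, $y) };

    return "$y$m$d";
}

my $start_date = compact_date('2011/10/29');
print defined $start_date ? "start_date = $start_date\n" : "invalid date\n";
```

In the script above, the string passed to compact_date would be the value returned by as_trimmed_text(). Returning undef rather than dying leaves it to the caller to decide how to handle a bad date.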