Re: REFindnocase - Parsing URL's - final code

Jon Hall Sat, 02 Jun 2001 15:47:11 -0700
For anyone else who is interested, here is the final code.
This program will parse URLs out of any file.

Still think there is a better way to do this though...

jon

<cffile action="READ" file="c:\www\test\html.htm" variable="h">

<cfset arrString = ArrayNew(1)>
<cfset rs = ArrayResize(arrString,len(h))>

<cfloop from="1" to="#len(h)#" index="i">
 <cfset arrString[i] = mid (h,i,1)>
</cfloop>

<cfset markStart = 0>
<cfset markEnd = 0>
<cfset markArray = ArrayNew(1)>
<cfloop from="1" to="#arrayLen(arrString)#" index="i">
 <cfif arrString[i] EQ "<">
  <cfset rs = ArrayAppend(markArray,i)>
 </cfif>
 <cfif arrString[i] EQ ">">
  <cfset rs = ArrayAppend(markArray,i)>
 </cfif>
</cfloop>

<cfset hrefArray = ArrayNew(1)>
<!--- Stop one short of the end so markArray[i + 1] always exists --->
<cfloop index="i" from="1" to="#arrayLen(markArray) - 1#" step="2">
 <cfset linklen = markArray[i + 1] - markArray[i]>
 <cfif mid(h,markArray[i],2) EQ "<a">
  <cfset rs = arrayAppend(hrefArray,replace(mid(h,markArray[i],linklen),"<","&lt;","ALL"))>
<!---    <cfoutput>
    #replace(mid(h,markArray[i],linklen),"<","&lt;","ALL")#<br>
   </cfoutput> --->
 </cfif>
</cfloop>

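<cfloop from="1" to="#arrayLen(hrefArray)#" index="i">
 <cfset starthrefPos = find("http",hrefArray[i],1)>
 <cfif starthrefPos NEQ 0>
  <cfset endhrefPos = find(chr(34),hrefArray[i],starthrefPos)>
  <!--- Skip tags with no closing double quote; mid() needs a positive length --->
  <cfif endhrefPos GT starthrefPos>
   <cfset urlLength = endhrefPos - starthrefPos>
   <!--- Named linkUrl so it does not clash with ColdFusion's URL scope --->
   <cfset linkUrl = mid(hrefArray[i],starthrefPos, urlLength)>
   <cfoutput>#linkUrl#</cfoutput><br>
  </cfif>
 </cfif>
</cfloop>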
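
On the "better way" question: a minimal, untested sketch of a regular-expression version, assuming h already holds the HTML text as in the code above (read with cffile or set from any other variable). The names hrefPattern, foundUrls, startPos and match are made up for this example. It uses REFindNoCase with the returnsubexpressions argument and POSIX character classes, and unlike the code above it returns every double-quoted href value, not only the ones containing "http". I have not benchmarked it against the array version.

```cfml
<!--- Sketch only: h holds the HTML text, as in the code above --->
<cfset hrefPattern = '<a[[:space:]][^>]*href[[:space:]]*=[[:space:]]*"([^"]*)"'>
<cfset foundUrls = ArrayNew(1)>
<cfset startPos = 1>

<cfloop condition="startPos LTE len(h)">
 <cfset match = REFindNoCase(hrefPattern, h, startPos, "True")>
 <!--- pos[1] is 0 when there are no more matches --->
 <cfif match.pos[1] EQ 0>
  <cfbreak>
 </cfif>
 <!--- pos[2] and len[2] describe the parenthesized part, i.e. the href value --->
 <cfif match.len[2] GT 0>
  <cfset rs = ArrayAppend(foundUrls, mid(h, match.pos[2], match.len[2]))>
 </cfif>
 <!--- Continue searching just past the end of this match --->
 <cfset startPos = match.pos[1] + match.len[1]>
</cfloop>

<cfloop from="1" to="#arrayLen(foundUrls)#" index="i">
 <cfoutput>#foundUrls[i]#</cfoutput><br>
</cfloop>
```

This does the tag search and the href extraction in one pass over h, so the character array and the mark array are not needed. Links written with single quotes or no quotes at all would still be missed.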




----- Original Message -----
From: "Jon Hall" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Saturday, June 02, 2001 6:16 PM
Subject: Re: REFindnocase - Parsing URL's


> Simple, just delete the first line and change the name of your variable
> to 'h'.
>
> This program only parses out the whole <a href ...> tag though. In order to
> get just the actual url, I'd probably just stick all of the parsed href tags
> in another array then parse for href=.
>
> I am actually going to extend this program to do this anyway. So I'll make a
> follow up post with the modified source. I really just needed to do this for
> a one off program, and it has kinda morphed into something a little more,
> simply since it's a challenge ;-)
> If you have access to irc, I will be idling in #coldfusion on efnet. /nick
> flux0
>
> jon
> ----- Original Message -----
> From: "W Luke" <[EMAIL PROTECTED]>
> To: "CF-Talk" <[EMAIL PROTECTED]>
> Sent: Saturday, June 02, 2001 5:22 PM
> Subject: Re: REFindnocase - Parsing URL's
>
>
> > Jon,
> >
> > How might I change this to searching inside a variable that contains the
> > text, and not a file as you have done?
> >
> > Will
> >
> >
> > --
> > Will
> > Free Advertising-=- www.localbounty.com
> > e: [EMAIL PROTECTED]  icq: 31099745
> >
> >
> > ----- Original Message -----
> > From: "Jon Hall" <[EMAIL PROTECTED]>
> > Newsgroups: cf-talk
> > Sent: Saturday, June 02, 2001 9:57 PM
> > Subject: Re: REFindnocase - Parsing URL's
> >
> >
> > > Wow, now this is too much of a coincidence. I just opened up my email
> > > program to post a message saying I had just successfully written a program
> > > that parses URLs out of a document, and was just wondering if anyone had a
> > > better way to do it. Well here is how I did it.
> > >
> > > If anyone knows of a faster way I am definitely interested. I imagine
> > > regular expressions would be much faster...
> > > Cfscripting this would most likely make it faster too, but for readability I
> > > am leaving it in regular cfml for now.
> >
> >
> >
> >
>