On Aug 30, 2006, at 6:17 PM, Christopher Jett wrote:

I am needing to validate the contents of an EditField as a properly formatted URL. Does someone have a good suggestion on how to accomplish this? Perhaps a REGEX?

This is probably more than you want to hear, but here it goes...

There are basically two parts to a URL (but sometimes three): the protocol and the address. These parts are separated by a "://" string, so the first task is to separate the protocol from the address. Start with the function declaration:

  Function IsValidURL(source As String) As Boolean

Since a valid URL *cannot* contain spaces, it is easy to test for them and reject the URL if any are found. The Trim() function below removes whitespace from the beginning and end of the string, so accidental leading or trailing invisible characters are ignored. We only want to reject the string if there is a space within the URL.

    Dim tsource As String = Trim(source)
    If (tsource.InStr(" ") > 0) Then Return False

    Dim s() As String = Split(tsource, "://")

The great thing about the Split command is that it returns the entire string in slot s(0) if the substring is not found. Therefore we can test whether the source string had "://" by checking that UBound(s) > 0. However, a valid URL must contain exactly one "://" sequence, so you should reject the string if it does not split into exactly two parts.

    If UBound(s) <> 1 Then Return False

You can *assume* that the user forgot to put in "http://" and add it in for them. That is your call, but for the purposes of this function a string without a protocol is not a valid URL.
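If you do decide to add the protocol for the user, one option is a small helper that runs before the validation. This is only a sketch: AddDefaultProtocol is a made-up name, and the choice of "http" as the default is an assumption.

```realbasic
  // Hypothetical helper, not part of IsValidURL: assumes "http" when
  // the user typed an address without any protocol.
  Function AddDefaultProtocol(source As String) As String
    Dim tsource As String = Trim(source)
    If tsource.InStr("://") = 0 Then tsource = "http://" + tsource
    Return tsource
  End Function
```

You would then validate the result of the helper, for example IsValidURL(AddDefaultProtocol(EditField1.Text)), where EditField1 stands in for whatever your EditField is called.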

There is an optional third part to a URL: the username and password. It is separated from the address by the "@" symbol, and anything before the "@" is *not* part of the website address. This used to be a common trick among spammers to make you think you were going to a different website. For example, this URL actually points to your local computer:

    http://www.realsoftware.com/download/[EMAIL PROTECTED]/

So you should really test for the "@" symbol also with the same approach above:

    Dim a() As String = Split(s(1), "@")

We don't care about the user name and password, so if a() has more than one element, we can just copy the second part back into our s() variable.

    If UBound(a) > 0 Then s(1) = a(1)

Finally, you are left with the address and file request portion of the URL. For the address to be valid, it must be either an IP address (192.168.1.1) or a domain name (www.website.com). Either form requires a period, so you could just test for that. But to make sure the period is part of the address, it has to come before the first "/" character (if present). The line below returns False only for something like "absdfajl/test.html", where the first period appears after the first "/".

    If (s(1).InStr("/") > 0) And (s(1).InStr(".") > s(1).InStr("/")) Then Return False

    // if you got this far then the URL is good
    Return True
  End Function
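One gap worth noting: the period test in the function only catches a period that shows up after the first "/". An address with no period at all, such as "http://abc/test", still passes. If you want to insist on a period in the host part, a stricter check could look something like the sketch below. HostHasPeriod is a hypothetical name, and the rule itself is an assumption that will also reject addresses like "http://localhost/".

```realbasic
  // Hypothetical stricter test. address is the part of the URL that
  // remains after removing the protocol and any "user:password@" prefix.
  Function HostHasPeriod(address As String) As Boolean
    Dim host As String = address

    // keep only the text before the first "/", if there is one
    Dim slashPos As Integer = address.InStr("/")
    If slashPos > 0 Then host = Left(address, slashPos - 1)

    // the host part must contain at least one period
    Return host.InStr(".") > 0
  End Function
```

Inside IsValidURL you could call it just before the final Return True, as in: If Not HostHasPeriod(s(1)) Then Return False.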

This is just about all that you need to verify that the address of the URL is properly formatted. There is no need to check for ".com" or other top-level domains because a URL does not have to point to a website. For example, Apple file sharing (AFP) lets you connect to other computers via a URL written like "afp://myG4.local/". Network printers can also be addressed by URL, and anyone can write a custom protocol and tag. Which goes to the final point...

The above code can be used to determine whether a string is a valid URL, but the protocol has not been tested. There are http, https, ftp, ftps, and feed, to name just a few of the more common ones. If you care about which protocols you are willing to support, you can add additional tests for this. For example, you could add a "supported protocols" parameter listing the protocols you want to accept. I would pass a single string with the items separated by commas or semicolons: "http;https;feed".
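A rough sketch of that protocol test follows. It is written as a separate function so that IsValidURL stays as it is. The name IsSupportedProtocol and its parameters are made up for illustration, and a semicolon is assumed as the separator.

```realbasic
  // Hypothetical companion to IsValidURL. supported is a
  // semicolon-separated list such as "http;https;feed".
  Function IsSupportedProtocol(source As String, supported As String) As Boolean
    // the protocol is everything before the "://"
    Dim s() As String = Split(Trim(source), "://")
    If UBound(s) <> 1 Then Return False
    If s(0) = "" Then Return False

    // compare the protocol against each entry in the list
    Dim protocols() As String = Split(supported, ";")
    Dim i As Integer
    For i = 0 To UBound(protocols)
      If Trim(protocols(i)) = s(0) Then Return True
    Next

    Return False
  End Function
```

String comparison with "=" is not case-sensitive, so "HTTP://..." matches "http" in the list. To accept a URL only when both tests pass, combine them: If IsValidURL(url) And IsSupportedProtocol(url, "http;https;feed") Then ...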

The complete function, without the running commentary:

  Function IsValidURL(source As String) As Boolean

    // Get rid of accidental whitespace
    Dim tsource As String = Trim(source)

    // reject if any space characters remain
    If (tsource.InStr(" ") > 0) Then Return False

    // separate the two main parts
    Dim s() As String = Split(tsource, "://")

    // if not exactly two parts, then reject as an invalid URL
    If UBound(s) <> 1 Then Return False

    // ignore the username:password format separated by the "@" symbol
    Dim a() As String = Split(s(1), "@")
    If UBound(a) > 0 Then s(1) = a(1)

    // reject if the first period comes after the first "/"
    If (s(1).InStr("/") > 0) And (s(1).InStr(".") > s(1).InStr("/")) Then Return False

    // if you got this far then the URL is good
    Return True
  End Function

