Designing a Date Format-aware FTP Entry Parser

After several years on the back burner as an unresolved issue, the problem of parsing FTP entries from servers that format the file timestamps in their directory listings in something other than the NetComponents "standard" format finally has some momentum behind it.
To understand what must be done, it helps to understand what we do now. In brief, we use a regular expression to achieve basically the same result as parsing the date portion of the listing with one of two alternate java.text.SimpleDateFormats in the en_US locale:

1. MMM dd HH:mm for dates within one year of the current time
2. MMM dd yyyy for dates older than one year

Additionally, these formats presume some timezone, which I presume is either the local timezone of the server or GMT.

The alternative mechanism that I am proposing would remove the parsing of the timestamp from the responsibilities of the regular expression and offload it onto some other object. But what object?

The obvious candidate would be java.text.DateFormat. This abstract class allows a formatter object to be created on the basis of a style constant defined in DateFormat (LONG, MEDIUM, SHORT) and a Locale. But this is problematic, because MEDIUM in en_US means a string like "Sep 25, 2004", while in de_DE you get a string like "25.09.2004". The style constants give us no control over the actual layout, and the layout is exactly what differs from server to server. This just won't do.

So we have to fall back on java.text.SimpleDateFormat, passing in both a specific formatting string and a Locale, which provides the month names, etc. (By the way, has anyone ever noticed that SimpleDateFormat is actually less simple than DateFormat?) :-)

The regular expression would merely extract the entire timestamp portion from the listing and delegate the task of parsing it to a pair of SimpleDateFormat objects (one for entries less than one year old and the other for entries one year old or older), each constructed from a format string and a Locale. Since the Locale should be the same for both formats, we would require the user to provide the two format Strings and the Locale (or possibly the constituent elements of the Locale, the country code and language code).

We want an object that encapsulates all of that, say, org.apache.commons.net.ftp.parser.FTPDateFormat.
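To make the proposal a little more concrete, here is a minimal sketch of what such a class might look like. Nothing here exists yet: the class body, the constructor arguments, the parse(String, Date) signature, the use of an unchecked exception, and the rule for guessing the missing year are all assumptions made for illustration only.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch of the proposed FTPDateFormat; not an existing commons-net class.
final class FTPDateFormat {
    private final SimpleDateFormat recentFormat; // no year field, e.g. "MMM dd HH:mm"
    private final SimpleDateFormat olderFormat;  // has a year field, e.g. "MMM dd yyyy"
    private final TimeZone zone;

    FTPDateFormat(String recentPattern, String olderPattern, Locale locale, TimeZone zone) {
        this.zone = zone;
        this.recentFormat = new SimpleDateFormat(recentPattern, locale);
        this.olderFormat = new SimpleDateFormat(olderPattern, locale);
        for (SimpleDateFormat f : new SimpleDateFormat[] {recentFormat, olderFormat}) {
            f.setTimeZone(zone);
            f.setLenient(false); // reject out-of-range fields instead of rolling them over
        }
    }

    // Parses the timestamp portion of one listing entry. "now" is passed in
    // so that the guess for the missing year is repeatable.
    Calendar parse(String timestamp, Date now) {
        Calendar result = new GregorianCalendar(zone);

        Date recent = parseFully(recentFormat, timestamp);
        if (recent != null) {
            // The recent form carries no year, so SimpleDateFormat yields 1970.
            // Assumption: use the current year, or the previous year if that
            // would put the file in the future. (Feb 29 is not handled here,
            // because 1970 is not a leap year.)
            Calendar nowCal = new GregorianCalendar(zone);
            nowCal.setTime(now);
            result.setTime(recent);
            result.set(Calendar.YEAR, nowCal.get(Calendar.YEAR));
            if (result.after(nowCal)) {
                result.add(Calendar.YEAR, -1);
            }
            return result;
        }

        Date older = parseFully(olderFormat, timestamp);
        if (older != null) {
            result.setTime(older);
            return result;
        }
        throw new IllegalArgumentException("Unparseable FTP timestamp: " + timestamp);
    }

    // DateFormat.parse(String) may stop before the end of the text, so insist
    // that the whole timestamp was consumed; returns null otherwise.
    private static Date parseFully(SimpleDateFormat format, String text) {
        ParsePosition pos = new ParsePosition(0);
        Date parsed = format.parse(text, pos);
        return (parsed != null && pos.getIndex() == text.length()) ? parsed : null;
    }
}
```

Under these assumptions, an instance built with "MMM dd HH:mm", "MMM dd yyyy", Locale.US and a GMT TimeZone would accept both forms we handle today, and a server that lists dates differently would only need a different pair of format strings and a different Locale.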
So each parser would have a settable member of this class. FTPDateFormat would be constructed from two format strings and a Locale, and possibly a timezone as well. We would probably have to provide some default FTPDateFormat objects for some of the common locales.

One consequence of this is that we would start making heavier use of the FTPFileEntryParserFactory objects. We might want to start thinking about de-emphasizing, but not deprecating, the use of FTPClient.listFiles(), which is simple but makes too many assumptions. There are already four or five different overloads of this method, and adding several more parameters into the mix would make it completely unworkable. Instead, going through the factory would become the more common, better documented, and recommended approach. This would be the preferred way of accessing commons-net ftp for clients such as Ant and VFS. Users who are happily using listFiles() in its current form in their custom apps built directly on commons-net could continue to do so.

Well, these are some preliminary thoughts. Let's hear from the other developers of this project.
