Hi, Doug,

Now I have an FTP client based on Apache commons-net. It has been used to build an intranet search engine for FTP sites (roughly 20 million files), so it is pretty stable. Along with it, there is code for figuring out the content type from a file's magic number and filename extension, since, unlike httpd, ftpd does not provide content-type info in its response.

Commons-net is an Apache project, so there is no license issue for the FTP client. The mapping from filename extension to content type uses JAF (JavaBeans Activation Framework), which should be okay too. However, the code for deriving the content type from the file magic number was adopted from a source that is MPL'ed. Will this be a problem for Nutch? If yes, I will just drop this part of the code from my patch.
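To make the content-type question concrete, here is a rough sketch of the lookup order described above: check the magic number first, then fall back to the filename extension. It is not the code from the patch. The class name ContentTypeGuesser, the method guess, and the two small tables are hypothetical, and a hand-written extension map stands in for JAF so that the example depends only on the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class ContentTypeGuesser {
    // Hypothetical, deliberately tiny tables; a real crawler needs many more entries.
    private static final Map<String, byte[]> MAGIC = new LinkedHashMap<>();
    private static final Map<String, String> BY_EXTENSION = new LinkedHashMap<>();

    static {
        MAGIC.put("image/png", new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        MAGIC.put("application/pdf", new byte[] {'%', 'P', 'D', 'F', '-'});
        MAGIC.put("image/gif", new byte[] {'G', 'I', 'F', '8'});
        MAGIC.put("application/zip", new byte[] {'P', 'K', 0x03, 0x04});

        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("html", "text/html");
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("png", "image/png");
    }

    // True if 'data' begins with every byte of 'prefix'.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Guess a content type from the first bytes of a file, then from its name.
    public static String guess(byte[] head, String filename) {
        // 1. Magic number: the leading bytes are the more reliable signal.
        for (Map.Entry<String, byte[]> e : MAGIC.entrySet()) {
            if (startsWith(head, e.getValue())) {
                return e.getKey();
            }
        }
        // 2. Filename extension, compared case-insensitively.
        if (filename != null) {
            int dot = filename.lastIndexOf('.');
            if (dot >= 0 && dot < filename.length() - 1) {
                String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
                String type = BY_EXTENSION.get(ext);
                if (type != null) {
                    return type;
                }
            }
        }
        // 3. Nothing matched: fall back to the generic binary type.
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] pdfHead = {'%', 'P', 'D', 'F', '-', '1', '.', '4'};
        System.out.println(guess(pdfHead, "download.bin"));
        System.out.println(guess(new byte[0], "notes.TXT"));
        System.out.println(guess(new byte[0], "README"));
    }
}
```

In this sketch the magic number wins over the extension, on the assumption that file names on FTP sites are less trustworthy than file contents. If the MPL'ed part has to go, only step 1 would be dropped and the extension lookup would still work on its own.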
John

On Sun, Feb 15, 2004 at 03:13:39PM -0800, [EMAIL PROTECTED] wrote:
> Hi, Doug,
>
> > >My current ftp implementation
> > >uses java URL class (with help of a little hacked sun.net.www.*)
> > >and is thus not fully portable, though pretty reliable.
> > >I will post it after some cleanup. Any suggestions?
> >
> > What are the copyright restrictions on the sun.net.www.* code? My guess
> > is that we probably can't accept a hacked version of that code. If the
> > code is already included in most JVMs, can you get away with subclassing
> > things, overriding a few methods?
>
> I will look into this.
>
> > How hard would it be to write a simple FTP client from scratch? I
> > originally wrote Nutch's HTTP client in about a day. It's evolved since
> > then to support more features, but a working, correct HTTP client is not
> > very difficult to write. Is FTP that much harder?
>
> For one, FTP involves two channels: data and command.
> I was trying to beat a deadline and could not ask our sponsor
> for the luxury of a fresh write (otherwise our proposal would have been
> flatly turned down). In two months (after pushing through my
> current project), I will have time available to reexamine the situation.
>
> Thank you very much for nutch.
>
> John

_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
