Hello Mr Tim Bray,
Thank you for the suggestion, but according to my project specification the crawler has to be written in Java.

I am running into some problems while implementing the program. I am using a URLConnection object to open the connection, but I cannot see how to set a timeout period for a particular URL. With the Socket API the timeout can be set using setSoTimeout. Is there any way to set a timeout when using URLConnection? And is it better to use the URLConnection API or the Socket API?

Thanks in advance,
Mohan

--- Tim Bray <[EMAIL PROTECTED]> wrote:
> At 02:53 PM 09/06/01 -0700, Ed Bockelman wrote:
> >> I am java programmer, presently making a web spider
> >> program for a search engine..
> >
> >I wouldn't think of Java as the first choice for a
> >high-volume web spider. What are the advantages?
>
> It's a nice programming language, and has a pretty good
> net interface library. The only downside is that a spider
> spends a huge amount of its time picking apart page content
> looking for links and so on, and has to deal with all the
> badly broken HTML out there. This is probably easier in
> perl or python. But then a spider has to be massively
> parallel and java's threading is massively better than
> perl's. -T
>
> --
> This message was sent by the Internet robots and
> spiders discussion list ([EMAIL PROTECTED]). For
> list server commands, send "help" in the body of a
> message to "[EMAIL PROTECTED]".
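
On the URLConnection timeout question, here is a minimal sketch, assuming Java 5 or later, where URLConnection itself provides setConnectTimeout and setReadTimeout (both take milliseconds). The class name TimeoutFetch, the helper openWithTimeouts, and the URL and values in the usage note are made up for illustration and are not taken from the original messages.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper for illustration; not part of any crawler library.
class TimeoutFetch {

    // Returns an unconnected URLConnection with both timeouts configured.
    // Nothing is sent over the network until connect() or getInputStream() is called.
    static URLConnection openWithTimeouts(String address, int connectMs, int readMs)
            throws IOException {
        URLConnection conn = URI.create(address).toURL().openConnection();
        conn.setConnectTimeout(connectMs); // limit on establishing the connection
        conn.setReadTimeout(readMs);       // limit on each blocking read of the response
        return conn;
    }
}
```

A call such as openWithTimeouts("http://example.com/", 5000, 10000) returns a connection whose connect() or getInputStream() throws java.net.SocketTimeoutException once the corresponding limit is exceeded, so a crawler can catch that exception and move on to the next URL. Note that the read timeout applies to each blocking read, not to the download as a whole.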