Oleg,

I'm *pretty* sure I'm always calling releaseConnection().
I'm attaching the code for the fetch util; please tell
me if I'm missing anything (but I'm pretty sure I'm
not).

For background, the crawler is designed to download
only from a unique set of IP addresses at any given
time (unique across all distributed crawlers). For
this reason, we want to close connections soon after
each download, because it's unlikely we will need the
same connection again any time soon.

Any comments on making this code better are much
appreciated.

-George 

static { 
  HttpConnectionManagerParams connMgrParams = 
    new HttpConnectionManagerParams();
  connMgrParams.setConnectionTimeout(4000);
  connMgrParams.setSoTimeout(4000);
  connMgrParams.setLinger(4000);
  // set to one so we don't hammer a web site by accident
  connMgrParams.setMaxConnectionsPerHost(
    HostConfiguration.ANY_HOST_CONFIGURATION,1);
  connMgrParams.setMaxTotalConnections(2000);
  MultiThreadedHttpConnectionManager connMgr = 
    new MultiThreadedHttpConnectionManager();
  connMgr.setParams(connMgrParams);
  // static instance of HttpClient used by all threads
  httpClient= new HttpClient(connMgr); 
}

public String getContentAsString(String url) throws Exception {
  GetMethod method = new GetMethod(url);
  StringBuffer buffer = new StringBuffer();
  try {
    method.getParams().setCookiePolicy(
      CookiePolicy.IGNORE_COOKIES);
    method.addRequestHeader( "Connection", "close");
    int statusCode = httpClient.executeMethod(method);
    // inspect the response headers (not the request headers)
    Header header = method.getResponseHeader("Content-Type");
    Header[] headers = method.getResponseHeaders();
    boolean headerOK=true;
    String offendingHeader=null;
    for (int i=0;i<headers.length && headerOK;i++) {
      if (!(isValidContentType(headers[i].toString()))) {
        offendingHeader=headers[i].toString();
        headerOK=false;
        break;
      }
    }
    if (!headerOK) {
      method.releaseConnection();
      throw new InvalidMIMETypeException(offendingHeader);
    }
    BufferedReader reader = new BufferedReader(
      new InputStreamReader(
        method.getResponseBodyAsStream(),
        method.getResponseCharSet()));
    // consume the response entity
    int ch =0;
    while(((ch=reader.read())!=-1) && 
      buffer.length()<MAX_DOC_SIZE)
      buffer.append((char)ch);
  } finally {
    method.releaseConnection();
  }
  return buffer.toString();
}
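
For reference, here is a rough standalone sketch of what a
Content-Type check could look like when it is given only the
value of one Content-Type header (for example
"text/html; charset=UTF-8"). This is not the
isValidContentType from the fetch util above; the class name
ContentTypeCheck, the method isAllowed and the allowlist are
all assumptions made up for illustration. It uses only
java.util, so it runs without HttpClient.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the original fetch util.
final class ContentTypeCheck {

    // Assumed allowlist of MIME types a text crawler might accept.
    private static final Set<String> ALLOWED = new HashSet<>(
        Arrays.asList("text/html", "text/plain", "application/xhtml+xml"));

    // Returns true if the MIME type in a Content-Type value is allowed.
    // Parameters such as "; charset=UTF-8" are ignored.
    static boolean isAllowed(String contentTypeValue) {
        if (contentTypeValue == null) {
            return false;
        }
        int semi = contentTypeValue.indexOf(';');
        String mime = (semi >= 0
            ? contentTypeValue.substring(0, semi)
            : contentTypeValue).trim().toLowerCase(Locale.ROOT);
        return ALLOWED.contains(mime);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("text/html; charset=UTF-8"));
        System.out.println(isAllowed("image/png"));
    }
}
```

With a check like this, only the single Content-Type header
value would need to be examined, rather than every header.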

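As a possible alternative to the character-at-a-time read
loop, here is a small sketch that reads in chunks and still
stops at a size cap. The names BoundedRead and readUpTo and
the 4096-character chunk size are placeholders chosen for
this example, not anything from the fetch util, and maxChars
stands in for MAX_DOC_SIZE. It depends only on java.io.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper, not part of the original fetch util.
final class BoundedRead {

    // Reads at most maxChars characters from the reader, in chunks,
    // and returns them as a String. The caller closes the reader.
    static String readUpTo(Reader reader, int maxChars) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] chunk = new char[4096];
        while (out.length() < maxChars) {
            // Never ask for more than the remaining allowance.
            int want = Math.min(chunk.length, maxChars - out.length());
            int n = reader.read(chunk, 0, want);
            if (n == -1) {
                break; // end of stream
            }
            out.append(chunk, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // prints "hello"
        System.out.println(readUpTo(new StringReader("hello world"), 5));
    }
}
```

In the fetch method, the reader built from
getResponseBodyAsStream() would be passed in, with the
releaseConnection() call staying in the finally block.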

