You have modified the Fetcher to index documents? In that case, you should 
index in the reducer (FetcherOutputFormat), not while mapping, and reuse the 
existing indexing code of SolrWriter. In any case, you should not create a 
client per document. 
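
To illustrate the last point, here is a minimal sketch of the reuse pattern: 
one client is created up front and every document goes through it in batches. 
FakeSolrClient, BatchingWriter and SharedClientDemo are hypothetical names made 
up for this example and are not Nutch or SolrJ classes. In a real job the 
stand-in would be replaced by whatever SolrJ client you already use, and the 
buffering in SolrWriter may well differ from this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a SolrJ client; counts how many instances get created. */
class FakeSolrClient {
    static final AtomicInteger CREATED = new AtomicInteger();
    final List<String> indexed = new ArrayList<>();
    int requests = 0;

    FakeSolrClient() {
        CREATED.incrementAndGet();
    }

    /** Simulates one HTTP request carrying a batch of documents. */
    void add(List<String> docs) {
        requests++;
        indexed.addAll(docs);
    }
}

/** Buffers documents and sends them in batches through ONE shared client. */
class BatchingWriter {
    private final FakeSolrClient client;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    BatchingWriter(FakeSolrClient client, int batchSize) {
        this.client = client;
        this.batchSize = batchSize;
    }

    /** Queues a document; sends the buffer once it reaches the batch size. */
    void write(String doc) {
        buffer.add(doc);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is still buffered; call once at the end of the task. */
    void close() {
        flush();
    }

    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        client.add(new ArrayList<>(buffer));
        buffer.clear();
    }
}

class SharedClientDemo {
    public static void main(String[] args) {
        // The client is created once, outside the per-document loop.
        FakeSolrClient client = new FakeSolrClient();
        BatchingWriter writer = new BatchingWriter(client, 3);
        for (int i = 0; i < 7; i++) {
            writer.write("doc-" + i);
        }
        writer.close();
        System.out.println("clients created: " + FakeSolrClient.CREATED.get());
        System.out.println("requests sent:   " + client.requests);
        System.out.println("docs indexed:    " + client.indexed.size());
    }
}
```

With 7 documents and a batch size of 3, this creates 1 client and sends 3 
requests, instead of 7 clients and 7 requests.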
 
-----Original message-----
> From:S.L <[email protected]>
> Sent: Monday 16th December 2013 15:57
> To: [email protected]; [email protected]
> Subject: Re: Excessive HttpClient creation (Nutch 1.7 on Hadoop 2.2)
> 
> Markus,
> 
> 
> Yes, you are right: FetcherThread does not use SolrJ by itself. I am adding a 
> call to Solr to save the data. I am concerned about the number of HttpClients 
> being created; it seems it is creating a client per document I am saving in 
> Solr. This could be expected, but I just want to confirm.
> 
> Thanks.
> 
> Sent from my HTC Inspire™ 4G on AT&T
> 
> ----- Reply message -----
> From: "Markus Jelsma" <[email protected]>
> To: "[email protected]" <[email protected]>
> Subject: Excessive HttpClient creation (Nutch 1.7 on Hadoop 2.2)
> Date: Mon, Dec 16, 2013 6:27 am
> 
> 
> Hi - How can this be in FetcherThread? Nutch does not use SolrJ in the Fetcher. 
> Do you have the entire Fetcher log? 
>  
> -----Original message-----
> > From:S.L <[email protected]>
> > Sent: Monday 16th December 2013 6:40
> > To: [email protected]
> > Subject: Excessive HttpClient creation (Nutch 1.7 on Hadoop 2.2)
> > 
> > Hi Folks,
> > 
> > I am running Nutch 1.7 on Hadoop 2.2, and in the Hadoop logs for
> > FetcherThread I see the following statements, which tell me that
> > HttpClients are being created per URL. Is this a correct assumption? Also,
> > after a few fetches the Hadoop job throws an OOM error.
> > Please advise.
> > 
> > 2013-12-15 23:47:31,921 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:31,931 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:31,932 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:32,034 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:32,034 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:32,040 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:32,187 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:32,214 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:32,250 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 2013-12-15 23:47:32,264 INFO [FetcherThread]
> > org.apache.solr.client.solrj.impl.HttpClientUtil: Creating new http
> > client, 
> > config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
> > 
> 
