Re: Connecting MySQL to Apache Nutch

Markus Jelsma Thu, 13 Jan 2011 05:34:44 -0800

Try using the logger; that way you can check hadoop.log for your output.

import:
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;


declare:
public static final Log LOG = LogFactory.getLog(SolrWriter.class);

use:
LOG.info("bla bla");
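
A rough sketch of how the error handling around the database call could
look, so that failures end up in the log and fail the task instead of
vanishing behind System.out.println and System.exit(0). The names
DbErrorHandling, DbAction and runLogged are made up for illustration, and
java.util.logging is only a stand-in so the snippet compiles on its own;
inside SolrWriter you would use the commons-logging LOG declared above.

```java
import java.io.IOException;
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class DbErrorHandling {
    // Stand-in logger (assumption): in Nutch this would be the commons-logging LOG.
    private static final Logger LOG = Logger.getLogger(DbErrorHandling.class.getName());

    /** Hypothetical callback wrapping the per-document database work. */
    interface DbAction {
        void run() throws SQLException;
    }

    /** Runs the action; on failure, logs the error and rethrows it as an IOException. */
    static void runLogged(DbAction action) throws IOException {
        try {
            action.run();
            LOG.info("MySQL insert done");
        } catch (SQLException e) {
            // Log with the stack trace, then let write() propagate the failure.
            LOG.log(Level.SEVERE, "MySQL insert failed", e);
            throw new IOException("MySQL insert failed", e);
        }
    }
}
```

Since write() already declares IOException, rethrowing keeps the failure
visible to Hadoop, much like makeIOException does for Solr errors.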





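For the MySQL part of the quoted write() method, here is a minimal sketch of
a custom method that reads the fields and builds the insert from them. It
assumes the same table and column as in your code (data, data). The class
MySqlDocSink, the flatten() helper and the plain Map used in place of a
NutchDocument are hypothetical, only there to keep the example
self-contained. A PreparedStatement is used so that quotes in the page text
cannot break the SQL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.Map;

final class MySqlDocSink {
    // Table and column names taken from the quoted code; adjust to your schema.
    static final String INSERT_SQL = "INSERT INTO data (data) VALUES (?)";

    /** Flattens field name/value pairs into one "key=value" line per value. */
    static String flatten(Map<String, List<Object>> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<Object>> e : fields.entrySet()) {
            for (Object val : e.getValue()) {
                if (sb.length() > 0) {
                    sb.append('\n');
                }
                sb.append(e.getKey()).append('=').append(val);
            }
        }
        return sb.toString();
    }

    /** Inserts one row on an already open connection; returns the affected row count. */
    static int insert(Connection conn, String text) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, text); // bound as a parameter, never concatenated
            return ps.executeUpdate();
        }
    }
}
```

The connection would probably be opened once in open() and closed in
close(), and insert() called only for the current document, rather than
reconnecting and re-inserting every collected document on each write().
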
On Thursday 13 January 2011 14:28:21 PEEYUSH CHANDEL wrote:
> Hi Markus,
> 
> Here is my modified SolrWriter class. Please check it and correct me
> if I am doing something wrong.
> 
> I tried this code, but nothing happens.
> 
> package org.apache.nutch.indexer.solr;
> 
> import java.io.IOException;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Map.Entry;
> import java.util.Iterator;
> import java.sql.*;
> 
> import org.apache.hadoop.mapred.JobConf;
> import org.apache.nutch.indexer.NutchDocument;
> import org.apache.nutch.indexer.NutchField;
> import org.apache.nutch.indexer.NutchIndexWriter;
> import org.apache.solr.client.solrj.SolrServer;
> import org.apache.solr.client.solrj.SolrServerException;
> import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
> import org.apache.solr.common.SolrInputDocument;
> 
> public class SolrWriter implements NutchIndexWriter {
> 
>   private SolrServer solr;
>   private SolrMappingReader solrMapping;
> 
>   private final List<SolrInputDocument> inputDocs =
>     new ArrayList<SolrInputDocument>();
> 
>   private int commitSize;
> 
>   public void open(JobConf job, String name) throws IOException {
>     solr = new CommonsHttpSolrServer(job.get(SolrConstants.SERVER_URL));
>     commitSize = job.getInt(SolrConstants.COMMIT_SIZE, 1000);
>     solrMapping = SolrMappingReader.getInstance(job);
>   }
> 
>   public void write(NutchDocument doc) throws IOException {
>     final SolrInputDocument inputDoc = new SolrInputDocument();
>     for(final Entry<String, NutchField> e : doc) {
>       for (final Object val : e.getValue().getValues()) {
>         inputDoc.addField(solrMapping.mapKey(e.getKey()), val,
>             e.getValue().getWeight());
>         String sCopy = solrMapping.mapCopyKey(e.getKey());
>         if (sCopy != e.getKey()) {
>               inputDoc.addField(sCopy, val, e.getValue().getWeight());
>         }
>       }
>     }
>     inputDoc.setDocumentBoost(doc.getWeight());
>     inputDocs.add(inputDoc);
> 
>     // here is my modified code
> 
>     SolrInputDocument abc;
>     Iterator it = inputDocs.iterator();
>     while (it.hasNext()) {
>       abc = (SolrInputDocument) it.next();
>       String test = abc.toString();
> 
>       Connection conn = null;
>       String url = "jdbc:mysql://localhost:3306/";
>       String dbName = "data";
>       String driver = "com.mysql.jdbc.Driver";
>       String userName = "root";
>       String password = "passwd";
>       try {
>         Class.forName(driver).newInstance();
>         conn = DriverManager.getConnection(url + dbName, userName, password);
>         System.out.println("Connected to the database");
> 
>         java.sql.Statement s = conn.createStatement();
>         int r = s.executeUpdate("INSERT INTO data(data) VALUES('" + test + "')");
> 
>         System.out.println("Done");
>         conn.close();
>         System.out.println("Disconnected from database");
>       } catch (Exception e) {
>         System.out.println(e);
>         System.exit(0);
>       }
>     }
> 
>     if (inputDocs.size() > commitSize) {
>       try {
>         solr.add(inputDocs);
> 
>       } catch (final SolrServerException e) {
>         throw makeIOException(e);
>       }
>       inputDocs.clear();
>     }
>   }
> 
>   public void close() throws IOException {
>     try {
>       if (!inputDocs.isEmpty()) {
>         solr.add(inputDocs);
>         inputDocs.clear();
>       }
>       // solr.commit();
>     } catch (final SolrServerException e) {
>       throw makeIOException(e);
>     }
>   }
> 
>   public static IOException makeIOException(SolrServerException e) {
>     final IOException ioe = new IOException();
>     ioe.initCause(e);
>     return ioe;
>   }
> 
> }
> 
> -Thank you very much
> 
> On 1/13/11, Markus Jelsma <[email protected]> wrote:
> > public void write gets called for each NutchDocument and collects them in
> > inputDocs. You could, after line 60, call a custom method to read all
> > fields and create a SQL insert statement out of it.
> > 
> > On Thursday 13 January 2011 13:55:14 PEEYUSH CHANDEL wrote:
> >> Hi Markus,
> >> 
> >> I tried to modify the SolrWriter.java class and put my MySQL connector
> >> there, but nothing happens. Could you please explain a little more, with
> >> example code, exactly which part of the SolrWriter class should be
> >> replaced by the MySQL connector?
> >> 
> >> -Thank You Very Much
> >> 
> >> On 1/13/11, Markus Jelsma <[email protected]> wrote:
> >> > Here's the class you need to look at:
> >> > http://svn.apache.org/viewvc/nutch/branches/branch-1.2/src/java/org/apache/nutch/indexer/solr/SolrWriter.java?view=markup
> >> > 
> >> >> Modifying the Solr index writer to use a MySQL connector is surely
> >> >> the easiest short cut.
> >> >> 
> >> >> > Hi O. Klein,
> >> >> > 
> >> >> > Thanks for the answer, but I am using Nutch 1.2. Is there any
> >> >> > solution for this version?
> >> >> > 
> >> >> > On 1/13/11, O. Klein <[email protected]> wrote:
> >> >> > > Nutch 2.0 supports storage of data in MySQL DB.
> >> >> > > 
> >> >> > > But that version is not for production yet.
> >> >> > > 
> >> >> > > Check
> >> >> > > http://techvineyard.blogspot.com/2010/12/build-nutch-20.html on
> >> >> > > how to get it running.
> >> >> > > --
> >> >> > > View this message in context:
> >> >> > > http://lucene.472066.n3.nabble.com/Connecting-MySQL-to-Apache-Nutch-tp2243983p2244263.html
> >> >> > > Sent from the Nutch - User mailing list archive at Nabble.com.
> > 
> > --
> > Markus Jelsma - CTO - Openindex
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
