Re: Indexing from a database via SolrJ

Shawn Heisey Tue, 16 Aug 2011 08:27:14 -0700

On 8/16/2011 7:14 AM, Erick Erickson wrote:

What have you tried and what doesn't it do that you want it to do?


This works, instantiating the StreamingUpdateSolrServer (server) and
the JDBC connection/SQL statement are left as exercises for the
reader<G>.:

     while (rs.next()) {
       SolrInputDocument doc = new SolrInputDocument();

       String id = rs.getString("id");
       String title = rs.getString("title");
       String text = rs.getString("text");

       doc.addField("id", id);
       doc.addField("title", title);
       doc.addField("text", text);

       docs.add(doc);
       ++counter;
       ++total;
       // Completely arbitrary, just batch up more than one document for throughput!
       if (counter > 100) {
         server.add(docs);
         docs.clear();
         counter = 0;
       }
     }

I've implemented a basic loop with the structure you've demonstrated, but it doesn't do anything yet with SolrInputDocument or SolrDocumentList. I figured there would be a way to avoid going through the field list one by one, but what you've written suggests that the field-by-field method is required. I can live with that.

It does look like addField just takes an Object, so hopefully I can create a loop that determines the type of each field from the JDBC metadata, retrieves the correct Java type from the ResultSet, and inserts it. I imagine that everything still works if you happen to insert a field that doesn't exist in the index. This must be how the DIH does it, so I was hoping that the DIH might expose a method that takes a ResultSet and produces a SolrDocumentList. I still have to take a deeper look at the source and documentation.
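
A rough sketch of the metadata-driven loop I have in mind follows. The class and method names (ResultSetFields, currentRow) are made up for illustration and are not part of SolrJ or the DIH. It assumes that rs.next() has already been called, that the column labels match the field names in the schema, and that whatever rs.getObject() returns is acceptable to addField. To keep it free of the SolrJ jar it only collects name/value pairs; each entry would then be handed to doc.addField(name, value).

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of SolrJ or the DIH. */
final class ResultSetFields {
    private ResultSetFields() {}

    /**
     * Copies every non-null column of the current row into an ordered map
     * keyed by column label. Assumes the cursor is already on a valid row.
     */
    static Map<String, Object> currentRow(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        Map<String, Object> fields = new LinkedHashMap<>();
        // JDBC column indexes are 1-based.
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            // getObject lets the driver choose the Java type for the SQL type.
            Object value = rs.getObject(i);
            if (value != null) { // skip SQL NULLs rather than adding empty fields
                fields.put(meta.getColumnLabel(i), value);
            }
        }
        return fields;
    }
}
```

Inside the while (rs.next()) loop above, this would replace the three getString/addField pairs: build the map, then call doc.addField(name, value) for each entry. Whether driver-specific types (for example timestamps) need converting first is something I have not checked.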


Thanks for the help so far, I can get a little more implemented now.

Shawn
