On 8/16/2011 7:14 AM, Erick Erickson wrote:
What have you tried and what doesn't it do that you want it to do?
This works; instantiating the StreamingUpdateSolrServer (server) and
the JDBC connection/SQL statement are left as exercises for the
reader <G>:
    while (rs.next()) {
        SolrInputDocument doc = new SolrInputDocument();
        String id = rs.getString("id");
        String title = rs.getString("title");
        String text = rs.getString("text");
        doc.addField("id", id);
        doc.addField("title", title);
        doc.addField("text", text);
        docs.add(doc);
        ++counter;
        ++total;
        // Completely arbitrary, just batch up more than one document
        // for throughput!
        if (counter > 100) {
            server.add(docs);
            docs.clear();
            counter = 0;
        }
    }
    // Send whatever is left over from the last partial batch.
    if (!docs.isEmpty()) {
        server.add(docs);
    }
I've implemented a basic loop with the structure you've demonstrated,
but it doesn't do anything yet with SolrInputDocument or
SolrDocumentList. I figured there would be a way to avoid going through
the field list one by one, but what you've written suggests that the
field-by-field method is required. I can live with that.
It does look like addField just takes an Object, so hopefully I can
create a loop that determines the type of each field from the JDBC
metadata, retrieves the correct Java type from the ResultSet, and
inserts it. I imagine that everything still works if you happen to
insert a field that doesn't exist in the index. This must be how the
DIH does it, so I was hoping that the DIH might expose a method that
takes a ResultSet and produces a SolrDocumentList. I still have to take
a deeper look at the source and documentation.
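For illustration, the metadata-driven loop described above might look
roughly like the following. This is only a sketch under assumptions:
RowFields and rowToFields are hypothetical names, not part of SolrJ or
the DIH. To stay free of Solr dependencies it collects the values into
a plain Map; each entry would then be handed to
doc.addField(name, value). It relies on rs.getObject() to let the JDBC
driver choose the Java type, and it skips SQL NULLs.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

public class RowFields {
    /**
     * Copies the current row of rs into a column-label -> value map,
     * discovering the columns from the JDBC metadata instead of
     * naming each field by hand.
     */
    public static Map<String, Object> rowToFields(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        Map<String, Object> fields = new LinkedHashMap<>();
        // JDBC column indexes are 1-based.
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            // getObject() returns the driver's default Java type for the column.
            Object value = rs.getObject(i);
            if (value != null) { // SQL NULL: leave the field out entirely
                fields.put(meta.getColumnLabel(i), value);
            }
        }
        return fields;
    }
}
```

Inside the while (rs.next()) loop, the three hard-coded addField calls
would then become a loop over rowToFields(rs).entrySet(). Whether Solr
accepts a column that has no matching field depends on the schema, so
that part still needs checking.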
Thanks for the help so far, I can get a little more implemented now.
Shawn