In the trunk code, the DocumentBuilder is not handling null values well.
SolrInputDocument doc = new SolrInputDocument();
doc.addField( "id", "hello", 1.0f );
doc.addField( "name", null, 1.0f );
Document out = DocumentBuilder.toDocument( doc, core.getSchema() );
throws an exception:
"unknown field 'name'"
Fixing it is easy, but I'm not clear what the semantics of indexing a
'null' value are intended to be. It looks like FieldTypes are given a
chance to deal with 'null' values in toInternal().
I have not looked into it, but I think the StAX parser would make both:
<field name="name" />
<field name="name" ></field>
into: doc.addField( "name", "", 1.0f );
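A quick way to check that guess outside Solr is a small StAX test like the one below. It is only a sketch: the class and method names are made up for illustration, and it assumes the element text is read with XMLStreamReader.getElementText(), which may not be how the Solr update handler actually reads field values.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of Solr.
class EmptyFieldText {

    // Returns the text content of the first <field> element in the XML,
    // or null if there is no <field> element at all.
    static String firstFieldText(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "field".equals(reader.getLocalName())) {
                        // Reads everything up to the matching end tag.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String selfClosed = firstFieldText("<doc><field name=\"name\" /></doc>");
        String emptyPair = firstFieldText("<doc><field name=\"name\" ></field></doc>");
        // Brackets make an empty string visible in the output.
        System.out.println("self-closed: [" + selfClosed + "]");
        System.out.println("empty pair:  [" + emptyPair + "]");
    }
}
```

If both forms come back as an empty string rather than null, then XML updates would never hit the null case at all, and only the Java API would.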
To me, it makes the most sense to just skip fields that don't have any
value. This change passes all tests and fixes the 'unknown field' error,
but I'm not sure if it changes any undocumented/untested assumptions:
Index: src/java/org/apache/solr/update/DocumentBuilder.java
===================================================================
--- src/java/org/apache/solr/update/DocumentBuilder.java (revision 564002)
+++ src/java/org/apache/solr/update/DocumentBuilder.java (working copy)
@@ -188,8 +188,10 @@
SchemaField[] destArr = schema.getCopyFields(name);
// load each field value
+ boolean hasField = false;
for( Object v : field ) {
String val = null;
+ hasField = true;
// TODO!!! HACK -- date conversion
if( sfield != null && v instanceof Date && sfield.getType() instanceof DateField ) {
@@ -232,7 +234,7 @@
}
// make sure the field was used somehow...
- if( !used ) {
+ if( !used && hasField ) {
throw new SolrException(
SolrException.ErrorCode.BAD_REQUEST,"ERROR:unknown field '" + name + "'");
}
}
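
To make the behavior change easier to reason about, here is a standalone sketch of the same guard. It is not the DocumentBuilder code: the class name, the KNOWN set (standing in for the schema lookup), and build() are all hypothetical, and it assumes that a field added with a null value iterates over zero values, which is what the placement of hasField in the patch relies on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the patched loop, not Solr code.
class SkipEmptyFieldSketch {

    // Stand-in for the schema: the field names it knows about.
    static final Set<String> KNOWN = new HashSet<>(Arrays.asList("id", "name"));

    // Returns the "name=value" entries that would be indexed for one field.
    static List<String> build(String name, List<?> values) {
        List<String> out = new ArrayList<>();
        boolean used = false;
        boolean hasField = false;
        for (Object v : values) {
            hasField = true;               // at least one value was seen
            if (KNOWN.contains(name)) {
                out.add(name + "=" + v);
                used = true;
            }
        }
        // Only complain when there was a value that nothing consumed.
        if (!used && hasField) {
            throw new IllegalArgumentException("ERROR:unknown field '" + name + "'");
        }
        return out;
    }

    public static void main(String[] args) {
        // Known field, one value: indexed as before.
        System.out.println(build("id", Arrays.asList("hello")));
        // Known field, no values: skipped instead of raising "unknown field".
        System.out.println(build("name", Collections.emptyList()));
        // Unknown field, no values: also skipped silently with this guard.
        System.out.println(build("bogus", Collections.emptyList()));
    }
}
```

The last call is the case I'm unsure about: with this guard, a field name that is not in the schema no longer raises an error as long as it carries no values.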