On 6/26/2019 9:52 AM, Vincenzo D'Amore wrote:
I have a very basic question related to the SolrInputDocument behaviour.

Looking at SolrInputDocument source code I found how the method setField
works:

   public void setField(String name, Object value )
   {
     SolrInputField field = new SolrInputField( name );
     _fields.put( name, field );
     field.setValue( value );
   }

The field name is "duplicated" into the SolrInputField.

What this does is create an entirely new SolrInputField object, one that does not yet have a value. Then it puts that object into the map of all fields for this document. Then it assigns the value directly to the SolrInputField object, which is already inside the map.

Side note: The "put" method used there will replace any existing field with the same name. The replaced SolrInputField object will most likely have no remaining references, so the garbage collector will eventually collect that object and its component objects.
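To illustrate the sequence described above, here is a rough sketch. The Field and Doc classes are simplified stand-ins that I made up for this example, not the real SolrInputField and SolrInputDocument, and the choice of LinkedHashMap is an assumption for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SetFieldSketch {
    // Hypothetical stand-in for SolrInputField: keeps its own name plus a value.
    static class Field {
        final String name;
        Object value;
        Field(String name) { this.name = name; }
    }

    // Hypothetical stand-in for SolrInputDocument.
    static class Doc {
        final Map<String, Field> fields = new LinkedHashMap<>();

        void setField(String name, Object value) {
            Field field = new Field(name);  // new object, no value yet
            fields.put(name, field);        // replaces any earlier entry with this name
            field.value = value;            // same object the map now holds
        }
    }

    public static void main(String[] args) {
        Doc doc = new Doc();
        doc.setField("title", "first");
        Field old = doc.fields.get("title");

        doc.setField("title", "second");
        Field current = doc.fields.get("title");

        System.out.println(old == current);     // false: a different object
        System.out.println(current.value);      // second
        System.out.println(doc.fields.size());  // 1: still one entry for "title"
    }
}
```

Once the local variable "old" goes out of scope, nothing refers to the first Field object any more, which is what makes it eligible for garbage collection.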

The only duplication I can see here is that both the inner field object and the outer map contain the name of the field. Unless you have a really huge number of fields, this would not have a significant impact on the amount of memory required.

The map object (_fields) that basically represents the whole document needs *something* to use as the key for each entry. The field name is convenient and relevant. It is also usually a fairly short string.

It is likely that other code that uses a SolrInputField object will only have that object, not the map, so the name of the field must be in the field object.
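As a small sketch of that situation: the describe method below is handed only a field object and never sees the map it came from, so it can only report the name because the field carries it. The Field class and the describe method are hypothetical, made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldNameSketch {
    // Hypothetical stand-in for SolrInputField.
    static class Field {
        final String name;
        final Object value;
        Field(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    // Receives only the field, not the map that holds it.
    static String describe(Field field) {
        return field.name + "=" + field.value;
    }

    public static void main(String[] args) {
        Map<String, Field> fields = new LinkedHashMap<>();
        fields.put("id", new Field("id", "42"));
        fields.put("title", new Field("title", "hello"));

        // Iterating over values() drops the map keys entirely.
        for (Field f : fields.values()) {
            System.out.println(describe(f));
        }
    }
}
```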

It is probably possible to achieve slightly better memory efficiency by switching the internal implementation from Map to List or Set ... but it would make SolrInputDocument MUCH less efficient in other ways, including the setField method you have quoted above. With a List, for example, finding an existing field by name means scanning the entries one by one instead of doing a single keyed lookup. I do not think it would be a worthwhile trade.
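Here is a rough sketch of what a List-backed setField might have to do. This is purely hypothetical code written for this example, not anything that exists in Solr, and the comparisons counter is only there to make the cost visible.

```java
import java.util.ArrayList;
import java.util.List;

public class ListBackedSketch {
    // Hypothetical stand-in for SolrInputField.
    static class Field {
        final String name;
        Object value;
        Field(String name) { this.name = name; }
    }

    // Hypothetical document that stores fields in a List instead of a Map.
    static class ListDoc {
        final List<Field> fields = new ArrayList<>();
        int comparisons = 0;  // counts name comparisons, for illustration only

        void setField(String name, Object value) {
            Field field = new Field(name);
            field.value = value;
            // No key to look up, so every call scans for an existing field.
            for (int i = 0; i < fields.size(); i++) {
                comparisons++;
                if (fields.get(i).name.equals(name)) {
                    fields.set(i, field);  // replace the old field in place
                    return;
                }
            }
            fields.add(field);  // not found: append
        }
    }

    public static void main(String[] args) {
        ListDoc doc = new ListDoc();
        for (int i = 0; i < 1000; i++) {
            doc.setField("f" + i, i);
        }

        doc.comparisons = 0;
        doc.setField("f999", "replaced");     // last field: worst case for the scan
        System.out.println(doc.comparisons);  // 1000
    }
}
```

With the Map, the same replacement is one keyed put, no matter how many fields the document has.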

Thanks,
Shawn
