Re: SolrInputDocument setField method
I noticed this yesterday as well. The toString() and jsonStr() (in later versions) methods of SolrJ both include things like

toString(): {id=id=foo123, ...}

or

jsonStr(): {"id":"id=foo123",...}

However, Solr does not reject the documents, so this must just be an issue with the two methods.

On Wed, Jun 26, 2019 at 12:31 PM, Samuel Kasimalla wrote:

> Hi Vincenzo,
>
> Maybe looking at the overridden toString() would give you a clue.
>
> The second part, I don't think SolrJ holds it twice (if you are worried
> about redundant usage of memory), BUT if you haven't used SolrJ so far and
> wanted to know if this is the format in which it pushes to Solr, I'm pretty
> sure it doesn't push this format into Solr.
>
> Thanks,
> Sam
> https://www.linkedin.com/in/skasimalla
>
> On Wed, Jun 26, 2019 at 11:52 AM Vincenzo D'Amore wrote:
>
>> Hi all,
>>
>> I have a very basic question related to the SolrInputDocument behaviour.
>>
>> Looking at SolrInputDocument source code I found how the method setField
>> works:
>>
>> public void setField(String name, Object value )
>> {
>>   SolrInputField field = new SolrInputField( name );
>>   _fields.put( name, field );
>>   field.setValue( value );
>> }
>>
>> The field name is "duplicated" into the SolrInputField.
>>
>> For example, if I'm storing a field "color" with value "red" what we have
>> is a Map like this:
>>
>> { "key" : "color", "value" : { "name" : "color", "value" : "red" } }
>>
>> the name field "color" appears twice. Very likely there is a reason for
>> this, could you please point me in the right direction?
>>
>> For example, I'm worried about what happens with SolrJ when I'm sending
>> a lot of documents, where for each field the fieldName is sent twice.
>>
>> Thanks,
>> Vincenzo
>>
>> --
>> Vincenzo D'Amore
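
For anyone wondering where the doubled name in that toString() output could come from, here is a minimal sketch. It does not use SolrJ at all: Field is a hypothetical stand-in for SolrInputField, and the assumption (not verified against every SolrJ version) is that the field prints itself as name=value while the document prints a map keyed by field name. Under that assumption the key and the field's own text both carry the name, which gives the "id=id=..." shape.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ToStringDemo {
    // Hypothetical stand-in for SolrInputField: it holds its own name and a value.
    static final class Field {
        final String name;
        Object value;
        Field(String name) { this.name = name; }

        // Assumed format: the field prints itself as name=value.
        @Override
        public String toString() { return name + "=" + value; }
    }

    // Follows the three steps of the quoted setField, then prints the map.
    static String render(String name, Object value) {
        Map<String, Field> fields = new LinkedHashMap<>();
        Field field = new Field(name);
        fields.put(name, field);
        field.value = value;
        // Map.toString() writes key=value, and the value writes name=value again.
        return fields.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("id", "foo123")); // prints {id=id=foo123}
    }
}
```

This only concerns how the objects are printed for debugging. It says nothing about what is sent to Solr, which matches the observation that Solr accepts the documents.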
Re: SolrInputDocument setField method
On 6/26/2019 9:52 AM, Vincenzo D'Amore wrote:
> I have a very basic question related to the SolrInputDocument behaviour.
>
> Looking at SolrInputDocument source code I found how the method setField
> works:
>
> public void setField(String name, Object value )
> {
>   SolrInputField field = new SolrInputField( name );
>   _fields.put( name, field );
>   field.setValue( value );
> }
>
> The field name is "duplicated" into the SolrInputField.

What this does is create an entirely new SolrInputField object, one that does not yet have a value. Then it puts that object into a map of all fields for this document. Then it assigns the value directly to the field object, which is already inside the map.

Side note: The "put" method used there will replace any existing field with the same name. The old field object will then likely have no remaining references, so the garbage collector will eventually collect it and its component objects.

The only duplication I can see here is that both the inner field object and the outer map contain the name of the field. Unless you have a really huge number of fields, this would not have a significant impact on the amount of memory required.

The map object (_fields) that basically represents the whole document needs *something* to key each entry. The field name is convenient and relevant. It is also usually a fairly short string.

It is likely that other code that uses a SolrInputField object will only have that object, not the map, so the name of the field must be in the field object.

It is probably possible to achieve slightly better memory efficiency by switching the internal implementation from Map to List or Set ... but it would make SolrInputDocument MUCH less efficient in other ways, including the setField method you have quoted above. I do not think it would be a worthwhile trade.

Thanks,
Shawn
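
The two points above (the name held by both the map and the field, and "put" replacing an earlier field) can be tried out with a small sketch. Doc and Field are hypothetical stand-ins, not the SolrJ classes, and the sketch assumes the field constructor simply stores the String reference it is given. In that case the map key and the field's name are the very same String object, so what is held twice is a reference rather than a second copy of the characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SetFieldDemo {
    // Hypothetical stand-in for SolrInputField (not the SolrJ class).
    static final class Field {
        final String name;
        Object value;
        Field(String name) { this.name = name; } // stores the reference, no copy
    }

    // Hypothetical stand-in for SolrInputDocument.
    static final class Doc {
        final Map<String, Field> fields = new LinkedHashMap<>();

        // Same three steps as the quoted setField.
        void setField(String name, Object value) {
            Field field = new Field(name);
            fields.put(name, field); // replaces any earlier field with this name
            field.value = value;
        }
    }

    // True when the map key and the field's name are the same String object.
    static boolean nameIsShared() {
        Doc doc = new Doc();
        doc.setField("color", "red");
        Map.Entry<String, Field> entry = doc.fields.entrySet().iterator().next();
        return entry.getKey() == entry.getValue().name;
    }

    // Sets the same field twice and reports the field count and the final value.
    static String valueAfterOverwrite() {
        Doc doc = new Doc();
        doc.setField("color", "red");
        doc.setField("color", "blue");
        return doc.fields.size() + ":" + doc.fields.get("color").value;
    }

    public static void main(String[] args) {
        System.out.println(nameIsShared());        // prints true
        System.out.println(valueAfterOverwrite()); // prints 1:blue
    }
}
```

After the second call the first Field object is no longer reachable from the map, which is the garbage mentioned in the side note.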
Re: SolrInputDocument setField method
Hi Vincenzo,

Maybe looking at the overridden toString() would give you a clue.

The second part, I don't think SolrJ holds it twice (if you are worried about redundant usage of memory), BUT if you haven't used SolrJ so far and wanted to know if this is the format in which it pushes to Solr, I'm pretty sure it doesn't push this format into Solr.

Thanks,
Sam
https://www.linkedin.com/in/skasimalla

On Wed, Jun 26, 2019 at 11:52 AM Vincenzo D'Amore wrote:

> Hi all,
>
> I have a very basic question related to the SolrInputDocument behaviour.
>
> Looking at SolrInputDocument source code I found how the method setField
> works:
>
> public void setField(String name, Object value )
> {
>   SolrInputField field = new SolrInputField( name );
>   _fields.put( name, field );
>   field.setValue( value );
> }
>
> The field name is "duplicated" into the SolrInputField.
>
> For example, if I'm storing a field "color" with value "red" what we have
> is a Map like this:
>
> { "key" : "color", "value" : { "name" : "color", "value" : "red" } }
>
> the name field "color" appears twice. Very likely there is a reason for
> this, could you please point me in the right direction?
>
> For example, I'm worried about what happens with SolrJ when I'm sending
> a lot of documents, where for each field the fieldName is sent twice.
>
> Thanks,
> Vincenzo
>
> --
> Vincenzo D'Amore
SolrInputDocument setField method
Hi all,

I have a very basic question related to the SolrInputDocument behaviour.

Looking at SolrInputDocument source code I found how the method setField works:

public void setField(String name, Object value )
{
  SolrInputField field = new SolrInputField( name );
  _fields.put( name, field );
  field.setValue( value );
}

The field name is "duplicated" into the SolrInputField.

For example, if I'm storing a field "color" with value "red" what we have is a Map like this:

{ "key" : "color", "value" : { "name" : "color", "value" : "red" } }

The field name "color" appears twice. Very likely there is a reason for this; could you please point me in the right direction?

For example, I'm worried about what happens with SolrJ when I'm sending a lot of documents, where for each field the fieldName is sent twice.

Thanks,
Vincenzo

--
Vincenzo D'Amore