RE: getFieldValue always returns an ArrayList?
Interesting. You guessed right. I changed multivalued to multiValued and all of a sudden I get Strings. But doesn't multivalued default to false? In my schema, I originally did not set multivalued. I only put in multivalued=false after I experienced this issue.

-Rich

For the record, I had a number of fields which never had settings for multivalued, because none of them were multivalued and I expected the default to be false. When I experienced this problem, I added multivalued=false to all of them. I still had the problem. So, I added a method to deal with the returned ArrayLists:

    private Object getFieldValue(String field, SolrDocument document) {
        ArrayList list = (ArrayList)document.getFieldValue(field);
        return list.get(0);
    }

I deliberately did not test whether the returned Object was an ArrayList, because I wanted to get an exception if any of them were Strings; I got no exceptions, so they were all returned as ArrayLists. I then changed one of the fields to use multiValued=false, and I got an exception, trying to cast String to ArrayList!

So, I changed all the troublesome fields to use multiValued, and changed my helper method to look like this:

    private Object getFieldValue(String field, SolrDocument document) {
        Object o = document.getFieldValue(field);
        if (o instanceof ArrayList) {
            System.out.println("### Field " + field + " is an instance of ArrayList.");
            ArrayList list = (ArrayList)document.getFieldValue(field);
            return list.get(0);
        } else {
            if (!(o instanceof String)) {
                System.out.println("## ERROR");
            } else {
                System.out.println("### Field " + field + " is an instance of String.");
            }
            return o;
        }
    }

Here's the output, interspersed with the schema definitions of the fields:

    <field name="uri" type="string" indexed="true" stored="true" multiValued="false" required="true" />
    ### Field uri is an instance of String.

    <field name="entity_label" type="string" indexed="false" stored="true" required="false" />
    ### Field entity_label is an instance of ArrayList.

    <field name="institution_uri" type="string" indexed="true" stored="true" required="false" />
    ### Field institution_uri is an instance of ArrayList.

    <field name="asserted_type_uri" type="string" indexed="true" stored="true" required="false" />
    ### Field asserted_type_uri is an instance of ArrayList.

    <field name="asserted_type_label" type="text_eaglei" indexed="true" stored="true" required="false" />
    ### Field asserted_type_label is an instance of ArrayList.

    <field name="provider_uri" type="string" indexed="true" stored="true" multiValued="false" required="false" />
    ### Field provider_uri is an instance of String.

    <field name="provider_label" type="string" indexed="true" stored="true" multiValued="false" required="false" />
    ### Field provider_label is an instance of String.

As you can see, the ones with no declaration for multivalued are returned as ArrayLists, while the ones with multiValued=false are returned as Strings.

So, it looks like there are two problems here: multivalued (small v) is not recognized, since using that in the schema still causes all fields to be returned as ArrayLists; and multivalued does not default to false (or, at least, not setting it causes a field to be returned as an ArrayList, as though it were set to true).

-Rich

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, June 15, 2011 4:25 PM
To: solr-user@lucene.apache.org
Subject: Re: getFieldValue always returns an ArrayList?

Hmmm, I admit I'm not using embedded, and I'm using 3.2, but I'm not seeing the behavior you are. My question about reindexing could have been better stated; I was just making sure you didn't have some leftover cruft where your field was multi-valued from previous experiments, but if you're reindexing each time that's not the problem.

Arrrh, camel case may be striking again. Try multiValued, not multivalued.

If that's still not it, can we see the code?

Best
Erick
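Rich's helper above casts to ArrayList specifically. A slightly more defensive variant, sketched here in plain Java without any SolrJ dependency, accepts whatever object comes back and handles both shapes. The class name FieldValues and the method firstValue are hypothetical names chosen for illustration; in real code the raw argument would be the result of document.getFieldValue(field). SolrJ's SolrDocument also appears to offer a getFirstValue(String) method for the same purpose, which is worth checking in your version before writing a helper at all.

```java
import java.util.Arrays;
import java.util.Collection;

public class FieldValues {

    /**
     * Returns the value itself when it is a single object, or the first
     * element when it is a collection (as seen for multi-valued fields).
     * Returns null for a null value or an empty collection.
     */
    public static Object firstValue(Object raw) {
        if (raw instanceof Collection) {
            Collection<?> values = (Collection<?>) raw;
            return values.isEmpty() ? null : values.iterator().next();
        }
        return raw;
    }

    public static void main(String[] args) {
        // Single-valued shape: the value is returned unchanged.
        System.out.println(firstValue("http://example.org/a"));
        // Multi-valued shape: the first element is returned.
        System.out.println(firstValue(Arrays.asList("http://example.org/a")));
    }
}
```

Testing against Collection rather than ArrayList avoids depending on the concrete list type, which is an implementation detail of how the document was built.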
RE: getFieldValue always returns an ArrayList?
FYI: Using multiValued=false for all string fields results in the following output:

    ### Field uri is an instance of String.
    ### Field entity_label is an instance of String.
    ### Field institution_uri is an instance of String.
    ### Field asserted_type_uri is an instance of String.
    ### Field asserted_type_label is an instance of String.
    ### Field provider_uri is an instance of String.
    ### Field provider_label is an instance of String.

-Rich

-----Original Message-----
From: Simon, Richard T
Sent: Thursday, June 16, 2011 10:08 AM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: RE: getFieldValue always returns an ArrayList?

Interesting. You guessed right. I changed multivalued to multiValued and all of a sudden I get Strings. But doesn't multivalued default to false? In my schema, I originally did not set multivalued. I only put in multivalued=false after I experienced this issue.

-Rich

For the record, I had a number of fields which never had settings for multivalued, because none of them were multivalued and I expected the default to be false. When I experienced this problem, I added multivalued=false to all of them. I still had the problem. So, I added a method to deal with the returned ArrayLists:

    private Object getFieldValue(String field, SolrDocument document) {
        ArrayList list = (ArrayList)document.getFieldValue(field);
        return list.get(0);
    }

I deliberately did not test whether the returned Object was an ArrayList, because I wanted to get an exception if any of them were Strings; I got no exceptions, so they were all returned as ArrayLists. I then changed one of the fields to use multiValued=false, and I got an exception, trying to cast String to ArrayList!

So, I changed all the troublesome fields to use multiValued, and changed my helper method to look like this:

    private Object getFieldValue(String field, SolrDocument document) {
        Object o = document.getFieldValue(field);
        if (o instanceof ArrayList) {
            System.out.println("### Field " + field + " is an instance of ArrayList.");
            ArrayList list = (ArrayList)document.getFieldValue(field);
            return list.get(0);
        } else {
            if (!(o instanceof String)) {
                System.out.println("## ERROR");
            } else {
                System.out.println("### Field " + field + " is an instance of String.");
            }
            return o;
        }
    }

Here's the output, interspersed with the schema definitions of the fields:

    <field name="uri" type="string" indexed="true" stored="true" multiValued="false" required="true" />
    ### Field uri is an instance of String.

    <field name="entity_label" type="string" indexed="false" stored="true" required="false" />
    ### Field entity_label is an instance of ArrayList.

    <field name="institution_uri" type="string" indexed="true" stored="true" required="false" />
    ### Field institution_uri is an instance of ArrayList.

    <field name="asserted_type_uri" type="string" indexed="true" stored="true" required="false" />
    ### Field asserted_type_uri is an instance of ArrayList.

    <field name="asserted_type_label" type="text_eaglei" indexed="true" stored="true" required="false" />
    ### Field asserted_type_label is an instance of ArrayList.

    <field name="provider_uri" type="string" indexed="true" stored="true" multiValued="false" required="false" />
    ### Field provider_uri is an instance of String.

    <field name="provider_label" type="string" indexed="true" stored="true" multiValued="false" required="false" />
    ### Field provider_label is an instance of String.

As you can see, the ones with no declaration for multivalued are returned as ArrayLists, while the ones with multiValued=false are returned as Strings.

So, it looks like there are two problems here: multivalued (small v) is not recognized, since using that in the schema still causes all fields to be returned as ArrayLists; and multivalued does not default to false (or, at least, not setting it causes a field to be returned as an ArrayList, as though it were set to true).

-Rich
RE: getFieldValue always returns an ArrayList?
: and all of a sudden I get Strings. But, doesn't multivalued default to
: false? In my schema, I originally did not set multivalued. I only put in
: multivalued=false after I experienced this issue.

That's dependent on the version of Solr, and it's where the version property of the schema comes in. (As the default behavior in Solr changes, it does so dependent on what version you specify in your schema, to prevent radical behavior changes if you upgrade but keep the same configs)...

    <schema name="example" version="1.4">
      <!-- attribute "name" is the name of this schema and is only used for display purposes.
           Applications should change this to reflect the nature of the search collection.
           version="1.4" is Solr's version number for the schema syntax and semantics. It should
           not normally be changed by applications.
           1.0: multiValued attribute did not exist, all fields are multiValued by nature
           1.1: multiValued attribute introduced, false by default
           1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
           1.3: removed optional field compress feature
           1.4: default auto-phrase (QueryParser feature) to off
         -->

-Hoss
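The version rules in the schema comment above, together with the case-sensitivity issue found earlier in the thread, can be modelled in a few lines. This is only an illustrative sketch and not Solr's actual implementation: the class MultiValuedDefault and both method names are hypothetical, and the attribute map stands in for the attributes of one field element in schema.xml.

```java
import java.util.HashMap;
import java.util.Map;

public class MultiValuedDefault {

    /**
     * Default when the attribute is absent, following the schema comment:
     * version 1.0 treats every field as multi-valued, 1.1 and later default to false.
     */
    public static boolean defaultMultiValued(double schemaVersion) {
        return schemaVersion < 1.1;
    }

    /**
     * Effective setting for one field: an explicit "multiValued" attribute
     * (exact case) wins, otherwise the version default applies.
     */
    public static boolean effectiveMultiValued(double schemaVersion, Map<String, String> fieldAttrs) {
        // Map keys are case-sensitive, so "multivalued" is a different key and is ignored here.
        String explicit = fieldAttrs.get("multiValued");
        return explicit != null ? Boolean.parseBoolean(explicit) : defaultMultiValued(schemaVersion);
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("multivalued", "false"); // lowercase v, as in the original schema
        System.out.println(effectiveMultiValued(1.0, attrs)); // falls back to the version default
        attrs.put("multiValued", "false"); // camel case
        System.out.println(effectiveMultiValued(1.0, attrs)); // explicit value is used
    }
}
```

Under this model, a schema declaring version 1.0 with a misspelled or missing attribute yields multi-valued fields, which matches the ArrayLists reported in the thread.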
RE: getFieldValue always returns an ArrayList?
We haven't changed Solr versions. We've been using 3.1.0 all along.

Plus, I have some code that runs during indexing and retrieves the fields from a SolrInputDocument, rather than a SolrDocument. That code gets Strings without any problem, and always has, even without saying multiValued=false.

-Rich

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Thursday, June 16, 2011 2:18 PM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: RE: getFieldValue always returns an ArrayList?

: and all of a sudden I get Strings. But, doesn't multivalued default to
: false? In my schema, I originally did not set multivalued. I only put in
: multivalued=false after I experienced this issue.

That's dependent on the version of Solr, and it's where the version property of the schema comes in. (As the default behavior in Solr changes, it does so dependent on what version you specify in your schema, to prevent radical behavior changes if you upgrade but keep the same configs)...

    <schema name="example" version="1.4">
      <!-- attribute "name" is the name of this schema and is only used for display purposes.
           Applications should change this to reflect the nature of the search collection.
           version="1.4" is Solr's version number for the schema syntax and semantics. It should
           not normally be changed by applications.
           1.0: multiValued attribute did not exist, all fields are multiValued by nature
           1.1: multiValued attribute introduced, false by default
           1.2: omitTermFreqAndPositions attribute introduced, true by default except for text fields.
           1.3: removed optional field compress feature
           1.4: default auto-phrase (QueryParser feature) to off
         -->

-Hoss
RE: getFieldValue always returns an ArrayList?
Ah! That was the problem. The version was 1.0. I'll change it to 1.2.

Thanks!

-Rich

-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Thursday, June 16, 2011 2:33 PM
To: Simon, Richard T
Cc: solr-user@lucene.apache.org
Subject: RE: getFieldValue always returns an ArrayList?

: We haven't changed Solr versions. We've been using 3.1.0 all along.

But that's not what I'm talking about. I'm talking about the schema version ... a specific property declared in your schema.xml file. Did you check it?

(Even when people start with Solr X, they sometimes are using schema.xml files provided by external packages -- Drupal, WordPress, etc... -- and don't notice that those are from older versions.)

: Plus, I have some code that runs during indexing and retrieves the
: fields from a SolrInputDocument, rather than a SolrDocument. That code
: gets Strings without any problem, and always has, even without saying
: multiValued=false.

SolrInputDocuments are irrelevant. They are used to index data, but they don't know anything about the schema. A SolrInputDocument might be completely invalid because of multiple values for single-valued fields, or missing values for required fields, etc... What comes back from a search *is* consistent with the schema (even when there is only one value stored in a multiValued field).

-Hoss
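Since the root cause was the version attribute on the schema element, a quick way to check it programmatically is to read that attribute from the root of schema.xml. The following is a minimal sketch using only the JDK's XML parser; the class SchemaVersionCheck and the method schemaVersion are hypothetical names, and the XML string in main is a stand-in for the contents of a real schema.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class SchemaVersionCheck {

    /**
     * Returns the "version" attribute of the root element of the given XML,
     * or an empty string if the attribute is not present.
     */
    public static String schemaVersion(String schemaXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return root.getAttribute("version");
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not parse schema XML", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the text of a schema.xml file.
        String xml = "<schema name=\"example\" version=\"1.0\"></schema>";
        System.out.println("schema version: " + schemaVersion(xml));
    }
}
```

A check like this could run at startup and log a warning when the declared version is older than expected, which would have surfaced the 1.0 value found here.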
Re: getFieldValue always returns an ArrayList?
Did you perhaps change the schema but not re-index? I'm grasping at straws here, but something like this might happen if part of your index has that field as a multi-valued field.

If that's not the problem, what version of Solr are you using? I presume this is SolrJ?

Best
Erick

On Wed, Jun 15, 2011 at 2:21 PM, Simon, Richard T <richard_si...@hms.harvard.edu> wrote:

Hi - I am examining a SolrDocument I retrieved through a query. The field I am looking at is declared this way in my schema:

    <field name="uri" type="string" indexed="true" stored="true" multivalued="false" required="true" />

I know multivalued defaults to false, but I set it explicitly because I'm seeing some unexpected behavior. I retrieve the value of the field like so:

    final String resource = (String)document.getFieldValue("uri");

However, I get an exception because an ArrayList is returned. I confirmed that the returned ArrayList has one element with the correct value, but I thought getFieldValue would return a String if the field is single valued. When I index the document, I have some code that retrieves the same field in the same way from the SolrInputDocument, and that code works.

I looked at the code for SolrDocument.setField and it looks like the only way a field should be set to an ArrayList is if one is passed in by the code creating the SolrDocument. Why would it do that if the field is not multivalued?

Is this behavior expected?

-Rich
RE: getFieldValue always returns an ArrayList?
We rebuild the index from scratch each time we start (for now). The fields in question are not multi-valued; in fact, I explicitly set multi-valued to false, just to be sure.

Yes, this is SolrJ, using the embedded server, if that matters.

Using Solr/Lucene 3.1.0.

-Rich

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, June 15, 2011 3:44 PM
To: solr-user@lucene.apache.org
Subject: Re: getFieldValue always returns an ArrayList?

Did you perhaps change the schema but not re-index? I'm grasping at straws here, but something like this might happen if part of your index has that field as a multi-valued field.

If that's not the problem, what version of Solr are you using? I presume this is SolrJ?

Best
Erick
Re: getFieldValue always returns an ArrayList?
Hmmm, I admit I'm not using embedded, and I'm using 3.2, but I'm not seeing the behavior you are. My question about reindexing could have been better stated; I was just making sure you didn't have some leftover cruft where your field was multi-valued from previous experiments, but if you're reindexing each time that's not the problem.

Arrrh, camel case may be striking again. Try multiValued, not multivalued.

If that's still not it, can we see the code?

Best
Erick

On Wed, Jun 15, 2011 at 3:47 PM, Simon, Richard T <richard_si...@hms.harvard.edu> wrote:

We rebuild the index from scratch each time we start (for now). The fields in question are not multi-valued; in fact, I explicitly set multi-valued to false, just to be sure.

Yes, this is SolrJ, using the embedded server, if that matters.

Using Solr/Lucene 3.1.0.

-Rich