RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T
Interesting. You guessed right. I changed multivalued to multiValued and 
all of a sudden I get Strings. But, doesn't multivalued default to false? In my 
schema, I originally did not set multivalued. I only put in multivalued=false 
after I experienced this issue. 

-Rich

For the record, I had a number of fields which never had settings for 
multivalued because none of them were multivalued and I expected the default to 
be false. When I experienced this problem, I added multivalued=false to all 
of them. I still had the problem. So, I added a method to deal with the 
returned ArrayLists:

private Object getFieldValue(String field, SolrDocument document) {
    ArrayList list = (ArrayList) document.getFieldValue(field);
    return list.get(0);
}


I deliberately did not test if the returned Object was an ArrayList because I 
wanted to get an exception if any of them were Strings; I got no exceptions, so 
they were all returned as ArrayLists. 

I then changed one of the fields to use multiValued=false, and I got an 
exception, trying to cast String to ArrayList! So, I changed all the 
troublesome fields to use multiValued, and changed my helper method to look 
like this:

private Object getFieldValue(String field, SolrDocument document) {
    Object o = document.getFieldValue(field);

    if (o instanceof ArrayList) {
        System.out.println("### Field " + field + " is an instance of ArrayList.");
        ArrayList list = (ArrayList) o;
        return list.get(0);
    } else {
        if (!(o instanceof String)) {
            System.out.println("## ERROR");
        } else {
            System.out.println("### Field " + field + " is an instance of String.");
        }
        return o;
    }
}


Here's the output, interspersed with the schema definitions of the fields:

<field name="uri" type="string" indexed="true" stored="true"
       multiValued="false" required="true" />
### Field uri is an instance of String.

<field name="entity_label" type="string" indexed="false" stored="true"
       required="false" />
### Field entity_label is an instance of ArrayList.

<field name="institution_uri" type="string" indexed="true" stored="true"
       required="false" />
### Field institution_uri is an instance of ArrayList.

<field name="asserted_type_uri" type="string" indexed="true" stored="true"
       required="false" />
### Field asserted_type_uri is an instance of ArrayList.

<field name="asserted_type_label" type="text_eaglei" indexed="true"
       stored="true" required="false" />
### Field asserted_type_label is an instance of ArrayList.

<field name="provider_uri" type="string" indexed="true" stored="true"
       multiValued="false" required="false" />
### Field provider_uri is an instance of String.

<field name="provider_label" type="string" indexed="true" stored="true"
       multiValued="false" required="false" />
### Field provider_label is an instance of String.


As you can see, the ones with no declaration for multivalued are returned as 
ArrayLists, while the ones with multiValued=false are returned as Strings. 

So, it looks like there are two problems here: multivalued (small v) is not 
recognized, since using that in the schema still causes all fields to be 
returned as ArrayLists; and, multivalued does not default to false (or, at 
least, not setting it causes a field to be returned as an ArrayList, as though 
it were set to true).
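
Until the schema is sorted out, client code can be made indifferent to which 
form comes back. Below is a minimal sketch, assuming only that the value 
arrives as an Object that is either a bare value or a Collection; the class 
and method names (FieldValues, firstValue) are made up for illustration and 
are not part of SolrJ. SolrJ's SolrDocument also appears to have a 
getFirstValue(String) method for the same purpose, so check the javadoc for 
your version before rolling your own.

```java
import java.util.Arrays;
import java.util.Collection;

public class FieldValues {

    /**
     * Hypothetical helper: returns the value itself when it is a bare value,
     * or the first element when it is a collection. Returns null for a null
     * or empty input.
     */
    public static Object firstValue(Object raw) {
        if (raw instanceof Collection) {
            Collection<?> values = (Collection<?>) raw;
            return values.isEmpty() ? null : values.iterator().next();
        }
        return raw;
    }

    public static void main(String[] args) {
        // Same result whether the field came back as a String or as a list.
        System.out.println(firstValue("http://example.org/a"));
        System.out.println(firstValue(Arrays.asList("http://example.org/a")));
    }
}
```

With a SolrDocument in hand, the call would be something like 
firstValue(document.getFieldValue("uri")), so no cast to ArrayList is needed.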

-Rich


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, June 15, 2011 4:25 PM
To: solr-user@lucene.apache.org
Subject: Re: getFieldValue always returns an ArrayList?

Hmmm, I admit I'm not using embedded, and I'm using 3.2, but I'm
not seeing the behavior you are.

My question about reindexing could have been better stated, I
was just making sure you didn't have some leftover cruft where
your field was multi-valued from previous experiments, but if
you're reindexing each time that's not the problem.

Arrrh, camel case may be striking again. Try multiValued, not
multivalued

If that's still not it, can we see the code?

Best
Erick

On Wed, Jun 15, 2011 at 3:47 PM, Simon, Richard T
richard_si...@hms.harvard.edu wrote:
 We rebuild the index from scratch each time we start (for now). The fields in 
 question are not multi-valued; in fact, I explicitly set multi-valued to 
 false, just to be sure.

 Yes, this is SolrJ, using the embedded server, if that matters.

 Using Solr/Lucene 3.1.0.

 -Rich

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Wednesday, June 15, 2011 3:44 PM
 To: solr-user@lucene.apache.org
 Subject: Re: getFieldValue always returns an ArrayList?

 Did you perhaps change the schema but not re-index? I'm

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T
FYI: Using multiValued=false for all string fields results in the following 
output:

### Field uri is an instance of String.
### Field entity_label is an instance of String.
### Field institution_uri is an instance of String.
### Field asserted_type_uri is an instance of String.
### Field asserted_type_label is an instance of String.
### Field provider_uri is an instance of String.
### Field provider_label is an instance of String.

-Rich

RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Chris Hostetter

: and all of a sudden I get Strings. But, doesn't multivalued default to 
: false? In my schema, I originally did not set multivalued. I only put in 
: multivalued=false after I experienced this issue.

That's dependent on the version of Solr, and it's where the 
version property of the schema comes in.  (As the default behavior in 
Solr changes, it does so depending on what version you specify in your 
schema, to prevent radical behavior changes if you upgrade but keep the 
same configs.)

<schema name="example" version="1.4">
  <!-- attribute "name" is the name of this schema and is only used for
       display purposes.
       Applications should change this to reflect the nature of the search
       collection.
       version="1.4" is Solr's version number for the schema syntax and
       semantics.  It should not normally be changed by applications.
       1.0: multiValued attribute did not exist, all fields are multiValued
            by nature
       1.1: multiValued attribute introduced, false by default
       1.2: omitTermFreqAndPositions attribute introduced, true by default
            except for text fields.
       1.3: removed optional field compress feature
       1.4: default auto-phrase (QueryParser feature) to off
     -->
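
Read as a rule, the version notes above say that a field with no multiValued 
attribute is treated as multi-valued only when the schema version is below 
1.1. The sketch below just restates that rule in Java; SchemaDefaults and 
defaultMultiValued are hypothetical names, and this is a model of the 
comment, not Solr's actual code.

```java
public class SchemaDefaults {

    /**
     * Hypothetical model of the schema comment: before schema version 1.1
     * every field is multi-valued; from 1.1 on, the default is false.
     */
    public static boolean defaultMultiValued(float schemaVersion) {
        return schemaVersion < 1.1f;
    }

    public static void main(String[] args) {
        System.out.println("version 1.0 -> " + defaultMultiValued(1.0f));
        System.out.println("version 1.4 -> " + defaultMultiValued(1.4f));
    }
}
```

Under this reading, a schema declared with version="1.0" gives list values 
for every field that does not say multiValued="false" explicitly.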



-Hoss


RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T
We haven't changed Solr versions. We've been using 3.1.0 all along.

Plus, I have some code that runs during indexing and retrieves the fields from 
a SolrInputDocument, rather than a SolrDocument. That code gets Strings without 
any problem, and always has, even without saying multiValued=false.

-Rich


RE: getFieldValue always returns an ArrayList?

2011-06-16 Thread Simon, Richard T
Ah! That was the problem. The version was 1.0. I'll change it to 1.2. Thanks!

-Rich

-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: Thursday, June 16, 2011 2:33 PM
To: Simon, Richard T
Cc: solr-user@lucene.apache.org
Subject: RE: getFieldValue always returns an ArrayList?


: We haven't changed Solr versions. We've been using 3.1.0 all along.

but that's not what i'm talking about.  I'm talking about the schema 
version ... a specific property declared in your schema.xml file.

did you check it?

(even when people start with Solr X, they sometimes are using schema.xml 
files provided by external packages -- Drupal, wordpress, etc... -- and 
don't notice that those are from older versions)

: Plus, I have some code that runs during indexing and retrieves the 
: fields from a SolrInputDocument, rather than a SolrDocument. That code 
: gets Strings without any problem, and always has, even without saying 
: multiValued=false.

SolrInputDocuments are irrelevant.  They are used to index data, but they 
don't know anything about the schema.  A SolrInputDocument might be 
completely invalid because of multiple values for single-valued fields, or 
missing values for required fields, etc.  What comes back from a search 
*is* consistent with the schema (even when there is only one value stored 
in a multiValued field).

-Hoss


Re: getFieldValue always returns an ArrayList?

2011-06-15 Thread Erick Erickson
Did you perhaps change the schema but not re-index? I'm grasping
at straws here, but something like this might happen if part of
your index has that field as a multi-valued field

If that's not the problem, what version of solr are you using? I
presume this is SolrJ?

Best
Erick

On Wed, Jun 15, 2011 at 2:21 PM, Simon, Richard T
richard_si...@hms.harvard.edu wrote:
 Hi - I am examining a SolrDocument I retrieved through a query. The field I 
 am looking at is declared this way in my schema:

 <field name="uri" type="string" indexed="true" stored="true" 
 multivalued="false" required="true" />

 I know multivalued defaults to false, but I set it explicitly because I'm 
 seeing some unexpected behavior. I retrieve the value of the field like so:

 final String resource = (String)document.getFieldValue("uri");


 However, I get an exception because an ArrayList is returned. I confirmed 
 that the returned ArrayList has one element with the correct value, but I 
 thought getFieldValue would return a String if the field is single valued. 
 When I index the document, I have some code that retrieves the same field in 
 the same way from the SolrInputDocument, and that code works.

 I looked at the code for SolrDocument.setField and it looks like the only way 
 a field should be set to an ArrayList is if one is passed in by the code 
 creating the SolrDocument. Why would it do that if the field is not 
 multivalued?

 Is this behavior expected?

 -Rich



RE: getFieldValue always returns an ArrayList?

2011-06-15 Thread Simon, Richard T
We rebuild the index from scratch each time we start (for now). The fields in 
question are not multi-valued; in fact, I explicitly set multi-valued to false, 
just to be sure.

Yes, this is SolrJ, using the embedded server, if that matters.

Using Solr/Lucene 3.1.0.

-Rich




Re: getFieldValue always returns an ArrayList?

2011-06-15 Thread Erick Erickson
Hmmm, I admit I'm not using embedded, and I'm using 3.2, but I'm
not seeing the behavior you are.

My question about reindexing could have been better stated, I
was just making sure you didn't have some leftover cruft where
your field was multi-valued from previous experiments, but if
you're reindexing each time that's not the problem.

Arrrh, camel case may be striking again. Try multiValued, not
multivalued
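
The reason case matters is that XML attribute names are case-sensitive, so a 
lookup for multiValued will not find an attribute written as multivalued. 
Here is a small sketch using only the JDK's DOM parser; AttributeCase and 
attribute are illustrative names, and this shows generic XML behavior rather 
than how Solr itself reads schema.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class AttributeCase {

    /**
     * Parses a single XML element and returns the named attribute.
     * DOM returns an empty string when the attribute is not present.
     */
    public static String attribute(String xml, String name) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return root.getAttribute(name);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String field = "<field name=\"uri\" type=\"string\" multivalued=\"false\"/>";
        // The lowercase spelling is found; the camel-case lookup comes back empty.
        System.out.println("multivalued = [" + attribute(field, "multivalued") + "]");
        System.out.println("multiValued = [" + attribute(field, "multiValued") + "]");
    }
}
```

A reader that only asks for multiValued would therefore behave as if the 
attribute had never been set.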

If that's still not it, can we see the code?

Best
Erick
