I appreciate your performance point. One of the things I considered, if performance becomes an issue, is putting in a cache that would map from a field name to the full set of fields it should copy to. We would only go through the longer dynamic matching exercise once for each field name, and then put the results in the cache. But I wanted to hold off on that until there is evidence that the dynamic approach is too costly.
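A rough sketch of what such a cache could look like follows. It is not taken from the patch: the class name CopyFieldCache, the Rule type, and the slowPathScans counter are all hypothetical, and the pattern matching is simplified to a single leading or trailing "*". It also uses Map.computeIfAbsent, which needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a copyField destination cache; not the patch's code. */
class CopyFieldCache {

    /** One copyField rule: source is an exact name, "*suffix", or "prefix*". */
    static final class Rule {
        final String source;
        final String dest;

        Rule(String source, String dest) {
            this.source = source;
            this.dest = dest;
        }

        boolean matches(String fieldName) {
            if (source.startsWith("*")) {
                return fieldName.endsWith(source.substring(1));
            }
            if (source.endsWith("*")) {
                return fieldName.startsWith(source.substring(0, source.length() - 1));
            }
            return source.equals(fieldName);
        }
    }

    private final List<Rule> rules;

    // Field name -> destinations. Not thread-safe; assumed single-threaded here.
    private final Map<String, List<String>> cache = new HashMap<>();

    // Counts how often the full rule scan runs (for illustration only).
    int slowPathScans = 0;

    CopyFieldCache(List<Rule> rules) {
        this.rules = new ArrayList<>(rules);
    }

    /** Returns every destination the given field should be copied to. */
    List<String> destinationsFor(String fieldName) {
        // The scan runs only the first time a field name is seen.
        return cache.computeIfAbsent(fieldName, this::scanRules);
    }

    private List<String> scanRules(String fieldName) {
        slowPathScans++;
        List<String> dests = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.matches(fieldName)) {
                dests.add(rule.dest);
            }
        }
        return Collections.unmodifiableList(dests);
    }
}
```

Field names with no matching rule are cached as an empty list, so they also skip the scan on later lookups. The cache is unbounded, which could matter if documents carry many distinct field names; a real version would probably want a size limit and some thought about concurrent indexing threads.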
Has Solr been profiled to see where the bottlenecks really are?

-D

-----Original Message-----
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 06, 2006 9:58 AM
To: [email protected]
Subject: Re: [jira] Created: (SOLR-21) Dynamic copying of fields (allow wildcard sources in copyField)

Ahh, so your copyField allows a source field that isn't even defined in the schema... that actually goes a bit beyond extending copyField to dynamic fields.

My mental model had been to "extend copyFields to encompass dynamicFields", and I think yours was "make copyFields dynamic".

My original thought was that a copyField source would exactly match a dynamicField, and that the copyFields map<String,SchemaField[]> could be extended to handle dynamic fields too. The key for a normal field might be "foo", and the key for a dynamic field could be "*_i" for example.

The upside to your approach is more flexibility... the copyField source need not have any schema field definition, or could actually cover multiple at once.

One of the downsides is that there isn't a fast-path of a single hash lookup to retrieve the fields and a single hash lookup to retrieve the dynamicFields. Maybe it's nothing to worry about compared to the rest of the indexing process though.

-Yonik

On 6/6/06, Darren Vengroff <[EMAIL PROTECTED]> wrote:
> Hi Yonik,
>
> The purpose of hasExplicitField() is to enable
>
> DocumentBuilder.addField(String name, String val, float boost)
>
> to determine whether or not it should add a single field for the name that
> was given. This should only happen if
>
> a) The name was specified as a field in the schema.
> or
> b) The name matches a dynamic field.
>
> If neither of these is the case, then we still might copy the value to one
> or more other fields due to a wildcard match in a copyField.
>
> The syntax you described below, where the destination contains a wildcard,
> is not supported by this implementation.
> The destination must be an explicit field, meeting the conditions above.
>
> Does that clarify what I was trying to do?
>
> -D
>
> -----Original Message-----
> From: Yonik Seeley [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 06, 2006 9:01 AM
> To: [email protected]
> Subject: Re: [jira] Created: (SOLR-21) Dynamic copying of fields (allow
> wildcard sources in copyField)
>
> Hi Darren,
> I'm a bit confused about the meaning of hasExplicitField...
>
> If I have a <copyField source="*_a" dest="*_b"/>
> The dynamic fields *_a and *_b must both be defined, right?
> In that case, it seems like "if it matches a field or dynamicField
> declaration" would always be true, no?
>
> +  /**
> +   * Does the schema have the specified field defined explicitly, i.e.
> +   * not as a result of a copyField declaration with a wildcard? We
> +   * consider it explicitly defined if it matches a field or dynamicField
> +   * declaration.
> +   * @param fieldName
> +   * @return true if explicitly declared in the schema.
> +   */
> +  public boolean hasExplicitField(String fieldName) {
> +    if(fields.containsKey(fieldName)) {
> +      return true;
> +    }
> +
> +    for (DynamicField df : dynamicFields) {
> +      if (df.matches(fieldName)) return true;
> +    }
> +
> +    return false;
> +  }
>
> -Yonik
>
>
> On 6/6/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > Thanks Darren!
> > I'm looking it over now.
> >
> > -Yonik
> >
> > On 6/6/06, Darren Erik Vengroff (JIRA) <[EMAIL PROTECTED]> wrote:
> > > Dynamic copying of fields (allow wildcard sources in copyField)
> > > ---------------------------------------------------------------
> > >
> > >          Key: SOLR-21
> > >          URL: http://issues.apache.org/jira/browse/SOLR-21
> > >      Project: Solr
> > >         Type: New Feature
> > >
> > >   Components: update
> > >  Environment: all
> > >     Reporter: Darren Erik Vengroff
> > >  Attachments: dynamicCopy.patch
> > >
> > > It would be really nice if it were possible to use wildcards to do
> > > things like:
> > >
> > > <copyField source="*_t" dest="text"/>
> > >
> > > The above example copies all fields ending in "_t" to the "text" field.
> > >
> > > I've put together a patch to do this. If there are multiple matches,
> > > all copies are done. If there is a match in a dynamicField, then the
> > > dynamic field is also generated, subject to the existing rules that short
> > > expressions go first. I tried to stick to the spirit of the code as I saw
> > > it, and made what I thought were a minimal reasonable set of changes. The
> > > patch includes some additional tests in ConvertedLegacyTest.java to test the
> > > new functionality. That may not be the best place for new tests, but it
> > > beats no tests.
> > >
> > > I'd really like to get this, or some improved variant of it into the
> > > codebase, as it's quite important to my application. Please review and
> > > comment/criticize as you see fit.
> > >
> > > --
> > > This message is automatically generated by JIRA.
> > > -
> > > If you think it was sent incorrectly contact one of the administrators:
> > > http://issues.apache.org/jira/secure/Administrators.jspa
> > > -
> > > For more information on JIRA, see:
> > > http://www.atlassian.com/software/jira
> >

--
-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
