[
https://issues.apache.org/jira/browse/SOLR-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951357#comment-14951357
]
Hoss Man commented on SOLR-8113:
--------------------------------
Gus, just read through your patch.
My chief concerns are:
# You've redefined the semantics of how the {{dest}} string is interpreted when
a {{fieldRegex}} is used to identify the source (so there's a backward-compatibility
break there, depending on the value of {{dest}})
# You've designed the "config syntax" for this new feature around the
requirement that it can _only_ be used if at least one {{fieldRegex}} is used
to identify the source fields ...
The original purpose of the {{FieldSelector}} API was to provide more general
approaches for configuring which fields an {{UpdateProcessor}} should care
about, beyond simple string field name glob/pattern matching. I think that
pattern replacements for _destination_ field naming should (in general) be
independent of the original selection criteria, so that a user could say
something like...
bq. I want to make a copy of _any_ {{StrField}} in my documents such that the
copy has the same name as the original but with {{_t}} appended.
...and that should be possible with this feature, regardless of whether the user
is using a specific naming convention (i.e. "*_s") for all StrFields in their
index, using some syntax that might look like this...
{code}
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <!-- existing source selector syntax -->
  <lst name="source">
    <str name="typeClass">solr.StrField</str>
  </lst>
  <!-- hypothetical new destination pattern syntax -->
  <lst name="dest">
    <str name="pattern">.*</str>
    <str name="replacement">$0_t</str>
  </lst>
</processor>
{code}
...while the prefix\->prefix and suffix\->suffix styles of cloning, similar to what
{{copyField}} supports, could also be specified. Example: a {{<copyField
source="\*_s" dest="\*_t" />}} equivalent would be...
{code}
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <!-- existing source selector syntax -->
  <lst name="source">
    <str name="fieldRegex">^(.*)_s$</str>
  </lst>
  <!-- hypothetical new destination pattern syntax -->
  <lst name="dest">
    <str name="pattern">^(.*)_s$</str>
    <str name="replacement">$1_t</str>
  </lst>
</processor>
{code}
That's fairly verbose, but if we get the nuts & bolts of the general case
implemented, then it should be trivial to add syntactic sugar to simplify the
configuration...
{code}
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <!-- hypothetical syntactic sugar equivalent to the above example -->
  <!-- since no other source selector args are specified, assume pattern based
       cloning -->
  <str name="pattern">^(.*)_s$</str>
  <str name="replacement">$1_t</str>
</processor>
{code}
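To make the intended {{pattern}} / {{replacement}} semantics concrete, here is a rough sketch of how a destination field name might be derived from a source field name using plain {{java.util.regex}}. The class and method names ({{DestNameSketch}}, {{destName}}) are purely illustrative and are not taken from the patch or from Solr. The sketch assumes the replacement is applied to the first match only, because with a pattern like {{.*}} a {{replaceAll}} would also match the empty string at the end of the name and append the suffix twice. Leaving non-matching names unchanged is likewise just an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Solr or the patch.
class DestNameSketch {

    // Derives a destination field name by applying a regex replacement
    // to the source field name.
    static String destName(String sourceName, String pattern, String replacement) {
        Matcher m = Pattern.compile(pattern).matcher(sourceName);
        // replaceFirst rather than replaceAll: ".*" would otherwise match the
        // empty string after the name as well and apply the replacement twice.
        // If nothing matches, the name is returned unchanged (an assumption).
        return m.replaceFirst(replacement);
    }

    public static void main(String[] args) {
        // "same name with _t appended" (pattern .* and replacement $0_t)
        System.out.println(destName("title", ".*", "$0_t"));
        // suffix -> suffix, like copyField *_s -> *_t
        System.out.println(destName("author_s", "^(.*)_s$", "$1_t"));
    }
}
```

With these inputs, {{title}} would map to {{title_t}} and {{author_s}} to {{author_t}}.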
What do you think?
> Accept replacement strings in CloneFieldUpdateProcessorFactory
> --------------------------------------------------------------
>
> Key: SOLR-8113
> URL: https://issues.apache.org/jira/browse/SOLR-8113
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 5.3
> Reporter: Gus Heck
> Attachments: SOLR-8113.patch
>
>
> Presently CloneFieldUpdateProcessorFactory accepts regular expressions to
> select source fields, which mirrors wildcards in the source for copyField in
> the schema. This patch adds a counterpart to copyField's wildcards in the
> dest attribute by interpreting the dest parameter as a regex replacement
> string.