Hi there,

(Sorry if this ends up being posted twice.)

This might be a bug - I couldn't find anything on Jira or Google.



Versions in use/compared:
Solr 1.3
(Nightly 5th August)
Nightly 22nd September

As RegexTransformer does not differ between the two nightlies, the
issue was probably introduced before the nightly from 5th August.

ISSUE:
Using RegexTransformer with the sourceColName notation does not populate
multiValued fields (ones that actually contain multiple values) with a
list; instead, only one value is added per document.

WORKAROUND/WORKING CONFIG:
I've just rerun the indexing, and the only difference between the runs
was the following two usages of RegexTransformer:

(Both fields are of type solr.StrField and multiValued.)


This worked with 1.3, but does not work with the nightly from 22nd Sept:
<field column="participant" sourceColName="person" regex="([^\|]+)\|.*" />
<field column="role" sourceColName="person"
regex="[^\|]+\|\d+,\d+,\d+,(.*)" />


This works with the nightly from 22nd Sept:
<field column="person" groupNames="participant,role"
regex="([^\|]+)\|\d+,\d+,\d+,(.*)" />
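
For reference, here is a small stand-alone Java sketch of what the three
expressions above extract from a single value. The sample value
"Jane Doe|1,2,3,Director" is made up (its shape is only inferred from the
regexes), and the class and method names are hypothetical. It uses
Matcher.matches() for simplicity, which may differ from how the transformer
applies the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Solr.
class PersonRegexDemo {

    // Patterns copied from the two configurations above.
    static final Pattern PARTICIPANT = Pattern.compile("([^\\|]+)\\|.*");
    static final Pattern ROLE = Pattern.compile("[^\\|]+\\|\\d+,\\d+,\\d+,(.*)");
    static final Pattern BOTH = Pattern.compile("([^\\|]+)\\|\\d+,\\d+,\\d+,(.*)");

    // Returns capture group n if the whole value matches, otherwise null.
    static String group(Pattern p, String value, int n) {
        Matcher m = p.matcher(value);
        return m.matches() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        String value = "Jane Doe|1,2,3,Director"; // made-up sample value

        // sourceColName variant: one regex per target field
        System.out.println(group(PARTICIPANT, value, 1)); // Jane Doe
        System.out.println(group(ROLE, value, 1));        // Director

        // groupNames variant: one regex, two groups
        System.out.println(group(BOTH, value, 1));        // Jane Doe
        System.out.println(group(BOTH, value, 2));        // Director
    }
}
```

So for a single value both variants should yield the same participant and
role; the difference I see only shows up once "person" holds several values.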


Comparing the source code of RegexTransformer 1.3 vs. 22nd Sept, I found:

for (Object result : results)
        row.put(col, result);

(lines 106-107 of transformRow() in the nightly from 22nd Sept)
It looks like the list items are put into the row under the same key over
and over again, which would explain why only one item is left in the end
instead of a list.
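
To illustrate the suspicion, here is a minimal stand-alone sketch, assuming
the row behaves like a plain java.util.Map. It is not the Solr source; the
class name, method names and sample values are made up, and putList() is only
one possible alternative, not necessarily how 1.3 did it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not the actual RegexTransformer code.
class RowPutDemo {

    // Mirrors the quoted loop: every put() replaces the previous value for col.
    static Map<String, Object> putEach(String col, List<Object> results) {
        Map<String, Object> row = new HashMap<>();
        for (Object result : results) {
            row.put(col, result);
        }
        return row;
    }

    // One possible alternative: store the whole list under the key once.
    static Map<String, Object> putList(String col, List<Object> results) {
        Map<String, Object> row = new HashMap<>();
        row.put(col, new ArrayList<>(results));
        return row;
    }

    public static void main(String[] args) {
        List<Object> results = Arrays.<Object>asList("Alice", "Bob", "Carol");
        System.out.println(putEach("participant", results)); // {participant=Carol}
        System.out.println(putList("participant", results)); // {participant=[Alice, Bob, Carol]}
    }
}
```

With the loop, only the last value survives, which matches what I see in the
index.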


Cheers!
Chantal

