[
https://issues.apache.org/jira/browse/METRON-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448876#comment-16448876
]
ASF GitHub Bot commented on METRON-1526:
----------------------------------------
Github user justinleet commented on the issue:
https://github.com/apache/metron/pull/995
@merrimanr Let me replay my understanding to see if I'm on the right track.
The problem we have is that we're returning fields that we can't reindex as
a whole document when we run a glob query ("*"). In particular, the ones we've
seen are the subfields of LatLon. We can't reindex the _coordinate fields, but
they come back in a search.
These fields will come back if they are either
* stored (returned normally), or
* docValues that aren't stored, which are returned for a glob query per the
Solr docs:
> Field values retrieved during search queries are typically returned from
> stored values. However, non-stored docValues fields will be also returned along
> with other stored fields when all fields (or pattern matching globs) are
> specified to be returned (e.g. “fl=*”)
This is why defining the dynamic field solves the problem: it makes the
subfields both not stored and not docValues.
Is this correct so far?
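To make the rule I'm replaying concrete, here is a minimal Java sketch of it. `GlobReturnRule`, `FieldFlags` and `returnedForGlob` are hypothetical names, not Solr APIs, and the flag values in `main` are illustrative assumptions, not output from a real schema.

```java
final class GlobReturnRule {

    /** Hypothetical stand-in for the two schema flags discussed above; not a Solr class. */
    static final class FieldFlags {
        final boolean stored;
        final boolean docValues;

        FieldFlags(boolean stored, boolean docValues) {
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    /**
     * Models the rule as described above: for a glob field list such as fl=*,
     * a field comes back if it is stored, or if it has docValues without being stored.
     */
    static boolean returnedForGlob(FieldFlags flags) {
        return flags.stored || flags.docValues;
    }

    public static void main(String[] args) {
        // Assumed flags for a subfield that falls through to a docValues catch-all.
        FieldFlags caughtByCatchAll = new FieldFlags(false, true);
        // Assumed flags for a subfield matched by a dynamic field that turns both off.
        FieldFlags caughtByDynamicField = new FieldFlags(false, false);

        System.out.println(returnedForGlob(caughtByCatchAll));
        System.out.println(returnedForGlob(caughtByDynamicField));
    }
}
```

Under this model, the only way to keep a subfield out of a glob result is to turn off both flags, which is what the dynamic field does.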
So I dug the slightest bit into the Solr source for the Currency field (as a
specific example of a nonproblematic field per your test).
Here's a snippet of the relevant code:
```
private void createDynamicCurrencyField(String suffix, FieldType type) {
  String name = "*" + POLY_FIELD_SEPARATOR + suffix;
  Map<String, String> props = new HashMap<>();
  props.put("indexed", "true");
  props.put("stored", "false");
  props.put("multiValued", "false");
  props.put("omitNorms", "true");
  int p = SchemaField.calcProps(name, type, props);
  schema.registerDynamicFields(SchemaField.create(name, type, p, null));
}
...
@Override
public void inform(IndexSchema schema) {
  this.schema = schema;
  createDynamicCurrencyField(FIELD_SUFFIX_CURRENCY, fieldTypeCurrency);
  createDynamicCurrencyField(FIELD_SUFFIX_AMOUNT_RAW, fieldTypeAmountRaw);
}
```
What's interesting is that it appears to create entirely new dynamic
fields, `*____currency` and `*____amount_raw`, to catch everything under the
hood. These fields are not stored and use the default docValues, which is false.
Output from a LukeRequest similar to the test above:
```
KEY: *____currency
VALUE NAME: *____currency
FLAGS: [INDEXED, OMIT_NORMS]
KEY: *____amount_raw
VALUE NAME: *____amount_raw
FLAGS: [INDEXED, OMIT_NORMS]
KEY: *
VALUE NAME: *
FLAGS: [DOC_VALUES, OMIT_NORMS, OMIT_TF]
KEY: *.c
VALUE NAME: *.c
FLAGS: [INDEXED, STORED, OMIT_TF]
```
Note that only the catch-all has the docValues flag; the custom type's
currency and amount_raw subfields do not.
The short version is that it seems like the majority of the default types
manage their subfields more reasonably and therefore aren't a problem.
> Location field types cause DocValuesField appear more than once error
> ---------------------------------------------------------------------
>
> Key: METRON-1526
> URL: https://issues.apache.org/jira/browse/METRON-1526
> Project: Metron
> Issue Type: Bug
> Reporter: Ryan Merriman
> Assignee: Ryan Merriman
> Priority: Major
>
> While testing [https://github.com/apache/metron/pull/970] I get this error
> when creating a meta alert:
> {code:java}
> Error from server at http://10.0.2.15:8983/solr/bro: Exception writing
> document id bbc150f5-92f8-485d-93cc-11730c1edf31 to the index; possible
> analysis error: DocValuesField
> \"enrichments.geo.ip_dst_addr.location_point_0_coordinate\" appears more than
> once in this document (only one value is allowed per field){code}
> I tracked it down to the fact that multiple fields are returned for a
> location field. For example when a field named
> "enrichments.geo.ip_dst_addr.location_point" is configured in a schema, these
> fields are returned in a query:
> {code:java}
> {
> "enrichments.geo.ip_dst_addr.location_point_0_coordinate": "33.4499",
> "enrichments.geo.ip_dst_addr.location_point_1_coordinate": "-112.0712",
> "enrichments.geo.ip_dst_addr.location_point": "33.4499,-112.0712"
> }
> {code}
> We need a way to either suppress these extra fields when querying or remove
> them before updating a document.
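As a minimal sketch of the second option (removing the extra fields before updating a document), assume the document is held as a plain `Map` and the derived subfields always end in `_<n>_coordinate`, as in the example above. `LocationSubfieldFilter` and `withoutCoordinateSubfields` are hypothetical names, not part of Metron or Solr.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class LocationSubfieldFilter {

    // Assumption: derived subfields are named <field>_<digits>_coordinate,
    // matching the two keys shown in the issue's example.
    private static final Pattern COORDINATE_SUBFIELD =
        Pattern.compile(".*_\\d+_coordinate");

    /** Returns a copy of the document without the derived coordinate subfields. */
    static Map<String, Object> withoutCoordinateSubfields(Map<String, Object> doc) {
        Map<String, Object> cleaned = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            if (!COORDINATE_SUBFIELD.matcher(entry.getKey()).matches()) {
                cleaned.put(entry.getKey(), entry.getValue());
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("enrichments.geo.ip_dst_addr.location_point_0_coordinate", "33.4499");
        doc.put("enrichments.geo.ip_dst_addr.location_point_1_coordinate", "-112.0712");
        doc.put("enrichments.geo.ip_dst_addr.location_point", "33.4499,-112.0712");

        // Only the location_point entry should remain.
        System.out.println(withoutCoordinateSubfields(doc).keySet());
    }
}
```

Matching on the suffix alone would also drop any user-defined field that happens to end in `_<n>_coordinate`, so a real fix would probably need to consult the schema instead of the name.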
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)