kotman12 commented on code in PR #2382:
URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136


##########
solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java:
##########
@@ -0,0 +1,38 @@
+/*
+ *
+ *  * Licensed to the Apache Software Foundation (ASF) under one or more
+ *  * contributor license agreements.  See the NOTICE file distributed with
+ *  * this work for additional information regarding copyright ownership.
+ *  * The ASF licenses this file to You under the Apache License, Version 2.0
+ *  * (the "License"); you may not use this file except in compliance with
+ *  * the License.  You may obtain a copy of the License at
+ *  *
+ *  *     http://www.apache.org/licenses/LICENSE-2.0
+ *  *
+ *  * Unless required by applicable law or agreed to in writing, software
+ *  * distributed under the License is distributed on an "AS IS" BASIS,
+ *  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  * See the License for the specific language governing permissions and
+ *  * limitations under the License.
+ *
+ */
+
+package org.apache.lucene.monitor;
+
+import java.util.Set;
+
+public class MonitorFields {
+
+  public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_";
+  public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_";
+  public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_";
+  public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_";

Review Comment:
   > So I was exploring not specifically w.r.t. making the payload field name 
overridable but the other fields (query id, cache id, query) also and payload 
was just first to explore. Yes, optionally-choosable names would let users 
choose names/terminology to suit their use case, and perhaps might it also then 
reduce or remove the need for the aliasing functionality (which I haven't yet 
looked into further)?
   
   Ok I am on board with the configuration then.
   
   I don't necessarily think it can obviate aliasing because aliasing only 
applies to fields dynamically generated by the presearcher. These don't exist 
in `MonitorFields`. For instance, say you have a query `foo:bar` and you 
naturally have some schema definition that `foo` conforms to in your collection 
of _regular_ documents. You now want to create a collection of queries for 
reverse search and we want to make that as easy as possible. So you should 
be able to use the same schema (with the same query analyzers etc.) with the 
addition of a few control fields. One of those is `__....monitor_alias_*` so 
that when the presearcher is done tokenizing that query `foo:bar` it actually 
sends the token stream to the field `__....monitor_alias_foo`. So if your 
original `foo` field had all sorts of fancy configurations like `docValues` or 
`stored`, the presearcher generated field won't know about any of that and will 
instead give you a plain old indexed field. 
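   
   To make the routing above concrete, here is a minimal sketch, not the PR's actual code. The class `AliasSketch`, its methods, and the prefix `__example_alias_` are hypothetical stand-ins; the real prefix is whatever the `__....monitor_alias_*` dynamic field resolves to. It only shows the renaming step: terms the presearcher extracted for `foo` are re-keyed under an alias field instead of `foo` itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AliasSketch {
  // Hypothetical prefix standing in for the real dynamic alias field prefix.
  static final String EXAMPLE_ALIAS_PREFIX = "__example_alias_";

  // Build the alias field name for a schema field, e.g. "foo" -> prefix + "foo".
  static String aliasFor(String prefix, String schemaField) {
    return prefix + schemaField;
  }

  // Re-key presearcher terms (grouped by schema field) under their alias fields,
  // so the schema's own definition of each field is never touched.
  static Map<String, List<String>> route(
      String prefix, Map<String, List<String>> termsByField) {
    Map<String, List<String>> routed = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> entry : termsByField.entrySet()) {
      routed.put(aliasFor(prefix, entry.getKey()), entry.getValue());
    }
    return routed;
  }

  public static void main(String[] args) {
    Map<String, List<String>> terms = new LinkedHashMap<>();
    terms.put("foo", List.of("bar")); // terms extracted from the query foo:bar
    // prints {__example_alias_foo=[bar]}
    System.out.println(route(EXAMPLE_ALIAS_PREFIX, terms));
  }
}
```

   Because the alias field is matched by a dynamic field rule, it can be a plain indexed field regardless of how `foo` is configured in the schema.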
   
   Now because of what appears to be an implementation choice, Solr will 
actually let you write a field with a configuration that is different from the 
schema because the only validation is against the first _written_ field (to 
make sure that there is compatibility between the first document's field value 
and all the documents that come after). But relying on this feels hacky. Also, 
the schema discrepancy mentioned above would be a total blocker if you wanted 
to store queries alongside your documents in the same collection (because the 
first written document would be a "true" document with whatever fancy 
configuration you had defined in the schema).
   
   Finally, the multi-pass presearcher sometimes changes field names on the fly, 
which further motivated the dynamic aliasing approach. Now if there were ever a 
presearcher that gave you back a different _type_ of field, then we'd probably 
have to define another dynamic alias field, but that doesn't seem like 
something that will happen very often.
   
   > 
https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48
 -- and suffixed an _ underscore, is the suffixing needed or does it just make 
maybe debugging or so easier?
   
   I kind of just assumed, based on `_root_` and `_version_`, that `_`-wrapped 
fields are considered outside user space, at least by convention, so I 
wanted to be consistent.
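   
   As a rough illustration of how the constants in the diff resolve, here is a small sketch. It assumes the Lucene `QueryIndex.FIELDS` values at the linked lines are `_query_id`, `_cache_id`, and `_mq`; since those are package-private, they are redeclared in a hypothetical `FieldNameSketch` class rather than imported. The helper `isUnderscoreWrapped` is also hypothetical and only encodes the `_root_` / `_version_` convention mentioned above.

```java
final class FieldNameSketch {
  // Assumed values of Lucene's QueryIndex.FIELDS (redeclared for illustration).
  static final String LUCENE_QUERY_ID = "_query_id";
  static final String LUCENE_CACHE_ID = "_cache_id";
  static final String LUCENE_MQ = "_mq";

  // Same construction as MonitorFields in the diff: append a trailing underscore.
  static final String QUERY_ID = LUCENE_QUERY_ID + "_";
  static final String CACHE_ID = LUCENE_CACHE_ID + "_";
  static final String MONITOR_QUERY = LUCENE_MQ + "_";
  static final String PAYLOAD = LUCENE_MQ + "_payload_";

  // True when a name both starts and ends with '_', like _root_ or _version_.
  static boolean isUnderscoreWrapped(String name) {
    return name.length() > 2 && name.startsWith("_") && name.endsWith("_");
  }

  public static void main(String[] args) {
    for (String name : new String[] {QUERY_ID, CACHE_ID, MONITOR_QUERY, PAYLOAD}) {
      System.out.println(name + " wrapped=" + isUnderscoreWrapped(name));
    }
  }
}
```

   Under that assumption the unsuffixed Lucene names start with an underscore but do not end with one, so the suffix is what brings them in line with the convention.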



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

