[jira] [Commented] (SOLR-10574) Choose a default configset for Solr 7

JIRA Wed, 14 Jun 2017 12:42:42 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16049575#comment-16049575
 ]


Jan Høydahl commented on SOLR-10574:
------------------------------------

bq. Jan Hoydahl: to be present in schema: yes; used by default: no
Long-term I want no catch-all field at all, because no matter how much we 
document and try to educate, the reality is that the defaults (or at least the 
practices used by the defaults) will end up in production for a high percentage 
of installs.

Instead, let's consider giving the ootb default configsets the ability to 
search all fields automatically if neither {{df}} nor {{qf}} is specified. A 
potential fast-track solution is to extend {{SimpleQParserPlugin}} to interpret 
{{qf=*}} as a catch-all mode where it simply iterates over all indexed fields 
in the schema and searches across them. We could then add 
{{defType=simple&qf=*}} to the {{/select}} and {{/query}} handlers in the 
default configsets. Or we could make {{simple}} the new default parser instead 
of {{lucene}} (horrible name btw). This could of course be introduced in 7.x, 
with 7.0.0 starting out with the catch-all {{_text_}} field.
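
To make the {{qf=*}} idea concrete, here is a rough sketch in plain Python (not Solr code) of how a catch-all value could be expanded into the list of indexed fields. The {{SCHEMA}} dict and the {{expand_qf}} helper are hypothetical names made up for illustration; nothing like this exists in {{SimpleQParserPlugin}} today.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for a schema: field name -> properties.
SCHEMA: Dict[str, Dict[str, bool]] = {
    "id": {"indexed": True},
    "title": {"indexed": True},
    "body": {"indexed": True},
    "thumbnail": {"indexed": False},  # stored only, so not searchable
}


def expand_qf(qf: str, schema: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return the fields to query for a given qf value (illustrative only)."""
    if qf.strip() == "*":
        # Catch-all mode: every indexed field in the schema, in a stable order.
        return sorted(
            name for name, props in schema.items() if props.get("indexed", False)
        )
    # Otherwise treat qf as a whitespace-separated field list and pass it
    # through unchanged (boosts such as title^2 are not parsed in this sketch).
    return qf.split()
```

With the sample schema above, {{expand_qf("*", SCHEMA)}} yields the three indexed fields and skips {{thumbnail}}, while an explicit field list is returned as given.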

With a {{qf=*}} catch-all, the WARNING in the docs would instead need to say 
that {{qf}} should be tuned, or else queries may be too expensive for indices 
with many fields. Another issue with this approach arises for installs where 
the schema lists hundreds of fields but most docs in the index contain only a 
handful of fields. It might be possible to do a two-phase search: phase 1 
computes the fields in use for the doc set that remains after applying all 
fq's, and phase 2 searches across only those fields.
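
The two-phase idea could be sketched roughly as follows, again in plain Python over in-memory dicts rather than a real index. The {{two_phase_search}} function, the filter callables standing in for fq's, and the substring match are all assumptions made for illustration, not how Solr would implement it.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Doc = Dict[str, str]
Filter = Callable[[Doc], bool]  # stand-in for a single fq


def two_phase_search(
    docs: Iterable[Doc], filters: List[Filter], term: str
) -> Tuple[List[str], List[Doc]]:
    """Return (fields in use after filtering, docs matching term in those fields)."""
    # Phase 1: apply every filter, then collect the fields that actually
    # occur in the remaining doc set.
    candidates = [d for d in docs if all(f(d) for f in filters)]
    fields_in_use = sorted({name for d in candidates for name in d})

    # Phase 2: search only across the fields found in phase 1
    # (a case-insensitive substring match keeps the sketch simple).
    needle = term.lower()
    hits = [
        d
        for d in candidates
        if any(needle in d[name].lower() for name in fields_in_use if name in d)
    ]
    return fields_in_use, hits


docs = [
    {"id": "1", "type": "book", "title": "Solr in Action"},
    {"id": "2", "type": "book", "author": "Jan"},
    {"id": "3", "type": "film", "director": "Solr Fan"},
]
fields, hits = two_phase_search(docs, [lambda d: d.get("type") == "book"], "solr")
```

Here the {{director}} field never makes it into phase 2, because the only doc that carries it was removed by the filter in phase 1.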

> Choose a default configset for Solr 7
> -------------------------------------
>
>                 Key: SOLR-10574
>                 URL: https://issues.apache.org/jira/browse/SOLR-10574
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Blocker
>             Fix For: master (7.0)
>
>         Attachments: SOLR-10574.patch, SOLR-10574.patch, SOLR-10574.patch
>
>
> Currently, the data_driven_schema_configs is the default configset when 
> collections are created using the bin/solr script and no configset is 
> specified.
> However, that may not be the best choice. We need to decide which is the best 
> choice, out of the box, considering many users might create collections 
> without knowing about the concept of a configset going forward.
> (See also SOLR-10272)
> Proposed changes:
> # Remove data_driven_schema_configs and basic_configs
> # Introduce a combined configset, {{_default}} based on the above two 
> configsets.
> # Build a "toggleable" data driven functionality into {{_default}}
> Usage:
> # Create a collection (using _default configset)
> # Data driven / schemaless functionality is enabled by default; so just start 
> indexing your documents.
> # If you don't want data driven / schemaless, disable this behaviour: {code}
> curl http://host:8983/solr/coll1/config -d '{"set-user-property": 
> {"update.autoCreateFields":"false"}}'
> {code}
> # Create schema fields using schema API, and index documents



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
