Re: Managed schema vs schema.xml

2017-03-07 Thread Erick Erickson
I suggest we make additional comments on SOLR-10241. I created it as a
result of this discussion and anyone who takes it on would benefit
from the comments being made there.

Anyone can comment there; no special karma is required, although you
do have to create a login.  Judging by the interest this thread has
generated so far, this is definitely something that could stand some
clarification/code fixes...

Best,
Erick

On Tue, Mar 7, 2017 at 3:01 PM, Alexandre Rafalovitch
 wrote:
> Actually, the main cross-references are from the solrconfig.xml, and
> primarily from the Update Request Handler chain that creates the
> "schemaless" effect. Then, I think you also have highlighters, etc.
>
> I did that full analysis as a presentation at the last Solr
> Revolution: 
> https://www.slideshare.net/arafalov/rebuilding-solr-6-examples-layer-by-layer-lucenesolrrevolution-2016
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 7 March 2017 at 17:18, Shawn Heisey  wrote:
>> On 3/7/2017 1:32 PM, Phil Scadden wrote:
>>>
>>> I would have to say the "basic-config" seems distinctly more than basic.
>>> It is still a huge file. I thought perhaps I could delete every unused field
>>> type, but worried there were some "system" dependencies.
>>
>>
>> This is definitely true.  Solr example configs tend towards including
>> "everything and the kitchen sink".  Although this is good at illustrating
>> everything that Solr can do, it is also VERY overwhelming to new users.  I
>> have found that in my production configs, I tend to strip almost everything
>> out and make them very lean.  I have kept a number of the schema fieldType
>> definitions from the example, particularly those for basic data types, such
>> as numeric fields.
>>
>> Most of the dependencies in a schema will be contained within the schema
>> itself -- fieldTypes that are referenced by field definitions, etc.  There
>> are a few other possible dependencies, such as a default field parameter in
>> a search handler definition that lives in solrconfig.xml.
>>
>> Thanks,
>> Shawn
>>


Re: Managed schema vs schema.xml

2017-03-07 Thread Alexandre Rafalovitch
Actually, the main cross-references are from the solrconfig.xml, and
primarily from the Update Request Handler chain that creates the
"schemaless" effect. Then, I think you also have highlighters, etc.

I did that full analysis as a presentation at the last Solr
Revolution: 
https://www.slideshare.net/arafalov/rebuilding-solr-6-examples-layer-by-layer-lucenesolrrevolution-2016
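
To see which schema names a given solrconfig.xml refers to, a small script can list them. This is only a sketch: it assumes the references show up as <str name="df"> (a default field) and as <str name="fieldType"> or <str name="defaultFieldType"> inside the update processor chain, which is how the 6.x example configs appear to be laid out. The parameter lists and the sample config below are illustrative, not exhaustive.

```python
import xml.etree.ElementTree as ET

# Assumption: these <str name="..."> parameters carry schema names.
FIELD_PARAMS = {"df"}
TYPE_PARAMS = {"fieldType", "defaultFieldType"}


def schema_references(solrconfig_xml):
    """Return (field names, fieldType names) referenced from a solrconfig.xml string."""
    root = ET.fromstring(solrconfig_xml)
    fields, types = set(), set()
    for el in root.iter("str"):
        value = (el.text or "").strip()
        if not value:
            continue
        if el.get("name") in FIELD_PARAMS:
            fields.add(value)
        elif el.get("name") in TYPE_PARAMS:
            types.add(value)
    return fields, types


# Hypothetical, cut-down config used only to exercise the function.
SAMPLE_CONFIG = """
<config>
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults"><str name="df">_text_</str></lst>
  </requestHandler>
  <updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
    <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
      <str name="defaultFieldType">strings</str>
      <lst name="typeMapping">
        <str name="valueClass">java.lang.Boolean</str>
        <str name="fieldType">booleans</str>
      </lst>
    </processor>
  </updateRequestProcessorChain>
</config>
"""

if __name__ == "__main__":
    print(schema_references(SAMPLE_CONFIG))
```

Any field or fieldType that turns up here has to stay in the schema, or the matching part of solrconfig.xml has to go too.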

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 7 March 2017 at 17:18, Shawn Heisey  wrote:
> On 3/7/2017 1:32 PM, Phil Scadden wrote:
>>
>> I would have to say the "basic-config" seems distinctly more than basic.
>> It is still a huge file. I thought perhaps I could delete every unused field
>> type, but worried there were some "system" dependencies.
>
>
> This is definitely true.  Solr example configs tend towards including
> "everything and the kitchen sink".  Although this is good at illustrating
> everything that Solr can do, it is also VERY overwhelming to new users.  I
> have found that in my production configs, I tend to strip almost everything
> out and make them very lean.  I have kept a number of the schema fieldType
> definitions from the example, particularly those for basic data types, such
> as numeric fields.
>
> Most of the dependencies in a schema will be contained within the schema
> itself -- fieldTypes that are referenced by field definitions, etc.  There
> are a few other possible dependencies, such as a default field parameter in
> a search handler definition that lives in solrconfig.xml.
>
> Thanks,
> Shawn
>


Re: Managed schema vs schema.xml

2017-03-07 Thread Shawn Heisey

On 3/7/2017 1:32 PM, Phil Scadden wrote:
I would have to say the "basic-config" seems distinctly more than 
basic. It is still a huge file. I thought perhaps I could delete every 
unused field type, but worried there were some "system" dependencies.


This is definitely true.  Solr example configs tend towards including 
"everything and the kitchen sink".  Although this is good at 
illustrating everything that Solr can do, it is also VERY overwhelming 
to new users.  I have found that in my production configs, I tend to 
strip almost everything out and make them very lean.  I have kept a 
number of the schema fieldType definitions from the example, 
particularly those for basic data types, such as numeric fields.


Most of the dependencies in a schema will be contained within the schema 
itself -- fieldTypes that are referenced by field definitions, etc.  
There are a few other possible dependencies, such as a default field 
parameter in a search handler definition that lives in solrconfig.xml.
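
A quick way to find candidates for deletion is to compare the fieldType names a schema defines with the types its field and dynamicField definitions actually use. The sketch below assumes the schema file is plain XML with those element names; the sample schema is made up. It does not look at solrconfig.xml, and it does not follow references from one fieldType to another, so treat its output as a list to review rather than a list to delete.

```python
import xml.etree.ElementTree as ET


def unused_field_types(schema_xml):
    """Return names of fieldTypes that no field or dynamicField refers to."""
    root = ET.fromstring(schema_xml)
    defined = {ft.get("name") for ft in root.iter("fieldType")}
    used = {
        el.get("type")
        for tag in ("field", "dynamicField")
        for el in root.iter(tag)
    }
    return defined - used


# Hypothetical three-type schema used only to exercise the function.
SAMPLE_SCHEMA = """
<schema name="demo" version="1.6">
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="tlong" class="solr.TrieLongField"/>
  <fieldType name="text_general_rev" class="solr.TextField"/>
  <field name="id" type="string" indexed="true" stored="true"/>
  <dynamicField name="*_l" type="tlong" indexed="true" stored="true"/>
</schema>
"""

if __name__ == "__main__":
    print(sorted(unused_field_types(SAMPLE_SCHEMA)))
```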


Thanks,
Shawn



Re: Managed schema vs schema.xml

2017-03-07 Thread Walter Underwood
Maybe this is expert stuff, but we keep our schema, solrconfig, and everything 
else checked into source control.

I wrote a Python thingy to hit the cluster through the load balancer, get the 
zkHost string from status, upload the files to zookeeper (kazoo is a nice 
library), link the config, then do an async reload.

I’ve been thinking about time stamping the config directories so I can roll 
back to a previous config if the reload fails.
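
A rough sketch of that workflow, for anyone who wants to try something similar. It is not the actual script: the function names are made up, the zkHost string is passed in rather than read from the cluster, linking the config to the collection is left out, and it assumes configs live under /configs/<name> in ZooKeeper. kazoo is a third-party library and is only imported when the upload function is called.

```python
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlencode


def timestamped_config_name(base, now=None):
    """Build a config name such as 'myconf-20170307T124700Z' to allow rollback."""
    now = now or datetime.now(timezone.utc)
    return f"{base}-{now:%Y%m%dT%H%M%SZ}"


def plan_upload(conf_dir, config_name):
    """Map every local config file to the znode path it should be written to."""
    conf_dir = Path(conf_dir)
    return {
        path: f"/configs/{config_name}/{path.relative_to(conf_dir).as_posix()}"
        for path in sorted(conf_dir.rglob("*"))
        if path.is_file()
    }


def upload(zk_hosts, plan):
    """Write the planned files to ZooKeeper, creating or overwriting znodes."""
    from kazoo.client import KazooClient  # third-party: pip install kazoo

    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    try:
        for local_path, znode in plan.items():
            data = local_path.read_bytes()
            if zk.exists(znode):
                zk.set(znode, data)
            else:
                zk.create(znode, data, makepath=True)
    finally:
        zk.stop()


def async_reload_url(base_url, collection, request_id):
    """Collections API URL for an asynchronous RELOAD of one collection."""
    query = urlencode({"action": "RELOAD", "name": collection, "async": request_id})
    return f"{base_url}/admin/collections?{query}"
```

With timestamped names, rolling back would amount to pointing the collection at the previous config name and reloading again.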

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 7, 2017, at 12:47 PM, OTH  wrote:
> 
> In the reference guide, in the chapter named "The Well Configured Solr
> Instance", it says (I'm copying+pasting from the PDF version) :
> 
> Switching from Managed Schema to Manually Edited schema.xml
>> If you have started Solr with managed schema enabled and you would like to
>> switch to manually editing a schema.xml file, you should take the following
>> steps:
>> Rename the managed-schema file to schema.xml.
>> Modify solrconfig.xml to replace the schemaFactory class.
>> Remove any ManagedIndexSchemaFactory definition if it exists.
>> Add a ClassicIndexSchemaFactory definition as shown above
>> Reload the core(s).
>> If you are using SolrCloud, you may need to modify the files via
>> ZooKeeper. The bin/solr script provides an easy way to download the files
>> from ZooKeeper and upload them back after edits. See the section ZooKeeper
>> Operations for more information.
>> IndexConfig in SolrConfig
>> The <indexConfig> section of solrconfig.xml defines low-level behavior of
>> the Lucene index writers. By default, the settings are commented out in the
>> sample solrconfig.xml included with Solr, which means the defaults are used.
>> In most cases, the defaults are fine.
>> 
>> ...
>> 
>> Parameters covered in this section:
>> Writing New Segments
>> Merging Index Segments
>> Compound File Segments
>> Index Locks
>> Other Indexing Settings
>> Writing New Segments
>> ramBufferSizeMB
>> Once accumulated document updates exceed this much memory space (defined
>> in megabytes), then the pending updates are flushed. This can also create
>> new segments or trigger a merge. Using this setting is generally preferable
>> to maxBufferedDocs. If both maxBufferedDocs and ramBufferSizeMB are set in
>> solrconfig.xml, then a flush will occur when either limit is reached. The
>> default is 100Mb.
>> <ramBufferSizeMB>100</ramBufferSizeMB>
>> maxBufferedDocs
>> Sets the number of document updates to buffer in memory before they are
>> flushed as a new segment. This may also trigger a merge. The default Solr
>> configuration sets to flush by RAM usage (ramBufferSizeMB).
>> <maxBufferedDocs>1000</maxBufferedDocs>
>> useCompoundFile
>> Controls whether newly written (and not yet merged) index segments should
>> use the Compound File Segment format. The default is false.
>> <useCompoundFile>false</useCompoundFile>
>> To have full control over your schema.xml file, you may also want to
>> disable schema guessing, which allows unknown fields to be added to the
>> schema during indexing. The properties that enable this feature are
>> discussed in the section Schemaless Mode
> 
> 
> On Wed, Mar 8, 2017 at 1:32 AM, Phil Scadden  wrote:
> 
>> I would second that guide could be clearer on that. I read and reread
>> several times trying to get my head around the schema.xml/managed-schema
>> bit. I came away from first cursory reading with the idea that
>> managed-schema was mostly for schema-less mode and only after some stuff
>> ups and puzzling over comments in the basic-config schema file itself did I
>> go back for more careful re-read. I am still not sure that I have got all
>> the nuances. My understanding is:
>> 
>> If you don’t want ability to edit it via admin UI or config api, rename to
>> schema.xml. Unclear whether you have to make changes to other configs to do
>> this. Also unclear to me whether there was any upside at all to using
>> schema.xml? Why degrade functionality? Does the capacity for schema.xml
>> only exist for backward compatibility?
>> 
>> If you want to run schema-less, you have to use managed-schema? (I
>> didn’t delve too deep into this).
>> 
>> In the end, I used basic-config to create core and then hacked
>> managed-schema from there.
>> 
>> 
>> I would have to say the "basic-config" seems distinctly more than basic.
>> It is still a huge file. I thought 

Re: Managed schema vs schema.xml

2017-03-07 Thread OTH
In the reference guide, in the chapter named "The Well Configured Solr
Instance", it says (I'm copying+pasting from the PDF version) :

Switching from Managed Schema to Manually Edited schema.xml
> If you have started Solr with managed schema enabled and you would like to
> switch to manually editing a schema.xml file, you should take the following
> steps:
> Rename the managed-schema file to schema.xml.
> Modify solrconfig.xml to replace the schemaFactory class.
> Remove any ManagedIndexSchemaFactory definition if it exists.
> Add a ClassicIndexSchemaFactory definition as shown above
> Reload the core(s).
> If you are using SolrCloud, you may need to modify the files via
> ZooKeeper. The bin/solr script provides an easy way to download the files
> from ZooKeeper and upload them back after edits. See the section ZooKeeper
> Operations for more information.
> IndexConfig in SolrConfig
> The <indexConfig> section of solrconfig.xml defines low-level behavior of
> the Lucene index writers. By default, the settings are commented out in the
> sample solrconfig.xml included with Solr, which means the defaults are used.
> In most cases, the defaults are fine.
> 
> ...
> 
> Parameters covered in this section:
> Writing New Segments
> Merging Index Segments
> Compound File Segments
> Index Locks
> Other Indexing Settings
> Writing New Segments
> ramBufferSizeMB
> Once accumulated document updates exceed this much memory space (defined
> in megabytes), then the pending updates are flushed. This can also create
> new segments or trigger a merge. Using this setting is generally preferable
> to maxBufferedDocs. If both maxBufferedDocs and ramBufferSizeMB are set in
> solrconfig.xml, then a flush will occur when either limit is reached. The
> default is 100Mb.
> <ramBufferSizeMB>100</ramBufferSizeMB>
> maxBufferedDocs
> Sets the number of document updates to buffer in memory before they are
> flushed as a new segment. This may also trigger a merge. The default Solr
> configuration sets to flush by RAM usage (ramBufferSizeMB).
> <maxBufferedDocs>1000</maxBufferedDocs>
> useCompoundFile
> Controls whether newly written (and not yet merged) index segments should
> use the Compound File Segment format. The default is false.
> <useCompoundFile>false</useCompoundFile>
> To have full control over your schema.xml file, you may also want to
> disable schema guessing, which allows unknown fields to be added to the
> schema during indexing. The properties that enable this feature are
> discussed in the section Schemaless Mode
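
The switch described in the steps quoted above can be sketched in a few lines of Python. This is an illustration under assumptions, not a supported tool: the function names are made up, it expects <schemaFactory> to be a direct child of <config>, and the standard-library parser drops XML comments when it rewrites the file, so a hand edit may be preferable for a heavily commented solrconfig.xml. The core still has to be reloaded afterwards.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def use_classic_schema_factory(solrconfig_xml):
    """Return solrconfig.xml text with any schemaFactory replaced by the classic one."""
    root = ET.fromstring(solrconfig_xml)
    # Remove any existing definition (for example ManagedIndexSchemaFactory).
    for existing in root.findall("schemaFactory"):
        root.remove(existing)
    # Add the classic factory definition.
    ET.SubElement(root, "schemaFactory", {"class": "ClassicIndexSchemaFactory"})
    return ET.tostring(root, encoding="unicode")


def switch_conf_dir(conf_dir):
    """Rename managed-schema to schema.xml and rewrite solrconfig.xml in conf_dir."""
    conf_dir = Path(conf_dir)
    (conf_dir / "managed-schema").rename(conf_dir / "schema.xml")
    config_path = conf_dir / "solrconfig.xml"
    updated = use_classic_schema_factory(config_path.read_text(encoding="utf-8"))
    config_path.write_text(updated, encoding="utf-8")
```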


On Wed, Mar 8, 2017 at 1:32 AM, Phil Scadden  wrote:

> I would second that guide could be clearer on that. I read and reread
> several times trying to get my head around the schema.xml/managed-schema
> bit. I came away from first cursory reading with the idea that
> managed-schema was mostly for schema-less mode and only after some stuff
> ups and puzzling over comments in the basic-config schema file itself did I
> go back for more careful re-read. I am still not sure that I have got all
> the nuances. My understanding is:
>
> If you don’t want ability to edit it via admin UI or config api, rename to
> schema.xml. Unclear whether you have to make changes to other configs to do
> this. Also unclear to me whether there was any upside at all to using
> schema.xml? Why degrade functionality? Does the capacity for schema.xml
> only exist for backward compatibility?
>
> If you want to run schema-less, you have to use managed-schema? (I
> didn’t delve too deep into this).
>
> In the end, I used basic-config to create core and then hacked
> managed-schema from there.
>
>
> I would have to say the "basic-config" seems distinctly more than basic.
> It is still a huge file. I thought perhaps I could delete every unused
> field type, but worried there were some "system" dependencies. Ie if you
> want *target type wildcard queries do you need to have text_general_reverse
> and a copy to it? If you always explicitly set only defined fields in a
> custom indexer, then can you dump the whole dynamic fields bit?
> Notice: This email and any attachments are confidential and may not be
> used, published or redistributed without the prior written consent of the
> Institute of Geological and Nuclear Sciences Limited (GNS Science). If
> received in error please destroy and immediately notify GNS Science. Do not
> copy or disclose the contents.
>


Re: Managed schema vs schema.xml

2017-03-07 Thread Erick Erickson
See SOLR-10241 I just opened for discussion. My first impulse (well
actually second) is to _not_ encourage anyone to hand-edit managed
schema, and especially not put that in the ref guide.

But perhaps put the classic schema factory in a comment in
basic_configs and direct people there (and maybe even from the ref
guide) if they want to use the classic (hand-edited) schema.

So I think the direction here is to say, basically:
1> if you want to hand-edit, use classic schema factory, see the
comments in configset XX
2> Otherwise use managed schema and modify it via the REST API.

and leave out mention of hand-editing managed-schema, that's expert level stuff.

FWIW,
Erick

Which is entirely separate from clarifying the ref guide

On Tue, Mar 7, 2017 at 12:11 PM, Alexandre Rafalovitch
 wrote:
> On 7 March 2017 at 15:02, OTH  wrote:
>> Specifically, that 'managed-schema' could indeed be modified by hand, or
>> even that what the HTTP API is doing is actually modifying this file.
>
> Thank you for the specific feedback. That is something we should fold
> into the Guide as you are not the only one asking this specific aspect
> of the question. And seeing that you've read the guide first, it is
> obvious that the question is not fully answered.
>
> Regards,
>Alex.
>
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced


RE: Managed schema vs schema.xml

2017-03-07 Thread Phil Scadden
I would second that the guide could be clearer on that. I read and reread it 
several times trying to get my head around the schema.xml/managed-schema bit. 
I came away from a first cursory reading with the idea that managed-schema was 
mostly for schema-less mode, and only after some stuff-ups and puzzling over 
comments in the basic-config schema file itself did I go back for a more 
careful re-read. I am still not sure that I have got all the nuances. My 
understanding is:

If you don’t want the ability to edit it via the admin UI or config API, rename 
it to schema.xml. It is unclear whether you have to make changes to other 
configs to do this. It is also unclear to me whether there is any upside at all 
to using schema.xml. Why degrade functionality? Does the capacity for 
schema.xml only exist for backward compatibility?

If you want to run schema-less, you have to use managed-schema? (I didn’t 
delve too deep into this.)

In the end, I used basic-config to create the core and then hacked 
managed-schema from there.


I would have to say the "basic-config" seems distinctly more than basic. It is 
still a huge file. I thought perhaps I could delete every unused field type, 
but worried there were some "system" dependencies. I.e., if you want 
*target-type wildcard queries, do you need to have text_general_reverse and a 
copy to it? If you always explicitly set only defined fields in a custom 
indexer, then can you dump the whole dynamic fields bit?


Re: Managed schema vs schema.xml

2017-03-07 Thread Alexandre Rafalovitch
On 7 March 2017 at 15:02, OTH  wrote:
> Specifically, that 'managed-schema' could indeed be modified by hand, or
> even that what the HTTP API is doing is actually modifying this file.

Thank you for the specific feedback. That is something we should fold
into the Guide as you are not the only one asking this specific aspect
of the question. And seeing that you've read the guide first, it is
obvious that the question is not fully answered.

Regards,
   Alex.


http://www.solr-start.com/ - Resources for Solr users, new and experienced


Re: Managed schema vs schema.xml

2017-03-07 Thread OTH
Hi,

Thanks, I should've consulted this guide more thoroughly.  I actually had
encountered this section when reading the guide, but somehow forgot about
it when asking this question.  I think it doesn't clarify some things very
well, which could leave a beginner a bit confused.

Specifically, that 'managed-schema' could indeed be modified by hand, or
even that what the HTTP API is doing is actually modifying this file.
When I was first checking out Solr, I saw this section and remembered
thinking how verbose it was to make changes this way, because I saw on some
website how someone was making changes to a 'schema.xml' file instead, and
that seemed easier.  This file was supposed to be in 'conf' but I couldn't
find it... so I tried making the changes to managed-schema instead and it
worked.  But then I also read somewhere that you aren't supposed to do
that, so I wasn't sure how to do things going forward.

Anyways, I'm clearer now that the managed-schema does safely allow
hand-edits if done properly, which might in some cases be easier than the
HTTP calls; and at the same time it offers the HTTP API as an option as
well when needed / preferred.

Much thanks

On Tue, Mar 7, 2017 at 9:50 PM, Alexandre Rafalovitch 
wrote:

> Yes, it has been asked many times and has been answered both on the
> list and in the - awesome - Reference Guide. I'd recommend reading
> that and then coming back again with more specific question:
> https://cwiki.apache.org/confluence/display/solr/Overview+of+Documents%2C+Fields%2C+and+Schema+Design
>
> One confusion to clarify though. API is HTTP API, Admin UI just uses
> it and does not - yet - expose everything possible. You can always
> just hit Solr directly for the missing bits. Again, RTARG (.. Awesome
> Reference Guide) and then come back with specifics:
> https://cwiki.apache.org/confluence/display/solr/Schema+API
>
> Regards,
>Alex.
>
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 7 March 2017 at 11:41, OTH  wrote:
> > Hello
> >
> > I'm sure this has been asked many times but I'm having some confusion here.
> >
> > I understand that managed-schema is not supposed to be edited by hand but
> > only via the "API".  All I understand about this "API" however, is that it
> > may be referring to the "Schema" page in the Solr browser-based Admin.
> >
> > However, in this "Schema" page, it provides options for "Add Field", "Add
> > Dynamic Field", "Add Copy Field"; but when I was trying to add a
> > "fieldType", I couldn't find any way to do this from this web page.
> >
> > So I instead edited the managed-schema file by hand, which I understand can
> > be problematic if the schema is ever edited via the API later on?
> >
> > I am using v. 6.4.1; when I create a new core, it creates the
> > managed-schema file in the 'conf' folder.  Is there any way to use the
> > older 'schema.xml' format instead?  Because there seems to be more
> > documentation available for that, and like I describe, the browser API
> > seems to perhaps be lacking.
> >
> > If so - what do users usually prefer; schema.xml or managed-schema?  (I'm
> > aware this depends on individual preference, but would be nice to get
> > others' feedback.)
> >
> > Thanks
>


Re: Managed schema vs schema.xml

2017-03-07 Thread Alexandre Rafalovitch
Yes, it has been asked many times and has been answered both on the
list and in the - awesome - Reference Guide. I'd recommend reading
that and then coming back again with more specific question:
https://cwiki.apache.org/confluence/display/solr/Overview+of+Documents%2C+Fields%2C+and+Schema+Design

One confusion to clarify though. API is HTTP API, Admin UI just uses
it and does not - yet - expose everything possible. You can always
just hit Solr directly for the missing bits. Again, RTARG (.. Awesome
Reference Guide) and then come back with specifics:
https://cwiki.apache.org/confluence/display/solr/Schema+API

Regards,
   Alex.


http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 7 March 2017 at 11:41, OTH  wrote:
> Hello
>
> I'm sure this has been asked many times but I'm having some confusion here.
>
> I understand that managed-schema is not supposed to be edited by hand but
> only via the "API".  All I understand about this "API" however, is that it
> may be referring to the "Schema" page in the Solr browser-based Admin.
>
> However, in this "Schema" page, it provides options for "Add Field", "Add
> Dynamic Field", "Add Copy Field"; but when I was trying to add a
> "fieldType", I couldn't find any way to do this from this web page.
>
> So I instead edited the managed-schema file by hand, which I understand can
> be problematic if the schema is ever edited via the API later on?
>
> I am using v. 6.4.1; when I create a new core, it creates the
> managed-schema file in the 'conf' folder.  Is there any way to use the
> older 'schema.xml' format instead?  Because there seems to be more
> documentation available for that, and like I describe, the browser API
> seems to perhaps be lacking.
>
> If so - what do users usually prefer; schema.xml or managed-schema?  (I'm
> aware this depends on individual preference, but would be nice to get
> others' feedback.)
>
> Thanks


Re: Managed schema vs schema.xml

2017-03-07 Thread OTH
Hi,

Thanks, that sufficiently answers the question.
It's especially good to know now that hand-editing is fine, as long as it's
separated from API calls with restarts in between.

Thanks

On Tue, Mar 7, 2017 at 9:57 PM, Shawn Heisey  wrote:

> On 3/7/2017 9:41 AM, OTH wrote:
> > I understand that managed-schema is not supposed to be edited by hand but
> > only via the "API".  All I understand about this "API" however, is that it
> > may be referring to the "Schema" page in the Solr browser-based Admin.
> >
> > However, in this "Schema" page, it provides options for "Add Field", "Add
> > Dynamic Field", "Add Copy Field"; but when I was trying to add a
> > "fieldType", I couldn't find any way to do this from this web page.
>
> The schema page in the admin UI is not actually the Schema API, but it
> USES the Schema API.  The admin UI is a javascript app that runs in your
> browser and makes Solr API requests.  Admin UI URLs are useless outside
> of a full browser.
>
> > So I instead edited the managed-schema file by hand, which I understand can
> > be problematic if the schema is ever edited via the API later on?
>
> Hand-editing is only problematic if you mix those edits with using the
> API and forget to reload or restart after a hand-edit and before using
> the API.  If you are careful to reload/restart before switching editing
> methods, there will be no problems.
>
> > I am using v. 6.4.1; when I create a new core, it creates the
> > managed-schema file in the 'conf' folder.  Is there any way to use the
> > older 'schema.xml' format instead?  Because there seems to be more
> > documentation available for that, and like I describe, the browser API
> > seems to perhaps be lacking.
>
> The "format" of the schema never changes.  It is exactly the same with
> either file.  It is the filename that is different.  Also, the managed
> schema allows the Schema API to be used, so you can edit it with HTTP
> requests.  If you switch to the Classic schema, then it will go back to
> schema.xml.  Depending on which example configuration you start with,
> switching back to Classic may require more config edits beyond just
> changing the schema factory.  There are additional features Solr can use
> that rely on the managed schema.
>
> > If so - what do users usually prefer; schema.xml or managed-schema?  (I'm
> > aware this depends on individual preference, but would be nice to get
> > others' feedback.)
>
> As for what users prefer, I do not know.  I can tell you that the
> default schema factory has been the managed schema since version 5.5,
> and all example configs since that version are using it.  When I upgrade
> to a 6.x version in production, I plan on keeping the managed schema,
> because it's good to go with defaults unless there's a good reason not
> to, but I will continue to hand-edit for all changes.
>
> Thanks,
> Shawn
>
>


Re: Managed schema vs schema.xml

2017-03-07 Thread Ivan Bianchi
Hi OTH,

I personally prefer to use the classic *schema.xml* file, as I feel it's
better for creating a core with the desired fields than dealing with API
calls.

You can use it by specifying the schemaFactory class as
ClassicIndexSchemaFactory in solrconfig.xml, as follows:

<schemaFactory class="ClassicIndexSchemaFactory"/>

Best regards,
Ivan

2017-03-07 17:41 GMT+01:00 OTH :

> Hello
>
> I'm sure this has been asked many times but I'm having some confusion here.
>
> I understand that managed-schema is not supposed to be edited by hand but
> only via the "API".  All I understand about this "API" however, is that it
> may be referring to the "Schema" page in the Solr browser-based Admin.
>
> However, in this "Schema" page, it provides options for "Add Field", "Add
> Dynamic Field", "Add Copy Field"; but when I was trying to add a
> "fieldType", I couldn't find any way to do this from this web page.
>
> So I instead edited the managed-schema file by hand, which I understand can
> be problematic if the schema is ever edited via the API later on?
>
> I am using v. 6.4.1; when I create a new core, it creates the
> managed-schema file in the 'conf' folder.  Is there any way to use the
> older 'schema.xml' format instead?  Because there seems to be more
> documentation available for that, and like I describe, the browser API
> seems to perhaps be lacking.
>
> If so - what do users usually prefer; schema.xml or managed-schema?  (I'm
> aware this depends on individual preference, but would be nice to get
> others' feedback.)
>
> Thanks
>



-- 
Ivan


Re: Managed schema vs schema.xml

2017-03-07 Thread Shawn Heisey
On 3/7/2017 9:41 AM, OTH wrote:
> I understand that managed-schema is not supposed to be edited by hand but
> only via the "API".  All I understand about this "API" however, is that it
> may be referring to the "Schema" page in the Solr browser-based Admin.
>
> However, in this "Schema" page, it provides options for "Add Field", "Add
> Dynamic Field", "Add Copy Field"; but when I was trying to add a
> "fieldType", I couldn't find any way to do this from this web page.

The schema page in the admin UI is not actually the Schema API, but it
USES the Schema API.  The admin UI is a javascript app that runs in your
browser and makes Solr API requests.  Admin UI URLs are useless outside
of a full browser.

> So I instead edited the managed-schema file by hand, which I understand can
> be problematic if the schema is ever edited via the API later on?

Hand-editing is only problematic if you mix those edits with using the
API and forget to reload or restart after a hand-edit and before using
the API.  If you are careful to reload/restart before switching editing
methods, there will be no problems.
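
For a standalone core, the reload after a hand-edit can be triggered over HTTP with the CoreAdmin API. A minimal sketch, assuming Solr is reachable at a base URL such as http://localhost:8983/solr and the core is called "mycore" (both placeholders); in SolrCloud the Collections API RELOAD action would be the one to use instead.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def core_reload_url(base_url, core):
    """Build the CoreAdmin URL that reloads one core."""
    query = urlencode({"action": "RELOAD", "core": core})
    return f"{base_url}/admin/cores?{query}"


def reload_core(base_url, core):
    """Call the reload URL; returns True when Solr answers with HTTP 200."""
    with urlopen(core_reload_url(base_url, core)) as response:
        return response.status == 200


if __name__ == "__main__":
    # Placeholder values; this only prints the URL and does not contact Solr.
    print(core_reload_url("http://localhost:8983/solr", "mycore"))
```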

> I am using v. 6.4.1; when I create a new core, it creates the
> managed-schema file in the 'conf' folder.  Is there any way to use the
> older 'schema.xml' format instead?  Because there seems to be more
> documentation available for that, and like I describe, the browser API
> seems to perhaps be lacking.

The "format" of the schema never changes.  It is exactly the same with
either file.  It is the filename that is different.  Also, the managed
schema allows the Schema API to be used, so you can edit it with HTTP
requests.  If you switch to the Classic schema, then it will go back to
schema.xml.  Depending on which example configuration you start with,
switching back to Classic may require more config edits beyond just
changing the schema factory.  There are additional features Solr can use
that rely on the managed schema.
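
Since the admin UI has no "add fieldType" button, that is a case where you would send the Schema API request yourself. The sketch below builds an add-field-type command and posts it as JSON to the core's /schema endpoint. The base URL, the core name, and the "text_demo" type are placeholders, and the helper names are made up for illustration.

```python
import json
from urllib.request import Request, urlopen


def add_field_type_command(name, field_class, **attrs):
    """Build the JSON body for a Schema API add-field-type command."""
    return {"add-field-type": {"name": name, "class": field_class, **attrs}}


def post_schema_command(base_url, core, command):
    """POST one command to <base_url>/<core>/schema and return the parsed reply."""
    request = Request(
        f"{base_url}/{core}/schema",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder field type; printing the body does not contact Solr.
    command = add_field_type_command(
        "text_demo",
        "solr.TextField",
        positionIncrementGap="100",
        analyzer={"tokenizer": {"class": "solr.StandardTokenizerFactory"}},
    )
    print(json.dumps(command, indent=2))
```

Calling post_schema_command("http://localhost:8983/solr", "mycore", command) would send it; the result ends up in the same managed-schema file that a hand-edit would touch.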

> If so - what do users usually prefer; schema.xml or managed-schema?  (I'm
> aware this depends on individual preference, but would be nice to get
> others' feedback.)

As for what users prefer, I do not know.  I can tell you that the
default schema factory has been the managed schema since version 5.5,
and all example configs since that version are using it.  When I upgrade
to a 6.x version in production, I plan on keeping the managed schema,
because it's good to go with defaults unless there's a good reason not
to, but I will continue to hand-edit for all changes.

Thanks,
Shawn