RE: Schema API specifying different analysers for query and index

2021-03-02 Thread ufuk yılmaz
It worked! Thanks Mr. Rafalovitch. I just removed the "type": "query" keys from
the JSON and used indexAnalyzer and queryAnalyzer in place of the analyzer JSON
node.
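For the record, the working request body presumably ended up looking like the sketch below. This is a reconstruction from the string_ci type quoted later in the thread, not the poster's actual payload; which analyzer got which tokenizer is inferred from their failed attempt.

```python
import json

# Reconstructed replace-field-type body: the single "analyzer" node with
# "type" keys is replaced by separate indexAnalyzer/queryAnalyzer nodes.
# Field-type details are copied from the thread; POSTing it (curl, requests,
# SolrJ, ...) is left out here.
payload = {
    "replace-field-type": {
        "name": "string_ci",
        "class": "solr.TextField",
        "sortMissingLast": True,
        "omitNorms": True,
        "stored": True,
        "docValues": False,
        "indexAnalyzer": {
            "tokenizer": {"class": "solr.KeywordTokenizerFactory"},
            "filters": [{"class": "solr.LowerCaseFilterFactory"}],
        },
        "queryAnalyzer": {
            "tokenizer": {"class": "solr.StandardTokenizerFactory"},
            "filters": [{"class": "solr.LowerCaseFilterFactory"}],
        },
    }
}

body = json.dumps(payload, indent=2)
print(body)
```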

Sent from Mail for Windows 10

From: Alexandre Rafalovitch
Sent: 03 March 2021 01:19
To: solr-user
Subject: Re: Schema API specifying different analysers for query and index

RefGuide gives this for Adding, I would hope the Replace would be similar:

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field-type":{
 "name":"myNewTextField",
 "class":"solr.TextField",
 "indexAnalyzer":{
"tokenizer":{
   "class":"solr.PathHierarchyTokenizerFactory",
   "delimiter":"/" }},
 "queryAnalyzer":{
    "tokenizer":{
   "class":"solr.KeywordTokenizerFactory" }}}
}' http://localhost:8983/solr/gettingstarted/schema

So, indexAnalyzer/queryAnalyzer, rather than array:
https://lucene.apache.org/solr/guide/8_8/schema-api.html#add-a-new-field-type

Hope this works,
Alex.
P.S. Also check whether you are using the matching API and V1/V2 endpoint.

On Tue, 2 Mar 2021 at 15:25, ufuk yılmaz  wrote:
>
> Hello,
>
> I’m trying to change a field’s query analysers. The following works but it 
> replaces both index and query type analysers:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzer": {
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }
> }
> }
>
> I tried to change the analyzer field to analyzers, to specify different
> analysers for query and index, but it gave an error:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzers": [{
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> },{
> "type": "index",
> "tokenizer": {
> "class": "solr.KeywordTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }]
> }
> }
>
> "errorMessages":["Plugin init failure for [schema.xml]
> "msg":"error processing commands",...
>
> How can I specify different analyzers for the query and index types when
> using the Schema API?
>
> Sent from Mail for Windows 10
>



Re: Schema API specifying different analysers for query and index

2021-03-02 Thread Alexandre Rafalovitch
RefGuide gives this for Adding, I would hope the Replace would be similar:

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field-type":{
 "name":"myNewTextField",
 "class":"solr.TextField",
 "indexAnalyzer":{
"tokenizer":{
   "class":"solr.PathHierarchyTokenizerFactory",
   "delimiter":"/" }},
 "queryAnalyzer":{
    "tokenizer":{
   "class":"solr.KeywordTokenizerFactory" }}}
}' http://localhost:8983/solr/gettingstarted/schema

So, indexAnalyzer/queryAnalyzer, rather than array:
https://lucene.apache.org/solr/guide/8_8/schema-api.html#add-a-new-field-type

Hope this works,
Alex.
P.S. Also check whether you are using the matching API and V1/V2 endpoint.

On Tue, 2 Mar 2021 at 15:25, ufuk yılmaz  wrote:
>
> Hello,
>
> I’m trying to change a field’s query analysers. The following works but it 
> replaces both index and query type analysers:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzer": {
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }
> }
> }
>
> I tried to change the analyzer field to analyzers, to specify different
> analysers for query and index, but it gave an error:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzers": [{
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> },{
> "type": "index",
> "tokenizer": {
> "class": "solr.KeywordTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }]
> }
> }
>
> "errorMessages":["Plugin init failure for [schema.xml]
> "msg":"error processing commands",...
>
> How can I specify different analyzers for the query and index types when
> using the Schema API?
>
> Sent from Mail for Windows 10
>


Schema API specifying different analysers for query and index

2021-03-02 Thread ufuk yılmaz
Hello,

I’m trying to change a field’s query analysers. The following works but it 
replaces both index and query type analysers:

{
"replace-field-type": {
"name": "string_ci",
"class": "solr.TextField",
"sortMissingLast": true,
"omitNorms": true,
"stored": true,
"docValues": false,
"analyzer": {
"type": "query",
"tokenizer": {
"class": "solr.StandardTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
}
]
}
}
}

I tried to change the analyzer field to analyzers, to specify different
analysers for query and index, but it gave an error:

{
"replace-field-type": {
"name": "string_ci",
"class": "solr.TextField",
"sortMissingLast": true,
"omitNorms": true,
"stored": true,
"docValues": false,
"analyzers": [{
"type": "query",
"tokenizer": {
"class": "solr.StandardTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
}
]
},{
"type": "index",
"tokenizer": {
    "class": "solr.KeywordTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
}
]
}]
}
}

"errorMessages":["Plugin init failure for [schema.xml]
"msg":"error processing commands",...

How can I specify different analyzers for the query and index types when using
the Schema API?

Sent from Mail for Windows 10



Re: Meaning of "Index" flag under properties and schema

2021-02-17 Thread Alexandre Rafalovitch
I wonder if looking more directly at the indexes would allow you to
get closer to the problem source.

Have you tried comparing/exploring the indexes with Luke? It is in the
Lucene distribution (not Solr), and there is a small explanation here:
https://mocobeta.medium.com/luke-become-an-apache-lucene-module-as-of-lucene-8-1-7d139c998b2
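Besides the standalone Luke app, Solr also ships a Luke request handler that reports per-field index flags over HTTP. A minimal sketch of building such a request; the collection and field names are placeholders, and only the URL construction is shown (no network call):

```python
from urllib.parse import urlencode

# Hypothetical collection/field names; adjust for your deployment.
base = "http://localhost:8983/solr/mycollection/admin/luke"
params = {"fl": "myfield", "numTerms": 10, "wt": "json"}
url = base + "?" + urlencode(params)
print(url)
# Fetching this URL (e.g. with urllib.request) returns what the actual
# Lucene index says about the field -- comparable to the "Index" row
# discussed in this thread.
```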

Regards,
   Alex.

On Wed, 17 Feb 2021 at 16:58, Vivaldi  wrote:
>
> I was getting “illegal argument exception length must be >= 1” when I used 
> significantTerms streaming expression, from this collection and field. I 
> asked about that as a separate question on this list. I will get the whole 
> exception stack trace the next time I am at the customer site.
>
> Why doesn't any field in the other collections have that flag? We have
> numerous indexed, non-indexed, and docValues fields in other collections, but
> none of them has that row.
>
> Sent from my iPhone
>
> > On 16 Feb 2021, at 20:42, Shawn Heisey  wrote:
> >
> >> On 2/16/2021 9:16 AM, ufuk yılmaz wrote:
> >> I didn’t realise that, sorry. The table is like:
> >> Flags       Indexed  Tokenized  Stored  UnInvertible
> >> Properties  Yes      Yes        Yes     Yes
> >> Schema      Yes      Yes        Yes     Yes
> >> Index       Yes      Yes        Yes     NO
> >> The problematic collection has an Index row under the Schema row. No other
> >> collection has it. I was asking what the “Index” row meant.
> >
> > I am not completely sure, but I think that row means the field was found in 
> > the actual Lucene index.
> >
> > In the original message you mentioned "weird exceptions" but didn't include 
> > any information about them.  Can you give us those exceptions, and the 
> > requests that caused them?
> >
> > Thanks,
> > Shawn
>


Re: Meaning of "Index" flag under properties and schema

2021-02-17 Thread Vivaldi
I was getting “illegal argument exception length must be >= 1” when I used 
significantTerms streaming expression, from this collection and field. I asked 
about that as a separate question on this list. I will get the whole exception 
stack trace the next time I am at the customer site.

Why doesn't any field in the other collections have that flag? We have
numerous indexed, non-indexed, and docValues fields in other collections, but
none of them has that row.

Sent from my iPhone

> On 16 Feb 2021, at 20:42, Shawn Heisey  wrote:
> 
>> On 2/16/2021 9:16 AM, ufuk yılmaz wrote:
>> I didn’t realise that, sorry. The table is like:
>> Flags       Indexed  Tokenized  Stored  UnInvertible
>> Properties  Yes      Yes        Yes     Yes
>> Schema      Yes      Yes        Yes     Yes
>> Index       Yes      Yes        Yes     NO
>> The problematic collection has an Index row under the Schema row. No other
>> collection has it. I was asking what the “Index” row meant.
> 
> I am not completely sure, but I think that row means the field was found in 
> the actual Lucene index.
> 
> In the original message you mentioned "weird exceptions" but didn't include 
> any information about them.  Can you give us those exceptions, and the 
> requests that caused them?
> 
> Thanks,
> Shawn



Re: Meaning of "Index" flag under properties and schema

2021-02-16 Thread Shawn Heisey

On 2/16/2021 9:16 AM, ufuk yılmaz wrote:

I didn’t realise that, sorry. The table is like:

Flags       Indexed  Tokenized  Stored  UnInvertible
Properties  Yes      Yes        Yes     Yes
Schema      Yes      Yes        Yes     Yes
Index       Yes      Yes        Yes     NO

The problematic collection has an Index row under the Schema row. No other
collection has it. I was asking what the “Index” row meant.


I am not completely sure, but I think that row means the field was found 
in the actual Lucene index.


In the original message you mentioned "weird exceptions" but didn't 
include any information about them.  Can you give us those exceptions, 
and the requests that caused them?


Thanks,
Shawn


RE: Meaning of "Index" flag under properties and schema

2021-02-16 Thread ufuk yılmaz
I didn’t realise that, sorry. The table is like:

Flags       Indexed  Tokenized  Stored  UnInvertible
Properties  Yes      Yes        Yes     Yes
Schema      Yes      Yes        Yes     Yes
Index       Yes      Yes        Yes     NO

The problematic collection has an Index row under the Schema row. No other
collection has it. I was asking what the “Index” row meant.

-ufuk

Sent from Mail for Windows 10

From: Charlie Hull
Sent: 16 February 2021 18:48
To: solr-user@lucene.apache.org
Subject: Re: Meaning of "Index" flag under properties and schema

This list strips attachments so you'll have to figure out another way to 
show the difference,

Cheers

Charlie

On 16/02/2021 15:16, ufuk yılmaz wrote:
>
> There’s a collection at our customer’s site giving weird exceptions 
> when a particular field is involved (asked another question detailing 
> that).
>
> When I inspected it, there’s only one difference between it and other 
> dozens of fine working collections, which is,
>
> A text_general field in all other collections has the above 
> configuration without my artsy paint edits, but only that problematic 
> collection has an “index” flag with indexed tokenized and stored 
> checked. I never saw this “Index” flag before. What does it mean?
>
> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for 
> Windows 10
>

-- 
Charlie Hull - Managing Consultant at OpenSource Connections Limited 

Founding member of The Search Network <https://thesearchnetwork.com/> 
and co-author of Searching the Enterprise 
<https://opensourceconnections.com/about-us/books-resources/>
tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828



Re: Meaning of "Index" flag under properties and schema

2021-02-16 Thread Charlie Hull
This list strips attachments so you'll have to figure out another way to 
show the difference,


Cheers

Charlie

On 16/02/2021 15:16, ufuk yılmaz wrote:


There’s a collection at our customer’s site giving weird exceptions 
when a particular field is involved (asked another question detailing 
that).


When I inspected it, there’s only one difference between it and other 
dozens of fine working collections, which is,


A text_general field in all other collections has the above 
configuration without my artsy paint edits, but only that problematic 
collection has an “index” flag with indexed tokenized and stored 
checked. I never saw this “Index” flag before. What does it mean?


Sent from Mail for Windows 10




--
Charlie Hull - Managing Consultant at OpenSource Connections Limited 

Founding member of The Search Network and co-author of Searching the Enterprise

tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828


Meaning of "Index" flag under properties and schema

2021-02-16 Thread ufuk yılmaz

There’s a collection at our customer’s site giving weird exceptions when a 
particular field is involved (asked another question detailing that).

When I inspected it, there’s only one difference between it and the dozens of
other fine, working collections:

A text_general field in all other collections has the above configuration
without my artsy paint edits, but only the problematic collection has an
“Index” flag with indexed, tokenized and stored checked. I never saw this
“Index” flag before. What does it mean?




Sent from Mail for Windows 10



Atomic Update Failures with Nested Schema and Lazy Field Loading

2021-01-13 Thread Ronen Nussbaum
Hi,



I’ve encountered another issue that might be related to nested schema.

Not always, but many times atomic updates fail for some shards with the
message “TransactionLog doesn't know how to serialize class
org.apache.lucene.document.LazyDocument$LazyField”.

I checked both options:

   1. Set <enableLazyFieldLoading>false</enableLazyFieldLoading>.
   2. Set <enableLazyFieldLoading>true</enableLazyFieldLoading> but removed
   child documents.

In both cases atomic update worked without any errors.

This might suggest that there is an issue with this combination.
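For context, an atomic-update batch of the kind described might look like the sketch below: each input document carries only the id and a {"set": ...} modifier for the changed field. Field names here are hypothetical, not from the report.

```python
import json

# Sketch of an atomic-update batch: id plus a "set" modifier per changed
# field. Solr applies these against the stored document server-side.
docs = [
    {"id": "doc-1", "status_s": {"set": "reviewed"}},
    {"id": "doc-2", "status_s": {"set": "rejected"}},
]
body = json.dumps(docs)
print(body)
# POST body to /solr/<collection>/update, then send an explicit commit
# (as the original poster's client does).
```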



Thanks in advance,
Ronen.


Atomic Update Failures with Nested Schema and Lazy Field Loading

2021-01-05 Thread Nussbaum, Ronen
Hi,

I've encountered another issue that might be related to nested schema.
Not always, but many times atomic updates fail for some shards with the 
message "TransactionLog doesn't know how to serialize class 
org.apache.lucene.document.LazyDocument$LazyField".
The client retrieves documents, constructs bulk of input documents (id and 
changed field), adds the bulk and finally sends an explicit commit.
I checked both options:

  1.  Set <enableLazyFieldLoading>false</enableLazyFieldLoading>.
  2.  Set <enableLazyFieldLoading>true</enableLazyFieldLoading> but removed
child documents.
In both cases atomic update worked without any errors.
This might suggest that there is an issue with this combination.

Thanks in advance,
Ronen.



This electronic message may contain proprietary and confidential information of 
Verint Systems Inc., its affiliates and/or subsidiaries. The information is 
intended to be for the use of the individual(s) or entity(ies) named above. If 
you are not the intended recipient (or authorized to receive this e-mail for 
the intended recipient), you may not use, copy, disclose or distribute to 
anyone this message or any information contained in this message. If you have 
received this electronic message in error, please notify us by replying to this 
e-mail.


Atomic Update Failures with Nested Schema and Lazy Field Loading

2020-12-29 Thread Nussbaum, Ronen
Hi,

I've encountered another issue that might be related to nested schema.
Not always, but many times atomic updates fail for some shards with the 
message "TransactionLog doesn't know how to serialize class 
org.apache.lucene.document.LazyDocument$LazyField".
I checked both options:

  1.  Set <enableLazyFieldLoading>false</enableLazyFieldLoading>.
  2.  Set <enableLazyFieldLoading>true</enableLazyFieldLoading> but removed
child documents.
In both cases atomic update worked without any errors.
This might suggest that there is an issue with this combination.

Thanks in advance,
Ronen.





ManagedIndexSchema takes long for larger schema changes

2020-12-10 Thread Tiziano Degaetano
Hello,

I was checking why my initial schema change takes several minutes using the
Managed Schema API.
VisualVM shows that most of the time is spent in
ManagedIndexSchema.postReadInform.


Looking at the code shows that postReadInform is executed for every
modification and performs an inform on all fields.
In the end, inform is called (number of schema changes) × (number of fields)
times.

I prepared a PR that changes the flow to run postReadInform only once after the
changes are done:
improve speed of large schema changes for ManagedIndexSchema ·
tizianodeg/lucene-solr@54d2161 · GitHub
<https://github.com/tizianodeg/lucene-solr/commit/54d2161c8192c7f08e705d33f191b5cd9a087cd5>

This can dramatically decrease a managed schema change from several minutes to
about 1 second.

I’m not sure if setLatestSchema is the right place to do the final call to
postReadInform, and I am also unsure whether making postReadInform public is
acceptable.
How can I propose such an improvement, or should I open a bug report for this?
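For context, a large "initial schema change" of the kind described is typically one Schema API POST carrying many commands, e.g. the sketch below (field names are hypothetical). Per the analysis above, each command in such a batch still triggered a full postReadInform pass before the proposed change.

```python
import json

# Sketch of a bulk Schema API request: many add-field commands in one POST.
# Here the commands of the same type are expressed as a list under the
# command name; accepted shapes may vary by Solr version (repeating the
# command key is another form Solr's JSON parser allows).
fields = [f"attr_{i}_s" for i in range(100)]  # hypothetical field names
payload = {
    "add-field": [
        {"name": name, "type": "string", "stored": True} for name in fields
    ]
}
body = json.dumps(payload)
print(len(payload["add-field"]))  # number of commands in the single request
```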

Kind Regards,
Tiziano






Re: how do you manage your config and schema

2020-11-03 Thread matthew sporleder
Is there a more conservative starting point that is still up to date
than _default?

On Tue, Nov 3, 2020 at 11:13 AM matthew sporleder  wrote:
>
> So _default considered unsafe?  :)
>
> On Tue, Nov 3, 2020 at 11:08 AM Erick Erickson  
> wrote:
> >
> > The caution I would add is that you should be careful
> > that you don’t enable schemaless mode without understanding
> > the consequences in detail.
> >
> > There is, in fact, some discussion of removing schemaless entirely,
> > see:
> > https://issues.apache.org/jira/browse/SOLR-14701
> >
> > > Otherwise, I usually recommend that you take the stock configs and
> > overlay whatever customizations you’ve added in terms of
> > field definitions and the like.
> >
> > Do also be careful, some default field params have changed…
> >
> > Best,
> > Erick
> >
> > > On Nov 3, 2020, at 9:30 AM, matthew sporleder  
> > > wrote:
> > >
> > > Yesterday I realized that we have been carrying forward our configs
> > > since, probably, 4.x days.
> > >
> > > I ran a config set action=create (from _default) and saw files i
> > > didn't recognize, and a lot *fewer* things than I've been uploading
> > > for the last few years.
> > >
> > > Anyway my new plan is to just use _default and keep params.json,
> > > solrconfig.xml, and schema.xml in git and just use the defaults for
> > > the rest.  (modulo synonyms/etc)
> > >
> > > Did everyone move on to managed schema and use some kind of
> > > intermediate format to upload?
> > >
> > > I'm just looking for updated best practices and a little survey of usage 
> > > trends.
> > >
> > > Thanks,
> > > Matt
> >


Re: how do you manage your config and schema

2020-11-03 Thread matthew sporleder
So _default considered unsafe?  :)

On Tue, Nov 3, 2020 at 11:08 AM Erick Erickson  wrote:
>
> The caution I would add is that you should be careful
> that you don’t enable schemaless mode without understanding
> the consequences in detail.
>
> There is, in fact, some discussion of removing schemaless entirely,
> see:
> https://issues.apache.org/jira/browse/SOLR-14701
>
> Otherwise, I usually recommend that you take the stock configs and
> overlay whatever customizations you’ve added in terms of
> field definitions and the like.
>
> Do also be careful, some default field params have changed…
>
> Best,
> Erick
>
> > On Nov 3, 2020, at 9:30 AM, matthew sporleder  wrote:
> >
> > Yesterday I realized that we have been carrying forward our configs
> > since, probably, 4.x days.
> >
> > I ran a config set action=create (from _default) and saw files i
> > didn't recognize, and a lot *fewer* things than I've been uploading
> > for the last few years.
> >
> > Anyway my new plan is to just use _default and keep params.json,
> > solrconfig.xml, and schema.xml in git and just use the defaults for
> > the rest.  (modulo synonyms/etc)
> >
> > Did everyone move on to managed schema and use some kind of
> > intermediate format to upload?
> >
> > I'm just looking for updated best practices and a little survey of usage 
> > trends.
> >
> > Thanks,
> > Matt
>


Re: how do you manage your config and schema

2020-11-03 Thread Erick Erickson
The caution I would add is that you should be careful 
that you don’t enable schemaless mode without understanding 
the consequences in detail.

There is, in fact, some discussion of removing schemaless entirely, 
see:
https://issues.apache.org/jira/browse/SOLR-14701

Otherwise, I usually recommend that you take the stock configs and
overlay whatever customizations you’ve added in terms of
field definitions and the like.

Do also be careful, some default field params have changed…

Best,
Erick

> On Nov 3, 2020, at 9:30 AM, matthew sporleder  wrote:
> 
> Yesterday I realized that we have been carrying forward our configs
> since, probably, 4.x days.
> 
> I ran a config set action=create (from _default) and saw files i
> didn't recognize, and a lot *fewer* things than I've been uploading
> for the last few years.
> 
> Anyway my new plan is to just use _default and keep params.json,
> solrconfig.xml, and schema.xml in git and just use the defaults for
> the rest.  (modulo synonyms/etc)
> 
> Did everyone move on to managed schema and use some kind of
> intermediate format to upload?
> 
> I'm just looking for updated best practices and a little survey of usage 
> trends.
> 
> Thanks,
> Matt



how do you manage your config and schema

2020-11-03 Thread matthew sporleder
Yesterday I realized that we have been carrying forward our configs
since, probably, 4.x days.

I ran a config set action=create (from _default) and saw files i
didn't recognize, and a lot *fewer* things than I've been uploading
for the last few years.

Anyway my new plan is to just use _default and keep params.json,
solrconfig.xml, and schema.xml in git and just use the defaults for
the rest.  (modulo synonyms/etc)

Did everyone move on to managed schema and use some kind of
intermediate format to upload?

I'm just looking for updated best practices and a little survey of usage trends.

Thanks,
Matt
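One way to script the plan above: keep the handful of files in git and zip them for the Configsets UPLOAD API, which expects the configset files at the root of the zip. A sketch with in-memory placeholder contents; the config name and endpoint (/solr/admin/configs?action=UPLOAD&name=...) are standard, the file contents here are not real.

```python
import io
import zipfile

# The files Matt keeps in git; contents are placeholders for illustration.
files = {
    "params.json": b"{}",
    "solrconfig.xml": b"<config/>",
    "schema.xml": b"<schema name='example' version='1.6'/>",
}

# Build the zip in memory, with every file at the zip root as the
# Configsets UPLOAD action expects.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    for name, data in files.items():
        zf.writestr(name, data)

zip_bytes = buf.getvalue()
# POST zip_bytes to /solr/admin/configs?action=UPLOAD&name=myconfig
print(len(zip_bytes))
```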


Re: Solr Schema API seems broken to me after 8.2.0

2020-09-09 Thread jeanc...@gmail.com
Thanks for the reply,

I didn't see anything in the Solr logs BUT I'm going to recheck it next
week and update you.
Will check this as well:
* It could be that after the upgrade some filesystem permissions do not
work anymore *

Thanks

Best Regards,

*Jean Silva*


https://github.com/jeancsil

https://linkedin.com/in/jeancsil



On Tue, Sep 8, 2020 at 9:39 AM Jörn Franke  wrote:

> Can you check the logfiles of Solr?
>
> It could be that after the upgrade some filesystem permissions do not work
> anymore
>
> > On 08.09.2020 at 09:27, "jeanc...@gmail.com"  > wrote:
> >
> > Hey guys, good morning.
> >
> > As I didn't get any reply for this one, is it ok then that I create the
> > Jira ticket?
> >
> > Best Regards,
> >
> > *Jean Silva*
> >
> >
> > https://github.com/jeancsil
> >
> > https://linkedin.com/in/jeancsil
> >
> >
> >
> >> On Fri, Aug 28, 2020 at 11:10 AM jeanc...@gmail.com  >
> >> wrote:
> >>
> >> Hey everybody,
> >>
> >> First of all, I wanted to say that this is my first time writing here. I
> >> hope I don't do anything wrong.
> >> I went to create the "bug" ticket and saw it would be a good idea to
> first
> >> talk to some of you via IRC (didn't work for me or I did something wrong
> >> after 20 years of not using it..)
> >>
> >> I'm currently using Solr 8.1.1 in production and I use the Schema API to
> >> create the necessary fields before starting to index my new data.
> (Reason,
> >> the managed-schema would be big for me to take care of and I decided to
> >> automate this process by using the REST API).
> >>
> >> I started trying to upgrade* from 8.1.1* directly to *8.6.1* and the
> >> python script I use to add some fields and analyzers started to *kill
> >> solr after some successful processes to finish* without issues.
> >>
> >> *Let's put it simply: the fields whose names contain the word
> >> "blablabla" need to be deleted and then recreated. I have ~33 of
> >> them.*
> >>
> >> The script works as expected but after some successful creations it
> kills
> >> Solr!
> >>
> >> This script was implemented in python and I thought that I might have
> done
> >> something that doesn't work with Solr 8.6.1 anymore and decided to test
> it
> >> with the *proper implementation of the library in Java*, SolrJ 8.6.1 as
> >> well. The same error occurred. I also didn't see any change in the
> >> documentation with regards to the request I was making.
> >>
> >> Unfortunately I don't have any stacktrace from Solr as there were no
> >> errors popping up in the console for me. The only thing I see was the
> >> output of my script, saying that the *Remote closed connection without
> >> response*:
> >> ...
> >> Traceback (most recent call last):
> >>  File
> "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
> >> line 677, in urlopen
> >>chunked=chunked,
> >>  File
> "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
> >> line 426, in _make_request
> >>six.raise_from(e, None)
> >>  File "<string>", line 3, in raise_from
> >>  File
> "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
> >> line 421, in _make_request
> >>httplib_response = conn.getresponse()
> >>  File "/usr/lib/python3.7/http/client.py", line 1336, in getresponse
> >>response.begin()
> >>  File "/usr/lib/python3.7/http/client.py", line 306, in begin
> >>version, status, reason = self._read_status()
> >>  File "/usr/lib/python3.7/http/client.py", line 275, in _read_status
> >>raise RemoteDisconnected("Remote end closed connection without"
> >> http.client.RemoteDisconnected: Remote end closed connection without
> >> response
> >>
> >>
> >> With *Java and SolrJ* matching the Solr version I was using, I got this:
> >>
> >> Deleting field field_name_1
> >> {responseHeader={status=0,QTime=2187}}
> >>
> >> Deleting field field_name_2
> >> {responseHeader={status=0,QTime=1571}}
> >>
> >> Deleting field field_name_3
> >> {responseHeader={status=0,QTime=1587}}
> >>
> >> Deleting field field_name_4
> >> Exception while deleting the field field_name_4:* IOEx

Re: Solr Schema API seems broken to me after 8.2.0

2020-09-08 Thread Jörn Franke
Can you check the logfiles of Solr?

It could be that after the upgrade some filesystem permissions do not work 
anymore 

> On 08.09.2020 at 09:27, "jeanc...@gmail.com" wrote:
> 
> Hey guys, good morning.
> 
> As I didn't get any reply for this one, is it ok then that I create the
> Jira ticket?
> 
> Best Regards,
> 
> *Jean Silva*
> 
> 
> https://github.com/jeancsil
> 
> https://linkedin.com/in/jeancsil
> 
> 
> 
>> On Fri, Aug 28, 2020 at 11:10 AM jeanc...@gmail.com 
>> wrote:
>> 
>> Hey everybody,
>> 
>> First of all, I wanted to say that this is my first time writing here. I
>> hope I don't do anything wrong.
>> I went to create the "bug" ticket and saw it would be a good idea to first
>> talk to some of you via IRC (didn't work for me or I did something wrong
>> after 20 years of not using it..)
>> 
>> I'm currently using Solr 8.1.1 in production and I use the Schema API to
>> create the necessary fields before starting to index my new data. (Reason,
>> the managed-schema would be big for me to take care of and I decided to
>> automate this process by using the REST API).
>> 
>> I started trying to upgrade* from 8.1.1* directly to *8.6.1* and the
>> python script I use to add some fields and analyzers started to *kill
>> solr after some successful processes to finish* without issues.
>> 
>> *Let's put it simply: the fields whose names contain the word "blablabla"
>> need to be deleted and then recreated. I have ~33 of them.*
>> 
>> The script works as expected but after some successful creations it kills
>> Solr!
>> 
>> This script was implemented in python and I thought that I might have done
>> something that doesn't work with Solr 8.6.1 anymore and decided to test it
>> with the *proper implementation of the library in Java*, SolrJ 8.6.1 as
>> well. The same error occurred. I also didn't see any change in the
>> documentation with regards to the request I was making.
>> 
>> Unfortunately I don't have any stacktrace from Solr as there were no
>> errors popping up in the console for me. The only thing I see was the
>> output of my script, saying that the *Remote closed connection without
>> response*:
>> ...
>> Traceback (most recent call last):
>>  File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
>> line 677, in urlopen
>>chunked=chunked,
>>  File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
>> line 426, in _make_request
>>six.raise_from(e, None)
>>  File "<string>", line 3, in raise_from
>>  File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
>> line 421, in _make_request
>>httplib_response = conn.getresponse()
>>  File "/usr/lib/python3.7/http/client.py", line 1336, in getresponse
>>response.begin()
>>  File "/usr/lib/python3.7/http/client.py", line 306, in begin
>>version, status, reason = self._read_status()
>>  File "/usr/lib/python3.7/http/client.py", line 275, in _read_status
>>raise RemoteDisconnected("Remote end closed connection without"
>> http.client.RemoteDisconnected: Remote end closed connection without
>> response
>> 
>> 
>> With *Java and SolrJ* matching the Solr version I was using, I got this:
>> 
>> Deleting field field_name_1
>> {responseHeader={status=0,QTime=2187}}
>> 
>> Deleting field field_name_2
>> {responseHeader={status=0,QTime=1571}}
>> 
>> Deleting field field_name_3
>> {responseHeader={status=0,QTime=1587}}
>> 
>> Deleting field field_name_4
>> Exception while deleting the field field_name_4:* IOException occurred
>> when talking to server at: http://localhost:32783/solr/my_core_name
>> <http://localhost:32783/solr/my_core_name>*
>> 
>> Deleting field field_name_5
>> Exception while deleting the field field_name_5:* IOException occurred
>> when talking to server at: http://localhost:32783/solr/my_core_name
>> <http://localhost:32783/solr/my_core_name>*
>> // THIS REPEATS 30+ TIMES AND THEN THE MESSAGE CHANGES A BIT
>> 
>> Exception while deleting the field field_name_6:* Server refused
>> connection at:
>> http://localhost:32783/solr/my_core_name/schema?wt=javabin&version=2
>> <http://localhost:32783/solr/my_core_name/schema?wt=javabin&version=2>*
>> Deleting field field_name_6
>> // REPEATS ALSO MANY TIMES
>> 
>> Maybe I need to run the same thing again with some different configuration
>> to help give you guys a hint on what the problem is?
>> 
>> To finalize, I started to go "back in time" to see when this happened and
>> realized that *I can only upgrade from 8.1.1 to 8.2.1* without this
>> error happening. (I'm using the docker images from here, btw:
>> https://github.com/docker-solr/docker-solr)
>> 
>> Thank you very much and I hope I can also help with this if it's really a
>> bug.
>> 
>> Best Regards,
>> 
>> *Jean Silva*
>> 
>> 
>> https://github.com/jeancsil
>> 
>> https://linkedin.com/in/jeancsil
>> 
>> 
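The delete-then-recreate step described in this thread can also be expressed as one batched Schema API request instead of ~33 separate POSTs, which may help narrow down where the connection drops. A sketch: the "blablabla" substring is from the report, the field list is fabricated for illustration, and the list-of-commands shape is one form Solr's Schema API accepts (verify against your version).

```python
import json

# Names as returned by GET /solr/<core>/schema/fields -- simulated here
# with made-up entries for illustration.
existing = [
    {"name": "title"},
    {"name": "blablabla_one"},
    {"name": "blablabla_two"},
]

to_delete = [f["name"] for f in existing if "blablabla" in f["name"]]

# One request carrying all delete-field commands rather than one POST per
# field; whether this avoids the 8.2+ connection drops is untested.
payload = {"delete-field": [{"name": n} for n in to_delete]}
body = json.dumps(payload)
print(body)
```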


Re: Solr Schema API seems broken to me after 8.2.0

2020-09-08 Thread jeanc...@gmail.com
Hey guys, good morning.

As I didn't get any reply for this one, is it ok then that I create the
Jira ticket?

Best Regards,

*Jean Silva*


https://github.com/jeancsil

https://linkedin.com/in/jeancsil



On Fri, Aug 28, 2020 at 11:10 AM jeanc...@gmail.com 
wrote:

> Hey everybody,
>
> First of all, I wanted to say that this is my first time writing here. I
> hope I don't do anything wrong.
> I went to create the "bug" ticket and saw it would be a good idea to first
> talk to some of you via IRC (didn't work for me or I did something wrong
> after 20 years of not using it..)
>
> I'm currently using Solr 8.1.1 in production and I use the Schema API to
> create the necessary fields before starting to index my new data. (Reason:
> the managed-schema would be too big for me to maintain by hand, so I decided
> to automate this process by using the REST API.)
>
> I started trying to upgrade* from 8.1.1* directly to *8.6.1*, and the
> python script I use to add some fields and analyzers started to *kill
> Solr after a few successful requests* without issues.
>
> *Put simply: the fields whose names contain the word "blablabla" need to be
> deleted and then recreated. I have ~33 of
> them.*
>
> The script works as expected but after some successful creations it kills
> Solr!
>
> This script was implemented in python and I thought that I might have done
> something that doesn't work with Solr 8.6.1 anymore and decided to test it
> with the *proper implementation of the library in Java*, SolrJ 8.6.1 as
> well. The same error occurred. I also didn't see any change in the
> documentation with regards to the request I was making.
>
> Unfortunately I don't have any stacktrace from Solr as there were no
> errors popping up in the console for me. The only thing I saw was the
> output of my script, saying that the *Remote end closed connection without
> response*:
> ...
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
> line 677, in urlopen
> chunked=chunked,
>   File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
> line 426, in _make_request
> six.raise_from(e, None)
>   File "", line 3, in raise_from
>   File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
> line 421, in _make_request
> httplib_response = conn.getresponse()
>   File "/usr/lib/python3.7/http/client.py", line 1336, in getresponse
> response.begin()
>   File "/usr/lib/python3.7/http/client.py", line 306, in begin
> version, status, reason = self._read_status()
>   File "/usr/lib/python3.7/http/client.py", line 275, in _read_status
> raise RemoteDisconnected("Remote end closed connection without"
> http.client.RemoteDisconnected: Remote end closed connection without
> response
>
>
> With *Java and SolrJ* matching the Solr version I was using, I got this:
>
> Deleting field field_name_1
> {responseHeader={status=0,QTime=2187}}
>
> Deleting field field_name_2
> {responseHeader={status=0,QTime=1571}}
>
> Deleting field field_name_3
> {responseHeader={status=0,QTime=1587}}
>
> Deleting field field_name_4
> Exception while deleting the field field_name_4:* IOException occurred
> when talking to server at: http://localhost:32783/solr/my_core_name
> <http://localhost:32783/solr/my_core_name>*
>
> Deleting field field_name_5
> Exception while deleting the field field_name_5:* IOException occurred
> when talking to server at: http://localhost:32783/solr/my_core_name
> <http://localhost:32783/solr/my_core_name>*
> // THIS REPEATS 30+ TIMES AND THEN THE MESSAGE CHANGES A BIT
>
> Exception while deleting the field field_name_6:* Server refused
> connection at:
> http://localhost:32783/solr/my_core_name/schema?wt=javabin&version=2
> <http://localhost:32783/solr/my_core_name/schema?wt=javabin&version=2>*
> Deleting field field_name_6
> // REPEATS ALSO MANY TIMES
>
> Maybe I need to run the same thing again with some different configuration
> to help give you guys a hint on what the problem is?
>
> To finalize, I started to go "back in time" to see when this happened and
> realized that *I can only upgrade from 8.1.1 to 8.2.1* without this
> error happening. (I'm using the docker images in here btw:
> https://github.com/docker-solr/docker-solr)
>
> Thank you very much and I hope I can also help with this if it's really a
> bug.
>
> Best Regards,
>
> *Jean Silva*
>
>
> https://github.com/jeancsil
>
> https://linkedin.com/in/jeancsil
>
>


Solr Schema API seems broken to me after 8.2.0

2020-08-28 Thread jeanc...@gmail.com
Hey everybody,

First of all, I wanted to say that this is my first time writing here. I
hope I don't do anything wrong.
I went to create the "bug" ticket and saw it would be a good idea to first
talk to some of you via IRC (didn't work for me or I did something wrong
after 20 years of not using it..)

I'm currently using Solr 8.1.1 in production and I use the Schema API to
create the necessary fields before starting to index my new data. (Reason:
the managed-schema would be too big for me to maintain by hand, so I decided to
automate this process by using the REST API.)

I started trying to upgrade* from 8.1.1* directly to *8.6.1*, and the python
script I use to add some fields and analyzers started to *kill Solr after
a few successful requests* without issues.

*Put simply: the fields whose names contain the word "blablabla" need to be
deleted and then recreated. I have ~33 of
them.*

The script works as expected but after some successful creations it kills
Solr!
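For reference, the delete-then-recreate loop described above can be sketched against the Schema API like this (a hedged sketch only: the core name, port, and field names are placeholders, not the poster's actual setup):

```python
import json

# Hypothetical sketch of the delete/recreate loop; placeholders throughout.
SCHEMA_URL = "http://localhost:8983/solr/my_core_name/schema"

def delete_field_command(name):
    # The Schema API "delete-field" command takes only the field name.
    return {"delete-field": {"name": name}}

def add_field_command(name, ftype="text_general"):
    # Recreating the field afterwards; the type and flags are illustrative.
    return {"add-field": {"name": name, "type": ftype,
                          "indexed": True, "stored": True}}

fields = [f"blablabla_{i}" for i in range(33)]  # ~33 fields, as in the report
payloads = [json.dumps(delete_field_command(f)) for f in fields]

# Each payload would be POSTed to SCHEMA_URL with
# Content-Type: application/json (via urllib.request, curl, or SolrJ).
print(payloads[0])
```

Each command is an independent POST, which matches the one-request-per-field pattern the failing script apparently used.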

This script was implemented in python and I thought that I might have done
something that doesn't work with Solr 8.6.1 anymore and decided to test it
with the *proper implementation of the library in Java*, SolrJ 8.6.1 as
well. The same error occurred. I also didn't see any change in the
documentation with regards to the request I was making.

Unfortunately I don't have any stacktrace from Solr as there were no errors
popping up in the console for me. The only thing I saw was the output of my
script, saying that the *Remote end closed connection without response*:
...
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
line 677, in urlopen
chunked=chunked,
  File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
line 426, in _make_request
six.raise_from(e, None)
  File "", line 3, in raise_from
  File "/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py",
line 421, in _make_request
httplib_response = conn.getresponse()
  File "/usr/lib/python3.7/http/client.py", line 1336, in getresponse
response.begin()
  File "/usr/lib/python3.7/http/client.py", line 306, in begin
version, status, reason = self._read_status()
  File "/usr/lib/python3.7/http/client.py", line 275, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without
response


With *Java and SolrJ* matching the Solr version I was using, I got this:

Deleting field field_name_1
{responseHeader={status=0,QTime=2187}}

Deleting field field_name_2
{responseHeader={status=0,QTime=1571}}

Deleting field field_name_3
{responseHeader={status=0,QTime=1587}}

Deleting field field_name_4
Exception while deleting the field field_name_4:* IOException occurred when
talking to server at: http://localhost:32783/solr/my_core_name
<http://localhost:32783/solr/my_core_name>*

Deleting field field_name_5
Exception while deleting the field field_name_5:* IOException occurred when
talking to server at: http://localhost:32783/solr/my_core_name
<http://localhost:32783/solr/my_core_name>*
// THIS REPEATS 30+ TIMES AND THEN THE MESSAGE CHANGES A BIT

Exception while deleting the field field_name_6:* Server refused connection
at: http://localhost:32783/solr/my_core_name/schema?wt=javabin&version=2
<http://localhost:32783/solr/my_core_name/schema?wt=javabin&version=2>*
Deleting field field_name_6
// REPEATS ALSO MANY TIMES

Maybe I need to run the same thing again with some different configuration
to help give you guys a hint on what the problem is?

To finalize, I started to go "back in time" to see when this happened and
realized that *I can only upgrade from 8.1.1 to 8.2.1* without this error
happening. (I'm using the docker images in here btw:
https://github.com/docker-solr/docker-solr)

Thank you very much and I hope I can also help with this if it's really a
bug.

Best Regards,

*Jean Silva*


https://github.com/jeancsil

https://linkedin.com/in/jeancsil


Re: Why External File Field is marked as indexed in solr admin SCHEMA page?

2020-08-10 Thread raj.yadav
Hi Chris,


Chris Hostetter-3 wrote
> ...ExternalFileField is "special" and as noted in its docs it is not 
> searchable -- it doesn't actually care what the indexed (or "stored") 
> properties are ... but the default values of those properties as assigned 
> by the schema defaults are still there in the metadata of the field -- 
> which is what the schema API/browser are showing you.

As you mentioned above, the `stored` parameter will also be ignored
(i.e. it doesn't matter whether it's marked as false or true). So when we
retrieve the external fields using `fl = field(external_field_name)`,
Solr will always retrieve the field value from the external file.


Regards,
Raj





--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Why External File Field is marked as indexed in solr admin SCHEMA page?

2020-07-22 Thread raj.yadav
Chris Hostetter-3 wrote
> : *
>  : class="solr.ExternalFileField" valType="float"/>
> *
> : 
> : *
> 
> *
>   ...
> : I was expecting that for field "fieldA" indexed will be marked as false
> and
> : it will not be part of the index. But Solr admin "SCHEMA page" (we get
> this
> : option after selecting collection name in the drop-down menu)  is
> showing
> : it as an indexed field (green tick mark under Indexed flag).
> 
> Because, per the docs, the IndexSchema uses a default assumption of "true" 
> for the "indexed" property (if not specified at a field/fieldtype level) 
> ...
> 
> https://lucene.apache.org/solr/guide/8_4/field-type-definitions-and-properties.html#field-default-properties
> 
> Property: indexed
> Description: If true, the value of the field can be used in queries to
> retrieve matching documents.
> Values: true or false 
> Implicit Default: true
> 
> ...ExternalFileField is "special" and as noted in its docs it is not 
> searchable -- it doesn't actually care what the indexed (or "stored") 
> properties are ... but the default values of those properties as assigned 
> by the schema defaults are still there in the metadata of the field -- 
> which is what the schema API/browser are showing you.
> 
> 
> Imagine you had a field that was a TextField -- implicitly 
> indexed="true" -- but it was impossible for you to ever put any values 
> in that field (say for the sake of argument you used an analyzer that 
> threw away all terms).  The schema browser would say: "It's (implicitly) 
> marked indexed=true, therefore it's searchable" even though searching on
> that field would never return anything ... equivalent situation with 
> ExternalFileField.
> 
> (ExternalFileField could be modified to override the implicit default for 
> these properties, but that's not something anyone has ever really worried 
> about because it wouldn't functionally change any of its behavior)
> 
> 
> -Hoss
> http://www.lucidworks.com/

Thanks Chris.






Re: Why External File Field is marked as indexed in solr admin SCHEMA page?

2020-07-22 Thread Chris Hostetter
: **
: 
: **
...
: I was expecting that for field "fieldA" indexed will be marked as false and
: it will not be part of the index. But Solr admin "SCHEMA page" (we get this
: option after selecting collection name in the drop-down menu)  is showing
: it as an indexed field (green tick mark under Indexed flag).

Because, per the docs, the IndexSchema uses a default assumption of "true" 
for the "indexed" property (if not specified at a field/fieldtype level) 
...

https://lucene.apache.org/solr/guide/8_4/field-type-definitions-and-properties.html#field-default-properties

Property: indexed
Description: If true, the value of the field can be used in queries to retrieve 
matching documents.
Values: true or false   
Implicit Default: true

...ExternalFileField is "special" and as noted in its docs it is not 
searchable -- it doesn't actually care what the indexed (or "stored") 
properties are ... but the default values of those properties as assigned 
by the schema defaults are still there in the metadata of the field -- 
which is what the schema API/browser are showing you.


Imagine you had a field that was a TextField -- implicitly 
indexed="true" -- but it was impossible for you to ever put any values 
in that field (say for the sake of argument you used an analyzer that 
threw away all terms).  The schema browser would say: "It's (implicitly) 
marked indexed=true, therefore it's searchable" even though searching on that 
field would never return anything ... equivalent situation with 
ExternalFileField.

(ExternalFileField could be modified to override the implicit default for 
these properties, but that's not something anyone has ever really worried 
about because it wouldn't functionally change any of its behavior)


-Hoss
http://www.lucidworks.com/


RE: Why External File Field is marked as indexed in solr admin SCHEMA page?

2020-07-22 Thread raj.yadav
Vadim Ivanov wrote
> Hello, Raj
> 
> I've just checked my Schema page for external file field
> 
> Solr version 8.3.1 gives only such parameters for externalFileField:
> 
> 
> Field: fff
> 
> Field-Type:
> 
> org.apache.solr.schema.ExternalFileField
> 
> 
> Flags:
> 
> UnInvertible
> 
> Omit Term Frequencies & Positions
> 
> 
> Properties
> 
> √
> 
> √
> 
> 
> Are you sure you don’t have (or had) fieldA in the main collection schema?
> 
>  
> 
> externalFileField is not part of the index. It resides in separate file in
> Solr index directory and goes into memory every commit.

Hi Vadim Ivanov,

Earlier, the fieldType and field I shared were from solr_5.4.

I have cross-checked the same thing in solr_8.5.2. I have created the following two
fieldTypes and field.
 








Since in fieldType `ext_file_fieldA` I explicitly specified the indexed and
stored parameters, I'm getting the expected result on the Solr SCHEMA
page. (PFA image file: fieldA_schema)

In fieldType `ext_file_fieldB` I did not specify the indexed and
stored parameters. I was expecting that the indexed parameter would be false
by default, but on the Solr SCHEMA page the indexed flag is marked green √ (PFA
image file: fieldB_schema)


Please find attached files.

Regards,
Raj <https://lucene.472066.n3.nabble.com/file/t495721/fieldA_schema.png> 
<https://lucene.472066.n3.nabble.com/file/t495721/fieldB_schema.png> 





RE: Why External File Field is marked as indexed in solr admin SCHEMA page?

2020-07-22 Thread Vadim Ivanov
Hello, Raj

I've just checked my Schema page for external file field

Solr version 8.3.1 gives only such parameters for externalFileField:


Field: fff

Field-Type:

org.apache.solr.schema.ExternalFileField


Flags:

UnInvertible

Omit Term Frequencies & Positions


Properties

√

√


Are you sure you don’t have (or had) fieldA in the main collection schema?

 

externalFileField is not part of the index. It resides in separate file in Solr 
index directory and goes into memory every commit.
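Since an ExternalFileField value is usable only in function queries or for display, a request for it looks roughly like this (a sketch; `fff` is the field name from the example above, and the collection/URL part is omitted):

```python
from urllib.parse import urlencode

# ExternalFileField values are not searchable; they are pulled in via the
# field() function, either for display in fl or for sorting/boosting.
params = {
    "q": "*:*",
    "fl": "id,field(fff)",      # show the external value alongside each doc
    "sort": "field(fff) desc",  # or sort by it as a function
    "wt": "json",
}
query_string = urlencode(params)
print(query_string)
```

The resulting query string would be appended to the collection's `/select` handler URL.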

 

> -Original Message-

> From: Raj Yadav [mailto:rajkum...@cse.ism.ac.in]

> Sent: Wednesday, July 22, 2020 3:09 PM

> To: solr-user@lucene.apache.org

> Subject: Why External File Field is marked as indexed in solr admin SCHEMA

> page?

> 

> We have following external file field type and field:

> 

> * class="solr.ExternalFileField" valType="float"/>*

> 

> **

> 

> In the Solr official documentation it is mentioned that:

> *"*The ExternalFileField type makes it possible to specify the values for a

> field in a file outside the Solr index. *External fields are not searchable. 
> They

> can be used only for function queries or display."*

> 

> I was expecting that for field "fieldA" indexed will be marked as false and it

> will not be part of the index. But Solr admin "SCHEMA page" (we get this

> option after selecting collection name in the drop-down menu)  is showing it

> as an indexed field (green tick mark under Indexed flag).

> 

> We have not explicitly specified indexed=false for this external field in our

> schema. Wanted to know whether this field is really part of the index.

> Or it is just a bug from the admin UI side.

> 

> Regards,

> Raj



Why External File Field is marked as indexed in solr admin SCHEMA page?

2020-07-22 Thread Raj Yadav
We have following external file field type and field:

**

**

In the Solr official documentation it is mentioned that:
*"*The ExternalFileField type makes it possible to specify the values for a
field in a file outside the Solr index. *External fields are not
searchable. They can be used only for function queries or display."*

I was expecting that for field "fieldA" indexed will be marked as false and
it will not be part of the index. But Solr admin "SCHEMA page" (we get this
option after selecting collection name in the drop-down menu)  is showing
it as an indexed field (green tick mark under Indexed flag).

We have not explicitly specified indexed=false for this external field in
our schema. Wanted to know whether this field is really part of the index.
Or it is just a bug from the admin UI side.

Regards,
Raj


solr query to return matched text to regex with default schema

2020-07-07 Thread Phillip Wu
Hi,
I want to search Solr for server names in a set of Microsoft Word documents, 
PDFs, and image files like jpg, gif.
Server names are given by the regular expressions (regex):
INFP[a-zA-Z0-9]{3,9}
TRKP[a-zA-Z0-9]{3,9}
PLCP[a-zA-Z0-9]{3,9}
SQRP[a-zA-Z0-9]{3,9}


Problem
===
I want to get the text in the documents matching the regex, e.g. INFPWSV01, 
PLCPLDB01
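The matching itself can be sanity-checked outside Solr first (a sketch; note it uses `[a-zA-Z]`, since `[a-zA-z]` would also match the punctuation characters sitting between `Z` and `a` in ASCII):

```python
import re

# One alternation over the four prefixes, with the {3,9} alphanumeric
# suffix from the patterns above.
SERVER_RE = re.compile(r"\b(?:INFP|TRKP|PLCP|SQRP)[a-zA-Z0-9]{3,9}\b")

text = "Failover moved INFPWSV01 traffic to PLCPLDB01 last night."
print(SERVER_RE.findall(text))  # -> ['INFPWSV01', 'PLCPLDB01']
```

The same pattern could then be applied to whatever stored/highlighted text Solr returns.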

I've indexed the files using Solr/Tika/Tesseract with the default schema.

I've used the highlight search tool
hl ticked
hl.usePhraseHighlighter ticked

Solr only returns the metadata (presumably) like filename for the file 
containing the pattern(s).

Questions
=
1. Would I have to modify the managed schema?
2. If so would I have to save the file content in the schema
3. If so is this the way to do it:
a. solrconfig.xml <- inside my "core"


true
ignored_
_text_

...
b. Remove the line
ignored_
as I want the metadata
c. Change this in the managed schema, setting _text_
"stored" to "true":
curl -X POST -H 'Content-type:application/json' --data-binary '{
  "replace-field":{
 "name":"_text_",
 "type":"text_general",
 "multiValued":true,
 "indexed":true,
 "stored":true }
}' http://localhost:8983/api/cores/gettingstarted/schema









Re: 404 response from Schema API

2020-05-15 Thread Mark H. Wood
On Thu, May 14, 2020 at 02:47:57PM -0600, Shawn Heisey wrote:
> On 5/14/2020 1:13 PM, Mark H. Wood wrote:
> > On Fri, Apr 17, 2020 at 10:11:40AM -0600, Shawn Heisey wrote:
> >> On 4/16/2020 10:07 AM, Mark H. Wood wrote:
> >>> I need to ask Solr 4.10 for the name of the unique key field of a
> >>> schema.  So far, no matter what I've done, Solr is returning a 404.
> 
> The Luke Request Handler, normally assigned to the /admin/luke path, 
> will give you the info you're after.  On a stock Solr install, the 
> following URL would work:
> 
> /solr/admin/luke?show=schema
> 
> I have tried this on solr 4.10.4 and can confirm that the response does 
> have the information.

Thank you, for the information and especially for taking the time to test.

> Since you are working with a different context path, you'll need to 
> adjust your URL to match.
> 
> Note that as of Solr 5.0, running with a different context path is not 
> supported.  The admin UI and the more advanced parts of the startup 
> scripts are hardcoded for the /solr context.

Yes.  5.0+ isn't packaged to be run in Tomcat, as we do now, so Big
Changes are coming when we upgrade.

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Re: 404 response from Schema API

2020-05-14 Thread Shawn Heisey

On 5/14/2020 1:13 PM, Mark H. Wood wrote:

On Fri, Apr 17, 2020 at 10:11:40AM -0600, Shawn Heisey wrote:

On 4/16/2020 10:07 AM, Mark H. Wood wrote:

I need to ask Solr 4.10 for the name of the unique key field of a
schema.  So far, no matter what I've done, Solr is returning a 404.


The Luke Request Handler, normally assigned to the /admin/luke path, 
will give you the info you're after.  On a stock Solr install, the 
following URL would work:


/solr/admin/luke?show=schema

I have tried this on solr 4.10.4 and can confirm that the response does 
have the information.
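On 4.x, where the Schema API handler does not exist, extracting the unique key from the Luke response would look roughly like this (the trimmed response shape below is an assumption based on `/admin/luke?show=schema` output, not a captured response):

```python
import json

# Trimmed, assumed shape of a /admin/luke?show=schema&wt=json response.
luke_response = json.loads("""
{
  "responseHeader": {"status": 0},
  "schema": {"uniqueKeyField": "id"}
}
""")

# The unique key field name sits under the "schema" section.
unique_key = luke_response.get("schema", {}).get("uniqueKeyField")
print(unique_key)  # -> id
```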


Since you are working with a different context path, you'll need to 
adjust your URL to match.


Note that as of Solr 5.0, running with a different context path is not 
supported.  The admin UI and the more advanced parts of the startup 
scripts are hardcoded for the /solr context.


Thanks,
Shawn


Re: 404 response from Schema API

2020-05-14 Thread Mark H. Wood
On Thu, May 14, 2020 at 03:13:07PM -0400, Mark H. Wood wrote:
> Anyway, I'll be reading up on how to upgrade to 5.  (Hopefully not
> farther, just yet -- changes between, I think, 5 and 6 mean I'd have
> to spend a week reloading 10 years worth of data.  For now I don't
> want to go any farther than I have to, to make this work.)

Nope, my memory was faulty:  those changes happened in 5.0.  (The
schemas I've been given, used since time immemorial, are chock full of
IntField and DateField.)  I'm stuck with reloading.  Might as well go
to 8.x.  Or give up on asking Solr for the schema's uniqueKey,
configure the client with the field name and cross fingers.





Re: 404 response from Schema API

2020-05-14 Thread Mark H. Wood
On Fri, Apr 17, 2020 at 10:11:40AM -0600, Shawn Heisey wrote:
> On 4/16/2020 10:07 AM, Mark H. Wood wrote:
> > I need to ask Solr 4.10 for the name of the unique key field of a
> > schema.  So far, no matter what I've done, Solr is returning a 404.
> > 
> > This works:
> > 
> >curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/select'
> > 
> > This gets a 404:
> > 
> >curl 
> > 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema/uniquekey'
> > 
> > So does this:
> > 
> >curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema'
> > 
> > We normally use the ClassicIndexSchemaFactory.  I tried switching to
> > ManagedIndexSchemaFactory but it made no difference.  Nothing is
> > logged for the failed requests.
> 
>  From what I can see, the schema API handler was introduced in version 
> 5.0.  The SchemaHandler class exists in the released javadoc for the 5.0 
> version, but not the 4.10 version.  You'll need a newer version of Solr.

*sigh*  That's what I see too, when I dig through the JARs.  For some
reason, many folks believe that the Schema API existed at least as
far back as 4.2:

  
https://stackoverflow.com/questions/7247221/does-solr-has-api-to-read-solr-schema-xml

Perhaps because the _Apache Solr Reference Guide 4.10_ says so, on
page 53.

This writer thinks it worked, read-only, on 4.10.3:

  
https://stackoverflow.com/questions/33784998/solr-rest-api-for-schema-updates-returns-method-not-allowed-405

But it doesn't work here, on 4.10.4:

  curl 'https://toolshed.wood.net:8443/isw6/solr/statistics/schema?wt=json'
  14-May-2020 15:07:03.805 INFO 
[https-jsse-nio-fec0:0:0:1:0:0:0:7-8443-exec-60] 
org.restlet.engine.log.LogFilter.afterHandle 2020-05-14  15:07:03
fec0:0:0:1:0:0:0:7  -   fec0:0:0:1:0:0:0:7  8443GET 
/isw6/solr/schema   wt=json 404 0   0   0   
https://toolshed.wood.net:8443  curl/7.69.1 -

Strangely, Solr dropped the core-name element of the path!

Any idea what happened?

Anyway, I'll be reading up on how to upgrade to 5.  (Hopefully not
farther, just yet -- changes between, I think, 5 and 6 mean I'd have
to spend a week reloading 10 years worth of data.  For now I don't
want to go any farther than I have to, to make this work.)





Re: Dynamic schema failure for child docs not using "_childDocuments_" key

2020-05-13 Thread ART GALLERY

On Tue, May 5, 2020 at 8:32 PM mmb1234  wrote:
>
> I am running into an exception where creating child docs fails unless the
> field already exists in the schema (stacktrace is at the bottom of this
> post). My solr is v8.5.1 running in standard/non-cloud mode.
>
> $> curl -X POST -H 'Content-Type: application/json'
> 'http://localhost:8983/solr/mycore/update' --data-binary '[{
>   "id": "3dae27db6ee43e878b9d0e8e",
>   "phone": "+1 (123) 456-7890",
>   "myChildDocuments": [{
> "id": "3baf27db6ee43387849d0e8e",
>  "enabled": false
>}]
> }]'
>
> {
>   "responseHeader":{
> "status":400,
> "QTime":285},
>   "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","org.apache.solr.common.SolrException"],
> "msg":"ERROR: [doc=3baf27db6ee43387849d0e8e] unknown field 'enabled'",
> "code":400}}
>
>
> However using "_childDocuments_" key, it succeeds and child doc fields get
> created in the managed-schema
>
> $> curl -X POST -H 'Content-Type: application/json'
> 'http://localhost:8983/solr/mycore/update' --data-binary '[{
>   "id": "6dae27db6ee43e878b9d0e8e",
>   "phone": "+1 (123) 456-7890",
>   "_childDocuments_": [{
> "id": "6baf27db6ee43387849d0e8e",
>  "enabled": false
>}]
> }]'
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":285}}
>
>
> == stacktrace ==
> 2020-05-06 01:01:26.762 ERROR (qtp1569435561-19) [   x:standalone]
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: ERROR:
> [doc=3baf27db6ee43387849d0e8e] unknown field 'enabled'
> at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:226)
> at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:100)
> at
> org.apache.solr.update.AddUpdateCommand.lambda$null$0(AddUpdateCommand.java:224)
> at
> java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
> at
> java.base/java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1631)
> at
> java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
> at
> java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
> at
> java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
> at
> java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
> at 
> java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
> at
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:282)
> at
> org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:451)
> at
> org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1284)
> at
> org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1277)
> at
> org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:975)
> at
> org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:345)
> at
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:292)
> at
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:239)
> at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76)
> at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> at
> org.apache.solr.update.processor.NestedUpdateProcessorFactory$NestedUpdateProcessor.processAdd(NestedUpdateProcessorFactory.java:79)
> at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> at
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedU

Re: Which solrconfig.xml and schema file should I start with?

2020-05-10 Thread Jan Høydahl
Choose whichever example is closest to what you want to do. Then strip it down, 
removing everything you don’t use. Note that the _default configset has schema 
guessing enabled, which you don’t want in production.
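In the _default configset, schema guessing is guarded by the `update.autoCreateFields` user property (per the Ref Guide's schemaless-mode page), so one way to switch it off is a `set-user-property` call to the Config API (a sketch; the collection name and port are placeholders):

```python
import json

CONFIG_URL = "http://localhost:8983/solr/mycollection/config"  # placeholder

# Disabling the guard property used by the
# add-unknown-fields-to-the-schema update processor chain.
payload = {"set-user-property": {"update.autoCreateFields": "false"}}
body = json.dumps(payload)

# body would be POSTed to CONFIG_URL with Content-Type: application/json.
print(body)
```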

Jan Høydahl

> 9. mai 2020 kl. 22:34 skrev Steven White :
> 
> Hi everyone,
> 
> There are multiple, slightly different copies of the
> files solrconfig.xml and the various schema files.  Should I be using
> what's under \solr-8.5.1\server\solr\configsets\_default\conf as my
> foundation to build on?
> 
> Thanks
> 
> Steve


Which solrconfig.xml and schema file should I start with?

2020-05-09 Thread Steven White
Hi everyone,

There are multiple, slightly different copies of the
files solrconfig.xml and the various schema files.  Should I be using
what's under \solr-8.5.1\server\solr\configsets\_default\conf as my
foundation to build on?

Thanks

Steve


Dynamic schema failure for child docs not using "_childDocuments_" key

2020-05-05 Thread mmb1234
I am running into an exception where creating child docs fails unless the
field already exists in the schema (stacktrace is at the bottom of this
post). My solr is v8.5.1 running in standard/non-cloud mode.

$> curl -X POST -H 'Content-Type: application/json'
'http://localhost:8983/solr/mycore/update' --data-binary '[{
  "id": "3dae27db6ee43e878b9d0e8e",
  "phone": "+1 (123) 456-7890",
  "myChildDocuments": [{
"id": "3baf27db6ee43387849d0e8e",
 "enabled": false
   }]
}]'

{
  "responseHeader":{
"status":400,
"QTime":285},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","org.apache.solr.common.SolrException"],
"msg":"ERROR: [doc=3baf27db6ee43387849d0e8e] unknown field 'enabled'",
"code":400}}


However using "_childDocuments_" key, it succeeds and child doc fields get
created in the managed-schema

$> curl -X POST -H 'Content-Type: application/json'
'http://localhost:8983/solr/mycore/update' --data-binary '[{
  "id": "6dae27db6ee43e878b9d0e8e",
  "phone": "+1 (123) 456-7890",
  "_childDocuments_": [{
"id": "6baf27db6ee43387849d0e8e",
 "enabled": false
   }]
}]'

{
  "responseHeader":{
"status":0,
"QTime":285}}


== stacktrace ==
2020-05-06 01:01:26.762 ERROR (qtp1569435561-19) [   x:standalone]
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: ERROR:
[doc=3baf27db6ee43387849d0e8e] unknown field 'enabled'
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:226)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:100)
at org.apache.solr.update.AddUpdateCommand.lambda$null$0(AddUpdateCommand.java:224)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1631)
at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
at java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:282)
at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:451)
at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1284)
at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1277)
at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:975)
at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:345)
at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:292)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:239)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.NestedUpdateProcessorFactory$NestedUpdateProcessor.processAdd(NestedUpdateProcessorFactory.java:79)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:259)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:489)
at org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0(DistributedUpdateProcessor.java:339)
at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:339)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:225)
at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.p

Adding several new fields to managed-schema by solrj

2020-04-29 Thread Szűcs Roland
Hi folks,

I am using Solr 8.5.0 in standalone mode and use the CoreAdmin API and
Schema API of solrj to create a new core and its fields in the managed-schema.
Is there any way to add several fields to the managed-schema by solrj without
processing them one by one?

The following two lines get the job done, but at ~4 seconds per field, which is
extremely slow:

SchemaRequest.AddField schemaRequest = new SchemaRequest.AddField(fieldAttributes);
SchemaResponse.UpdateResponse response = schemaRequest.process(solrC);

The core is empty as the field creation is the part of the core creation
process. The schema API docs says:
It is possible to perform one or more add requests in a single command. The
API is transactional and all commands in a single call either succeed or
fail together.
I am looking for the equivalent of this approach in solrj.
Is there any?

Cheers,
Roland
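[Editor's note] The transactional batching Roland quotes can be exercised at the HTTP level by putting all field definitions into one request body. The sketch below (Python, with made-up field names) just builds such a payload. On the SolrJ side, `SchemaRequest.MultiUpdate`, which wraps a list of `SchemaRequest.Update` objects into a single request, looks like the programmatic equivalent — worth checking against your SolrJ version.

```python
import json

# Sketch only: build a single Schema API request body that adds several
# fields in one POST, instead of one HTTP round trip per field.
# Field names and types below are illustrative, not from the original post.
fields = [
    {"name": "title", "type": "text_general", "stored": True},
    {"name": "price", "type": "pfloat", "stored": True},
    {"name": "in_stock", "type": "boolean", "stored": True},
]

# "add-field" accepts a list of field definitions, so all three fields
# travel in one transactional command.
payload = json.dumps({"add-field": fields})
print(payload)

# POST it once with something like:
#   curl -X POST -H 'Content-type:application/json' \
#        --data-binary @payload.json http://localhost:8983/solr/CORENAME/schema
```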


Re: 404 response from Schema API

2020-04-17 Thread Shawn Heisey

On 4/16/2020 10:07 AM, Mark H. Wood wrote:

I need to ask Solr 4.10 for the name of the unique key field of a
schema.  So far, no matter what I've done, Solr is returning a 404.

This works:

   curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/select'

This gets a 404:

   curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema/uniquekey'

So does this:

   curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema'

We normally use the ClassicIndexSchemaFactory.  I tried switching to
ManagedIndexSchemaFactory but it made no difference.  Nothing is
logged for the failed requests.


From what I can see, the schema API handler was introduced in version 
5.0.  The SchemaHandler class exists in the released javadoc for the 5.0 
version, but not the 4.10 version.  You'll need a newer version of Solr.


Thanks,
Shawn
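[Editor's note] For Solr versions without the Schema API, the Luke request handler (`/admin/luke?show=schema&wt=json`) can usually report the unique key. The sketch below parses a hypothetical, abridged Luke response — the exact key names (notably `uniqueKeyField`) are an assumption, so verify them against a real response from your instance.

```python
import json

# Hypothetical, abridged response from /admin/luke?show=schema&wt=json on an
# older Solr. The "uniqueKeyField" key is an assumption -- check it against
# a real response before relying on it.
luke_response = json.loads("""
{
  "responseHeader": {"status": 0},
  "schema": {
    "uniqueKeyField": "id",
    "fields": {
      "id": {"type": "string"},
      "title": {"type": "text_general"}
    }
  }
}
""")

# Pull the unique key field name out of the schema section.
unique_key = luke_response["schema"]["uniqueKeyField"]
print(unique_key)
```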


Re: 404 response from Schema API

2020-04-16 Thread Mark H. Wood
On Thu, Apr 16, 2020 at 02:00:06PM -0400, Erick Erickson wrote:
> Assuming isw6_3 is your collection name, you have
> “solr” and “isw6_3” reversed in the URL.

No.  Solr's context is '/isw6_3/solr' and the core is 'statistics'.

> Should be something like:
> https://toolshed.wood.net:8443/solr/isw6_3/schema/uniquekey
> 
> If that’s not the case you need to mention your collection. But in
> either case your collection name comes after /solr/.

Thank you.  I think that's what I have now.

> > On Apr 16, 2020, at 12:07 PM, Mark H. Wood  wrote:
> > 
> > I need to ask Solr 4.10 for the name of the unique key field of a
> > schema.  So far, no matter what I've done, Solr is returning a 404.
> > 
> > This works:
> > 
> >  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/select'
> > 
> > This gets a 404:
> > 
> >  curl 
> > 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema/uniquekey'
> > 
> > So does this:
> > 
> >  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema'
> > 
> > We normally use the ClassicIndexSchemaFactory.  I tried switching to
> > ManagedIndexSchemaFactory but it made no difference.  Nothing is
> > logged for the failed requests.
> > 
> > Ideas?
> > 
> > -- 
> > Mark H. Wood
> > Lead Technology Analyst
> > 
> > University Library
> > Indiana University - Purdue University Indianapolis
> > 755 W. Michigan Street
> > Indianapolis, IN 46202
> > 317-274-0749
> > www.ulib.iupui.edu
> 
> 

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Re: 404 response from Schema API

2020-04-16 Thread Erick Erickson
Assuming isw6_3 is your collection name, you have
“solr” and “isw6_3” reversed in the URL.

Should be something like:
https://toolshed.wood.net:8443/solr/isw6_3/schema/uniquekey

If that’s not the case you need to mention your collection. But in
either case your collection name comes after /solr/.

Best,
Erick

> On Apr 16, 2020, at 12:07 PM, Mark H. Wood  wrote:
> 
> I need to ask Solr 4.10 for the name of the unique key field of a
> schema.  So far, no matter what I've done, Solr is returning a 404.
> 
> This works:
> 
>  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/select'
> 
> This gets a 404:
> 
>  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema/uniquekey'
> 
> So does this:
> 
>  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema'
> 
> We normally use the ClassicIndexSchemaFactory.  I tried switching to
> ManagedIndexSchemaFactory but it made no difference.  Nothing is
> logged for the failed requests.
> 
> Ideas?
> 
> -- 
> Mark H. Wood
> Lead Technology Analyst
> 
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu



404 response from Schema API

2020-04-16 Thread Mark H. Wood
I need to ask Solr 4.10 for the name of the unique key field of a
schema.  So far, no matter what I've done, Solr is returning a 404.

This works:

  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/select'

This gets a 404:

  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema/uniquekey'

So does this:

  curl 'https://toolshed.wood.net:8443/isw6_3/solr/statistics/schema'

We normally use the ClassicIndexSchemaFactory.  I tried switching to
ManagedIndexSchemaFactory but it made no difference.  Nothing is
logged for the failed requests.

Ideas?

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Re: Proper way to manage managed-schema file

2020-04-13 Thread Alexandre Rafalovitch
If you are using the API (which the Admin UI does), the regenerated file will
lose comments and sort everything in a particular order. That's just
the implementation at the moment.

If you don't like that, you can always modify the schema file by hand
and reload the core to notice the changes. You can even set the schema
to be immutable to avoid accidentally doing it.

The other option is not to have the comments in that file; then,
after the first rewrite, subsequent rewrites are quite incremental and make it
easy to track the changes.

Regards,
   Alex.

On Mon, 6 Apr 2020 at 14:11, TK Solr  wrote:
>
> I am using Solr 8.3.1 in non-SolrCloud mode (what should I call this mode?) 
> and
> modifying managed-schema.
>
> I noticed that Solr does override this file wiping out all my comments and
> rearranging the order. I noticed there is a "DO NOT EDIT" comment. Then, what 
> is
> the proper/expected way to manage this file? Admin UI can add fields but 
> cannot
> edit existing one or add new field types. Do I keep a script of many schema
> calls? (Then how do I reset the default to the initial one, which would be
> needed before re-re-playing the schema calls.)
>
> TK
>
>


Re: Schema Browser API

2020-04-09 Thread Kevin Risden
The Luke request handler may do what you are asking for already? This
is coming directly from Lucene and doesn't rely on what Solr has in
the schema information.

/admin/luke

https://lucene.apache.org/solr/guide/7_7/implicit-requesthandlers.html
https://cwiki.apache.org/confluence/display/SOLR/LukeRequestHandler

PS - There is also the ability to run Luke standalone over Lucene indices.

Kevin Risden
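[Editor's note] A first cut of the report tool Webster describes could also parse the Schema API's `/schema/fields` listing (available on 7.7.2) rather than Luke. A minimal sketch, using an invented sample response of that shape:

```python
import json

# Summarize which fields are indexed / stored / docValues / multiValued.
# The sample data imitates the shape of a /solr/<collection>/schema/fields
# response (abridged, invented values).
fields_response = json.loads("""
{
  "fields": [
    {"name": "id",    "type": "string", "indexed": true, "stored": true},
    {"name": "price", "type": "pfloat", "docValues": true, "stored": false},
    {"name": "tags",  "type": "string", "indexed": true, "multiValued": true}
  ]
}
""")

# Properties absent from the listing are reported as False here; see note below.
report = [
    {
        "name": f["name"],
        "indexed": f.get("indexed", False),
        "stored": f.get("stored", False),
        "docValues": f.get("docValues", False),
        "multiValued": f.get("multiValued", False),
    }
    for f in fields_response["fields"]
]

for row in report:
    print(row)
```

Note that properties left unset on a field fall back to the field type's defaults, so a production version would also need to consult `/schema/fieldtypes` (and dynamic field rules) before estimating memory.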

On Thu, Apr 9, 2020 at 3:34 PM Webster Homer
 wrote:
>
> I was just looking at the Schema Browser for one of our collections. It's 
> pretty handy. I was thinking that it would be useful to create a tool that 
> would create a report about what fields were indexed had docValues, were 
> multivalued etc...
>
> Has someone built such a tool? I want it to aid in estimating memory 
> requirements for our collections.
>
> I'm currently running solr 7.7.2
>
>
>
> This message and any attachment are confidential and may be privileged or 
> otherwise protected from disclosure. If you are not the intended recipient, 
> you must not copy this message or attachment or disclose the contents to any 
> other person. If you have received this transmission in error, please notify 
> the sender immediately and delete the message and any attachment from your 
> system. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not 
> accept liability for any omissions or errors in this message which may arise 
> as a result of E-Mail-transmission or for damages resulting from any 
> unauthorized changes of the content of this message and any attachment 
> thereto. Merck KGaA, Darmstadt, Germany and any of its subsidiaries do not 
> guarantee that this message is free of viruses and does not accept liability 
> for any damages caused by any virus transmitted therewith.
>
>
>
> Click http://www.merckgroup.com/disclaimer to access the German, French, 
> Spanish and Portuguese versions of this disclaimer.


Schema Browser API

2020-04-09 Thread Webster Homer
I was just looking at the Schema Browser for one of our collections. It's 
pretty handy. I was thinking that it would be useful to create a tool that 
would create a report about what fields were indexed, had docValues, were 
multiValued, etc...

Has someone built such a tool? I want it to aid in estimating memory 
requirements for our collections.

I'm currently running solr 7.7.2





Re: Proper way to manage managed-schema file

2020-04-06 Thread Jörn Franke
You can use the Solr REST services to do all of those operations.

https://lucene.apache.org/solr/guide/8_3/schema-api.html

Normally, in a production environment you don't use the UI but do all changes in
a controlled, automated fashion using the REST APIs.



> Am 06.04.2020 um 20:11 schrieb TK Solr :
> 
> I am using Solr 8.3.1 in non-SolrCloud mode (what should I call this mode?) 
> and modifying managed-schema.
> 
> I noticed that Solr does override this file wiping out all my comments and 
> rearranging the order. I noticed there is a "DO NOT EDIT" comment. Then, what 
> is the proper/expected way to manage this file? Admin UI can add fields but 
> cannot edit existing one or add new field types. Do I keep a script of many 
> schema calls? (Then how do I reset the default to the initial one, which 
> would be needed before re-re-playing the schema calls.)
> 
> TK
> 
> 


Proper way to manage managed-schema file

2020-04-06 Thread TK Solr
I am using Solr 8.3.1 in non-SolrCloud mode (what should I call this mode?) and 
modifying managed-schema.


I noticed that Solr does overwrite this file, wiping out all my comments and 
rearranging the order. I noticed there is a "DO NOT EDIT" comment. Then, what is 
the proper/expected way to manage this file? Admin UI can add fields but cannot 
edit existing one or add new field types. Do I keep a script of many schema 
calls? (Then how do I reset the default to the initial one, which would be 
needed before re-re-playing the schema calls.)


TK




Re: Solr 8.2.0 - Schema issue

2020-03-10 Thread Joe Obernberger
Erick - tried this, had to run it async, but it's been running for over 
24 hours on one collection with:


{
  "responseHeader":{
"status":0,
"QTime":18326},
  "status":{
"state":"submitted",
"msg":"found [5] in submitted tasks"}}

I don't see anything in the logs.

-Joe

On 3/6/2020 1:43 PM, Joe Obernberger wrote:
Thank you Erick - I have no record of that, but will absolutely give 
the API RELOAD a shot!  Thank you!


-Joe

On 3/6/2020 10:26 AM, Erick Erickson wrote:
Didn’t we talk about reloading the collections that share the schema 
after the schema change via the collections API RELOAD command?


Best,
Erick

On Mar 6, 2020, at 05:34, Joe Obernberger 
 wrote:


Hi All - any ideas on this?  Anything I can try?

Thank you!

-Joe


On 2/26/2020 9:01 AM, Joe Obernberger wrote:
Hi All - I have several solr collections all with the same schema.  
If I add a field to the schema and index it into the collection on 
which I added the field, it works fine. However, if I try to add a 
document to a different solr collection that contains the new field 
(and is using the same schema), I get an error that the field 
doesn't exist.


If I restart the cluster, this problem goes away and I can add a 
document with the new field to any solr collection that has the 
schema.  Any work-arounds that don't involve a restart?


Thank you!

-Joe Obernberger



Re: Solr 8.2.0 - Schema issue

2020-03-06 Thread Joe Obernberger
Thank you Erick - I have no record of that, but will absolutely give the 
API RELOAD a shot!  Thank you!


-Joe

On 3/6/2020 10:26 AM, Erick Erickson wrote:

Didn’t we talk about reloading the collections that share the schema after the 
schema change via the collections API RELOAD command?

Best,
Erick


On Mar 6, 2020, at 05:34, Joe Obernberger  wrote:

Hi All - any ideas on this?  Anything I can try?

Thank you!

-Joe


On 2/26/2020 9:01 AM, Joe Obernberger wrote:
Hi All - I have several solr collections all with the same schema.  If I add a 
field to the schema and index it into the collection on which I added the 
field, it works fine.  However, if I try to add a document to a different solr 
collection that contains the new field (and is using the same schema), I get an 
error that the field doesn't exist.

If I restart the cluster, this problem goes away and I can add a document with 
the new field to any solr collection that has the schema.  Any work-arounds 
that don't involve a restart?

Thank you!

-Joe Obernberger



Re: Solr 8.2.0 - Schema issue

2020-03-06 Thread Erick Erickson
Didn’t we talk about reloading the collections that share the schema after the 
schema change via the collections API RELOAD command?

Best,
Erick

> On Mar 6, 2020, at 05:34, Joe Obernberger  
> wrote:
> 
> Hi All - any ideas on this?  Anything I can try?
> 
> Thank you!
> 
> -Joe
> 
>> On 2/26/2020 9:01 AM, Joe Obernberger wrote:
>> Hi All - I have several solr collections all with the same schema.  If I add 
>> a field to the schema and index it into the collection on which I added the 
>> field, it works fine.  However, if I try to add a document to a different 
>> solr collection that contains the new field (and is using the same schema), 
>> I get an error that the field doesn't exist.
>> 
>> If I restart the cluster, this problem goes away and I can add a document 
>> with the new field to any solr collection that has the schema.  Any 
>> work-arounds that don't involve a restart?
>> 
>> Thank you!
>> 
>> -Joe Obernberger
>> 


Re: Solr 8.2.0 - Schema issue

2020-03-06 Thread Joe Obernberger

Hi All - any ideas on this?  Anything I can try?

Thank you!

-Joe

On 2/26/2020 9:01 AM, Joe Obernberger wrote:
Hi All - I have several solr collections all with the same schema.  If 
I add a field to the schema and index it into the collection on which 
I added the field, it works fine.  However, if I try to add a document 
to a different solr collection that contains the new field (and is 
using the same schema), I get an error that the field doesn't exist.


If I restart the cluster, this problem goes away and I can add a 
document with the new field to any solr collection that has the 
schema.  Any work-arounds that don't involve a restart?


Thank you!

-Joe Obernberger



Re: Solr 8.2.0 - Schema issue

2020-02-26 Thread Jörn Franke
Not sure I understood the whole scenario. However, did you try to reload (not 
reindex) the collection?

> Am 26.02.2020 um 15:02 schrieb Joe Obernberger :
> 
> Hi All - I have several solr collections all with the same schema.  If I add 
> a field to the schema and index it into the collection on which I added the 
> field, it works fine.  However, if I try to add a document to a different 
> solr collection that contains the new field (and is using the same schema), I 
> get an error that the field doesn't exist.
> 
> If I restart the cluster, this problem goes away and I can add a document 
> with the new field to any solr collection that has the schema.  Any 
> work-arounds that don't involve a restart?
> 
> Thank you!
> 
> -Joe Obernberger
> 


Solr 8.2.0 - Schema issue

2020-02-26 Thread Joe Obernberger
Hi All - I have several solr collections all with the same schema.  If I 
add a field to the schema and index it into the collection on which I 
added the field, it works fine.  However, if I try to add a document to 
a different solr collection that contains the new field (and is using 
the same schema), I get an error that the field doesn't exist.


If I restart the cluster, this problem goes away and I can add a 
document with the new field to any solr collection that has the schema.  
Any work-arounds that don't involve a restart?


Thank you!

-Joe Obernberger



Re: Would changing the schema version from 1.5 to 1.6 require a reindex

2020-02-23 Thread Paras Lehana
Hi Karl,

Maybe someone else can confirm whether reindexing is needed when upgrading the
schema version. However, I guess useDocValuesAsStored only impacts the query side,
assuming docValues had already been written during indexing. It's actually
easier to try querying the fields after enabling this parameter and see if
that works without reindexing!


On Thu, 13 Feb 2020 at 14:21, Karl Stoney
 wrote:

> Hey,
> I’m going to bump our schema version from 1.5 to 1.6 to get the implicit
> useDocValuesAsStored=true, would this require a reindex?
>
> Thanks
> Karl
> This e-mail is sent on behalf of Auto Trader Group Plc, Registered Office:
> 1 Tony Wilson Place, Manchester, Lancashire, M15 4FN (Registered in England
> No. 9439967). This email and any files transmitted with it are confidential
> and may be legally privileged, and intended solely for the use of the
> individual or entity to whom they are addressed. If you have received this
> email in error please notify the sender. This email message has been swept
> for the presence of computer viruses.
>


-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, *Auto-Suggest*,
IndiaMART InterMESH Ltd,

11th Floor, Tower 2, Assotech Business Cresterra,
Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305

Mob.: +91-9560911996
Work: 0120-4056700 | Extn:
*11096*



Would changing the schema version from 1.5 to 1.6 require a reindex

2020-02-13 Thread Karl Stoney
Hey,
I’m going to bump our schema version from 1.5 to 1.6 to get the implicit 
useDocValuesAsStored=true, would this require a reindex?

Thanks
Karl


Re: In-place re-indexing after DocValue schema change

2020-01-29 Thread moscovig
Thank you, Emir.

I tried this locally (changing the schema, re-indexing everything in place)
and I wasn't able to sort on the docValues fields anymore (someone actually
mentioned this before on that forum -
https://lucene.472066.n3.nabble.com/DocValues-error-td4240116.html)
with the next error
"Error from server at
http://10.150.197.29:8961/solr/accountmaster_shard1_replica1: unexpected
docvalues type NONE for field 'key' (expected=SORTED). Re-index with correct
docvalues type."

Also, the large overhead you mention is another reason not to
reindex in place.

Thanks!








--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: In-place re-indexing after DocValue schema change

2020-01-29 Thread Emir Arnautović
Hi,
1. No, it's not valid. Solr will look at the schema to see if it can use docValues
or if it has to uninvert the field, and it assumes that all documents will have doc
values. You can expect anything from wrong results to errors if you do something
like that.
2. Not sure if it would work, but it is no better than reindexing everything.
Lucene segments are immutable, so Solr needs to create a new document, flag the
existing one as deleted, and purge it at segment merge time. If you are trying to
avoid changing the collection name, maybe you could do something like this using
aliases: index into a new collection, delete the existing collection, and create an
alias with the old collection name pointing to the new collection.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 29 Jan 2020, at 09:37, moscovig  wrote:
> 
> Hi all
> 
> We are about to alter our schema with some DocValue annotations. 
> According to docs, we should whether delete all docs and re-insert, or
> create a new collection with the new schema.
> 
> 1. Is it valid to modify the schema in the current collection, where all
> documents were created without docValue, and having docValue for new docs?
> 
> 2. Is it valid to upsert all documents onto the same collection, having all
> docs re-indexed in-place? It does sound risky, but would it work if we will
> take care of *all* documents?
> 
> Thanks!
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
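[Editor's note] The alias approach Emir describes might look roughly like the following Collections API calls (collection and alias names are placeholders, and this assumes a SolrCloud setup):

```shell
# 1. Create a new collection bound to the updated (docValues) configset
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=accounts_v2&numShards=1&collection.configName=accounts_v2_conf'

# 2. Reindex everything into accounts_v2 (indexing job not shown)

# 3. Delete the old collection, then point an alias with the old name at the new one
curl 'http://localhost:8983/solr/admin/collections?action=DELETE&name=accounts'
curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=accounts&collections=accounts_v2'
```

Clients that query the old name keep working, since the alias resolves to the new collection.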



In-place re-indexing after DocValue schema change

2020-01-29 Thread moscovig
Hi all

We are about to alter our schema with some DocValue annotations. 
According to the docs, we should either delete all docs and re-insert, or
create a new collection with the new schema.

1. Is it valid to modify the schema in the current collection, where all
documents were created without docValue, and having docValue for new docs?

2. Is it valid to upsert all documents onto the same collection, having all
docs re-indexed in-place? It does sound risky, but would it work if we will
take care of *all* documents?

Thanks!



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: need for re-indexing when using managed schema

2019-12-16 Thread Erick Erickson
That's a little overstated; a full explanation of what's safe and what's not would 
run several pages and depends on what you mean by "safe".

Any modification to a schema, even if they don’t cause something to outright 
break, may leave the index in an inconsistent state. For instance, remember 
that Lucene and Solr really don’t care if doc1 doesn’t have a particular field 
X and doc2 does. If you do something as “safe” as add a new field, only 
documents indexed after that change will have the field. Your index will 
continue to function with no errors in that case, but any searches on the new 
field won’t return any docs indexed before the change until the older docs are 
re-indexed.

So you can see where this is going. If you add a field _and then reindex all
your documents_, it's perfectly safe. However, between the time you add the
field and the re-indexing is complete, your results may be inconsistent.

On the other hand, if you change, say, a DocValues field from
multiValued="true" to multiValued="false", the results are undefined _even if you
reindex all your docs_.

On the other, other hand, if you delete a field, the meta-data is still in your
index; the only way to get rid of it is to delete your index and re-index, or
index to a new collection. Searches may still return docs on the deleted field if
it was created with a dynamic field definition that's still in the schema.

On the other, other, other hand… the list goes on and on.

So since even something as non-breaking as adding a new field requires you to
re-index all your older docs anyway to get back to a consistent state, it's just
easiest to plan on re-indexing all your docs whenever you change the
schema. And, I'd also advise, index to a new collection…

Best,
Erick

> On Dec 16, 2019, at 12:57 PM, Joseph Lorenzini  wrote:
> 
> Hi all,
> 
> I have question about the managed schema functionality.  According to the
> docs, "All changes to a collection’s schema require reindexing". This would
> imply that if you use a managed schema and you use the schema API to update
> the schema, then doing a full re-index is necessary each time.
> 
> Is this accurate or can a full re-index be avoided?
> 
> Thanks,
> Joe



need for re-indexing when using managed schema

2019-12-16 Thread Joseph Lorenzini
Hi all,

I have question about the managed schema functionality.  According to the
docs, "All changes to a collection’s schema require reindexing". This would
imply that if you use a managed schema and you use the schema API to update
the schema, then doing a full re-index is necessary each time.

Is this accurate or can a full re-index be avoided?

Thanks,
Joe


Re: creating a core with a custom managed-schema

2019-11-04 Thread Erick Erickson
You’re confusing the admin API with the bin/solr script. If you look at
the examples on that page, the form is:

admin/cores?action=CREATE&name=core-name&instanceDir=path/to/dir&config=solrconfig.xml&dataDir=data

what you used was:
/opt/solr/bin/solr create….

Totally different beasts.


> On Nov 4, 2019, at 1:43 PM, rhys J  wrote:
> 
> On Mon, Nov 4, 2019 at 1:36 PM Erick Erickson 
> wrote:
> 
>> Well, just what it says. -schema isn’t a recognized parameter, where did
>> you get it? Did you try bin/solr create -help and follow the instructions
>> there?
>> 
>> I am confused.
> 
> This page:
> https://lucene.apache.org/solr/guide/7_0/coreadmin-api.html#coreadmin-create
> 
> says that schema is a valid parameter, and it explains how to use it.
> 
> But when I use the command create, I get an error.
> 
> Is there no way to use a custom schema to create a core from the command
> line? Will I always have to either hand edit the managed-schema, or use the
> API?
> 
> Thanks,
> 
> Rhys



Re: creating a core with a custom managed-schema

2019-11-04 Thread rhys J
On Mon, Nov 4, 2019 at 1:36 PM Erick Erickson 
wrote:

> Well, just what it says. -schema isn’t a recognized parameter, where did
> you get it? Did you try bin/solr create -help and follow the instructions
> there?
>
> I am confused.

This page:
https://lucene.apache.org/solr/guide/7_0/coreadmin-api.html#coreadmin-create

says that schema is a valid parameter, and it explains how to use it.

But when I use the command create, I get an error.

Is there no way to use a custom schema to create a core from the command
line? Will I always have to either hand edit the managed-schema, or use the
API?

Thanks,

Rhys


Re: creating a core with a custom managed-schema

2019-11-04 Thread Erick Erickson
Well, just what it says. -schema isn’t a recognized parameter, where did you 
get it? Did you try bin/solr create -help and follow the instructions there?

Best,
Erick

> On Nov 4, 2019, at 12:34 PM, rhys J  wrote:
> 
> I have created a tmp directory where I want to have reside custom
> managed-schemas to use when creating cores.
> 
> /tmp/solr_schema/CORENAME/managed-schema
> 
> Based on this page:
> https://lucene.apache.org/solr/guide/7_0/coreadmin-api.html#coreadmin-create
> , I am running the following command:
> 
> sudo -u solr /opt/solr/bin/solr create -c dbtrphon -schema
> /tmp/solr_schemas/dbtrphon/managed-schema
> 
> I get this error:
> 
> ERROR: Unrecognized or misplaced argument: -schema!
> 
> How can I create a core with a custom managed-schema?
> 
> I'm trying to implement solr in a development environment, but I would like
> to have custom schemas, so that when we move it to live, we don't have to
> recreate the schemas by hand again.
> 
> Thanks,
> 
> Rhys
> 
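[Editor's note] A possible alternative for the original goal: if the custom config (solrconfig.xml plus managed-schema) is kept together in one directory, the `-d` option of `bin/solr create` can point at it — verify the exact flags with `bin/solr create -help` on your version:

```shell
# Put solrconfig.xml and your managed-schema together in one conf directory,
# then point create at it; the whole config (including the schema) is used
# for the new core. Paths follow the original post.
sudo -u solr /opt/solr/bin/solr create -c dbtrphon -d /tmp/solr_schemas/dbtrphon
```

Note that `-d` expects a configset-style directory, not a single schema file.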



creating a core with a custom managed-schema

2019-11-04 Thread rhys J
I have created a tmp directory where I want to keep custom
managed-schemas to use when creating cores.

/tmp/solr_schema/CORENAME/managed-schema

Based on this page:
https://lucene.apache.org/solr/guide/7_0/coreadmin-api.html#coreadmin-create
, I am running the following command:

sudo -u solr /opt/solr/bin/solr create -c dbtrphon -schema
/tmp/solr_schemas/dbtrphon/managed-schema

I get this error:

ERROR: Unrecognized or misplaced argument: -schema!

How can I create a core with a custom managed-schema?

I'm trying to implement solr in a development environment, but I would like
to have custom schemas, so that when we move it to live, we don't have to
recreate the schemas by hand again.

Thanks,

Rhys



Re: Updating Solr schema doesn't work

2019-10-04 Thread Shawn Heisey

On 10/4/2019 2:45 PM, Shawn Heisey wrote:
It's probably not the way I would do it.  I would update a local copy of 
the config and then re-upload the entire config rather than dealing with 
a single file.


You will also need to reload the collection or restart Solr, and then as 
already mentioned, reindex your documents with the new fields or index 
some new ones with the new fields.


Thanks,
Shawn


Re: Updating Solr schema doesn't work

2019-10-04 Thread Shawn Heisey

On 10/4/2019 10:22 AM, amruth wrote:

*- /opt/zookeeper/bin/zkCli.sh delete
/configs/collection1/managed-schema - /opt/zookeeper/bin/zkCli.sh
create /configs/collection1/managed-schema "`cat
/var/solr/data/collection1/conf/managed-schema`"*

I could see fields on managed-schema on Solr UI and when I start
deltas and query, I am not able to see those fields in the result
set.


In order to see fields in the result set, the field must either be 
stored, or it must have docValues defined and be set to use docValues as 
stored.  And that field must be present when documents in the result set 
are indexed.


If you did not reindex your documents, with the new fields added, then 
you will not see those fields in results when those documents are in the 
result set.



Please let me know if this is the way to update schema in SolrCloud


It's probably not the way I would do it.  I would update a local copy of 
the config and then re-upload the entire config rather than dealing with 
a single file.


I do not know how the zkCli.sh included with zookeeper works, so I do 
not know if those are correct commands.  I know how the zkcli.sh script 
included with Solr works.
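A sketch of that whole-directory approach using the zkcli.sh shipped with Solr (the script path, zkhost, and collection name below are assumptions):

```shell
# Upload the entire conf directory as one configset, then reload the
# collection so the updated schema takes effect.
/opt/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 \
  -cmd upconfig -confname collection1 -confdir /var/solr/data/collection1/conf
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=collection1"
```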


Thanks,
Shawn


Updating Solr schema doesn't work

2019-10-04 Thread amruth
Hello,

I am running SolrCloud 6.6.0 and trying to add new fields to Solr Schema. I
have added fields to /var/solr/data/collection1/conf/managed-schema and
executed,

*- /opt/zookeeper/bin/zkCli.sh delete /configs/collection1/managed-schema
- /opt/zookeeper/bin/zkCli.sh create /configs/collection1/managed-schema
"`cat /var/solr/data/collection1/conf/managed-schema`"*

I could see fields on managed-schema on Solr UI and when I start deltas and
query, I am not able to see those fields in the result set. 

Please let me know if this is the way to update schema in SolrCloud



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Dynamic field schema

2019-07-10 Thread Shawn Heisey

On 7/10/2019 6:52 AM, ericstein wrote:

The documents are in both cores. The official title field data exists in
both. However, it only gives me the official_title_s field in the second
core when I query. When I look at the schema in the admin, it only shows
the official_title_s field.


The schema browser examines the Lucene index.  Lucene doesn't know about 
the schema, so only fields that have actually been used at the Lucene 
layer will be shown.


If the schema browser does not have a field mentioned, it means that 
when the indexed documents made it down to the Lucene layer, that field 
was not present in any of them.


Thanks,
Shawn


Re: Dynamic field schema

2019-07-10 Thread ericstein
Hi Shawn,

The documents are in both cores. The official title field data exists in
both. However, it only gives me the official_title_s field in the second
core when I query. When I look at the schema in the admin, it only shows
the official_title_s field.





Re: Dynamic field schema

2019-07-09 Thread Shawn Heisey

On 7/9/2019 5:42 PM, ericstein wrote:

I am expecting both cores to have the following fields:
official_title_s
official_title_t

However, the second core only recognizes:
official_title_s

It seems that the schema doesn't recognize my fields the same way in both cores.


What do you mean by "recognize"?  Be very specific about what you are 
examining that has led to this conclusion.


Search results will only contain fields that were actually included in 
the documents when they were indexed.  Maybe documents in the second 
core do not have the official_title_t field.


Thanks,
Shawn


Dynamic field schema

2019-07-09 Thread ericstein
Hi all, 
I am new to the SOLR world, so bear with me. I currently have 2 cores
that share the same schema, or so I think. I have noticed that certain
fields don't exist in both cores even though they are set to look at the
same config sets (S:\solr-6.1.0\server\solr\configsets\sitecore_configs).

I am expecting both cores to have the following fields: 
official_title_s
official_title_t

However, the second core only recognizes:
official_title_s

It seems that the schema doesn't recognize my fields the same way in both cores.

My dynamic fields are:







Re: Derived Field Solr Schema

2019-06-21 Thread Muaawia Bin Arshad
Thank you so much! This is very helpful

On 6/21/19, 12:35 PM, "Alexandre Rafalovitch"  wrote:

The easiest way is to do that with Update Request Processors:
https://lucene.apache.org/solr/guide/7_7/update-request-processors.html

Usually, you would clone a field and then do your transformations. For your
specific example, you could use:
*) FieldLengthUpdateProcessorFactory - int rather than boolean
*) StatelessScriptUpdateProcessorFactory - if you want to have full freedom
(but a bit slower indexing)
*) RegexReplaceProcessorFactory - maybe a bit trickier but also interesting.

You may also find this resource list of mine interesting:
http://www.solr-start.com/info/update-request-processors/ (it is a bit out
of date).

Regards,
  Alex.

On Fri, Jun 21, 2019, 2:19 PM Muaawia Bin Arshad, 
wrote:

> Hi Everyone,
>
> I am fairly new to solr and I was wondering if there is a way in solr 7.7
> to populate fields based on some pre-processing of another field. So let’s
> say I have a field called fieldX defined in the schema, I want to define
> another field called isFieldXgood which is just a Boolean field that shows
> whether the length of fieldX is larger than 10.
>
>
> Thanks,
> Muaawia Bin Arshad
>




Re: Derived Field Solr Schema

2019-06-21 Thread Alexandre Rafalovitch
The easiest way is to do that with Update Request Processors:
https://lucene.apache.org/solr/guide/7_7/update-request-processors.html

Usually, you would clone a field and then do your transformations. For your
specific example, you could use:
*) FieldLengthUpdateProcessorFactory - int rather than boolean
*) StatelessScriptUpdateProcessorFactory - if you want to have full freedom
(but a bit slower indexing)
*) RegexReplaceProcessorFactory - maybe a bit trickier but also interesting.

You may also find this resource list of mine interesting:
http://www.solr-start.com/info/update-request-processors/ (it is a bit out
of date).
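As an illustrative sketch of the clone-then-transform pattern Alexandre describes (field names from the question; the chain name and processor wiring here are assumptions, not a tested configuration), a solrconfig.xml chain using FieldLengthUpdateProcessorFactory might look roughly like:

```xml
<!-- Sketch only: clone fieldX, then replace the clone's value with its
     length; "derive-fields" and "fieldXlength" are made-up names.
     A length threshold test (length > 10) would still need a script or
     custom processor on top of this. -->
<updateRequestProcessorChain name="derive-fields">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">fieldX</str>
    <str name="dest">fieldXlength</str>
  </processor>
  <processor class="solr.FieldLengthUpdateProcessorFactory">
    <str name="fieldName">fieldXlength</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```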

Regards,
  Alex.

On Fri, Jun 21, 2019, 2:19 PM Muaawia Bin Arshad, 
wrote:

> Hi Everyone,
>
> I am fairly new to solr and I was wondering if there is a way in solr 7.7
> to populate fields based on some pre-processing of another field. So let’s
> say I have a field called fieldX defined in the schema, I want to define
> another field called isFieldXgood which is just a Boolean field that shows
> whether the length of fieldX is larger than 10.
>
>
> Thanks,
> Muaawia Bin Arshad
>


Derived Field Solr Schema

2019-06-21 Thread Muaawia Bin Arshad
Hi Everyone,

I am fairly new to solr and I was wondering if there is a way in solr 7.7 to 
populate fields based on some pre-processing of another field. So let’s say I 
have a field called fieldX defined in the schema, I want to define another 
field called isFieldXgood which is just a Boolean field that shows whether the 
length of fieldX is larger than 10.


Thanks,
Muaawia Bin Arshad


Schema API Version 2 - 7.6.0

2019-05-22 Thread Joe Obernberger

Hi - according to the documentation here:

https://lucene.apache.org/solr/guide/7_6/schema-api.html

The V2 API is located at api/cores/collection/schema
However the documentation here:
https://lucene.apache.org/solr/guide/7_6/v2-api.html

has it at api/c/collection/schema

I believe the latter is correct - true?  Thank you!
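A quick way to check is to probe both paths directly; the collection name and port below are assumptions:

```shell
# The v2 path is /api/c/<collection>/schema; the v1 equivalent is shown
# for comparison. Both should return the same schema if v2 is wired up.
curl "http://localhost:8983/api/c/gettingstarted/schema"
curl "http://localhost:8983/solr/gettingstarted/schema"
```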

-Joe Obernberger




How to define nested document schema

2019-05-20 Thread derrick cui
Hi, I have a nested document. How should I define its schema?
How do I use addChildDocument in solr-solrj?
Thanks
Derrick

Sent from Yahoo Mail for iPhone


Re: Schema configuration field defaults

2019-02-25 Thread Erick Erickson
Sure. In both cases define a fieldType with those attributes set however you 
want. Any field that is defined with that fieldType will have the defaults you 
specify unless overridden on the field definition itself.

Best,
Erick

> On Feb 25, 2019, at 9:08 AM, Dionte Smith  wrote:
> 
> Hi,
> 
> I have two questions about the field default values for multivalued and 
> indexed.
> 
> 
>  1.  Is it possible to make new fields have the indexed attribute set to 
> false by default for a schema? I understand this wouldn't normally be the 
> case, but we have a use case where it would be preferable as many fields may 
> be dynamically added via JSON.
>  2.  Is it possible to do the same for the multivalued attribute? If so, what 
> would happen if a field was dynamically added via JSON and it contained 
> an array? Would Solr be able to determine that the field should instead be 
> created with the multivalued attribute set to true?
> 
> Kind Regards,
> 
> Dionté Smith
> Software Developer
> dionte.sm...@gm.com<mailto:dionte.sm...@gm.com>
> 
> 
> 



Schema configuration field defaults

2019-02-25 Thread Dionte Smith
Hi,

I have two questions about the field default values for multivalued and indexed.


  1.  Is it possible to make new fields have the indexed attribute set to false 
by default for a schema? I understand this wouldn't normally be the case, but 
we have a use case where it would be preferable as many fields may be 
dynamically added via JSON.
  2.  Is it possible to do the same for the multivalued attribute? If so, what 
would happen if a field was dynamically added via JSON and it contained an 
array? Would Solr be able to determine that the field should instead be created 
with the multivalued attribute set to true?

Kind Regards,

Dionté Smith
Software Developer
dionte.sm...@gm.com<mailto:dionte.sm...@gm.com>



Nothing in this message is intended to constitute an electronic signature 
unless a specific statement to the contrary is included in this message.

Confidentiality Note: This message is intended only for the person or entity to 
which it is addressed. It may contain confidential and/or privileged material. 
Any review, transmission, dissemination or other use, or taking of any action 
in reliance upon this message by persons or entities other than the intended 
recipient is prohibited and may be unlawful. If you received this message in 
error, please contact the sender and delete it from your computer.


Re: _version_ field missing in schema?

2019-01-24 Thread Aleksandar Dimitrov
> Finally, since you are trying to really tweak the schema and general
> configuration right from the start, you may find some of my presentations
> useful, as they show the minimal configuration. Not perfect for your needs,
> as I do skip _version, but as an additional data point. The recent one is:
> https://www.slideshare.net/arafalov/rapid-solr-schema-development-phone-directory
> and the Git repo is at:
> https://github.com/arafalov/solr-presentation-2018-may . This one may be
> useful as well:
> https://www.slideshare.net/arafalov/from-content-to-search-speeddating-apache-solr-apachecon-2018-116330553

Thanks for the pointers. I've finally managed to get my schema to work ☺


Cheers,
Aleks



Regards,
   Alex.

On Wed, Jan 23, 2019, 5:50 AM Aleksandar Dimitrov <
a.dimit...@seidemann-web.com wrote:


Hi Alex,

thanks for your answer. I took the lines directly from the
managed-schema, deleted the managed-schema, and pasted those 
lines

into
my schema.xml.

If I have other errors in the schema.xml (such as a missing 
field

type),
solr complains about those until I fix them. So I would guess 
that

the
schema is at least *read*, but unsure if it is in fact used. 
I've

not
used solr before.

I cannot use the admin UI, at least not while the core with the
faulty
schema is used.

I wanted to use schema.xml because it allows for version 
control,

and
because it's easier for me to just use xml to define my schema. 
Is

there
a preferred approach? I don't (want to) use solr cloud, as for 
our

use
case a single instance of solr is more than enough.

Thanks for your help,
Aleks

Alexandre Rafalovitch  writes:

> What do you mean schema.xml from managed-schema? schema.xml 
> is

> old
> non-managed approach. If you have both, schema.xml will be
> ignored.
>
> I suspect you are not running with the schema you think you 
> do.

> You can
> check that with API or in Admin UI if you get that far.
>
> Regards,
> Alex
>
> On Tue, Jan 22, 2019, 11:39 AM Aleksandar Dimitrov <
> a.dimit...@seidemann-web.com wrote:
>
>> Hi,
>>
>> I'm using solr 7.5, in my schema.xml I have this, which I 
>> took

>> from the
>> managed-schema:
>>
>>   
>>   
>>   >   stored="false" />
>>   >   docValues="true" />
>>
>> However, on startup, solr complains:
>>
>>  Caused by: org.apache.solr.common.SolrException: _version_
>>  field
>>  must exist in schema and be searchable (indexed or 
>>  docValues)

>>  and
>>  retrievable(stored or docValues) and not multiValued
>>  (_version_
>>  does not exist)
>>   at
>>
>>
org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69)
>>
>>   ~[solr-core-7.5.0.jar:7.5.0
>>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>>   2018-09-18 13:07:55]
>>   at
>>   org.apache.solr.update.VersionInfo.(VersionInfo.java:95)
>>   ~[solr-core-7.5.0.jar:7.5.0
>>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>>   2018-09-18 13:07:55]
>>   at
>>   org.apache.solr.update.UpdateLog.init(UpdateLog.java:404)
>>   ~[solr-core-7.5.0.jar:7.5.0
>>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>>   2018-09-18 13:07:55]
>>   at
>>
 org.apache.solr.update.UpdateHandler.(UpdateHandler.java:161)
>>   ~[solr-core-7.5.0.jar:7.5.0
>>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>>   2018-09-18 13:07:55]
>>   at
>>
 org.apache.solr.update.UpdateHandler.(UpdateHandler.java:116)
>>   ~[solr-core-7.5.0.jar:7.5.0
>>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>>   2018-09-18 13:07:55]
>>   at
>>
>>
org.apache.solr.update.DirectUpdateHandler2.(DirectUpdateHandler2.java:119)
>>
>>   ~[solr-core-7.5.0.jar:7.5.0
>>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>>   2018-09-18 13:07:55]
>>   at
>>
>> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>   Method) ~[?:?]
>>   at
>>
>>
jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>
>>   ~[?:?]
>>   at
>>
>>
jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>
>>   ~[?:?]
>>   at
>>   java.lang.reflect.Constructor.newInstance(Constructor.java:488)
>>   ~[?:?]
>>   at
>>   org.apache.solr.core.SolrCore.createInstance(SolrCore.java:799)
>>   ~[solr-core-7.5.0.jar:7.5.0
>>   b5bf70b7e32d7ddd9742cc821

Re: _version_ field missing in schema?

2019-01-24 Thread Aleksandar Dimitrov

Shawn Heisey  writes:

> On 1/23/2019 3:49 AM, Aleksandar Dimitrov wrote:
>> Hi Alex,
>>
>> thanks for your answer. I took the lines directly from the
>> managed-schema, deleted the managed-schema, and pasted those lines into
>> my schema.xml.
>
> Unless you have changed the solrconfig.xml to refer to the classic
> schema, the file named schema.xml is not used.

Yup, that was the mistake. I had to use

  <schemaFactory class="ClassicIndexSchemaFactory"/>

in my solrconfig, and then it worked. I think the classic schema factory
should be enough for our use case.

Thanks!
Aleks

> With the standard schema factory, on core startup, if schema.xml is
> found, it is copied to managed-schema and then renamed to a backup
> filename. This would also happen on reload, I believe.
>
> Recommendation: unless you're using the classic schema, never use the
> schema.xml file. Only work with managed-schema.
>
> Thanks,
> Shawn




Re: _version_ field missing in schema?

2019-01-23 Thread Shawn Heisey

On 1/23/2019 3:49 AM, Aleksandar Dimitrov wrote:

Hi Alex,

thanks for your answer. I took the lines directly from the
managed-schema, deleted the managed-schema, and pasted those lines into
my schema.xml.


Unless you have changed the solrconfig.xml to refer to the classic 
schema, the file named schema.xml is not used.


With the standard schema factory, on core startup, if schema.xml is 
found, it is copied to managed-schema and then renamed to a backup 
filename.  This would also happen on reload, I believe.


Recommendation: unless you're using the classic schema, never use the 
schema.xml file.  Only work with managed-schema.
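A minimal sketch of the two settings this thread converges on (the plong field type is assumed to be defined elsewhere in the schema):

```xml
<!-- solrconfig.xml: opt in to the classic, hand-edited schema.xml -->
<schemaFactory class="ClassicIndexSchemaFactory"/>

<!-- schema.xml: a _version_ field that passes the startup check
     (retrievable via docValues, not multiValued) -->
<field name="_version_" type="plong" indexed="false" stored="false"
       docValues="true"/>
```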


Thanks,
Shawn



Re: _version_ field missing in schema?

2019-01-23 Thread Alexandre Rafalovitch
If you do not use API or Admin to change schema, it will not get
automatically rewritten. So you can just stay with managed-shema file and
version that. You can even disable write changes in solrconfig.xml:
http://lucene.apache.org/solr/guide/7_6/schema-factory-definition-in-solrconfig.html#solr-uses-managed-schema-by-default
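A sketch of the solrconfig.xml setting Alexandre mentions: keep the managed schema, but refuse modifications through the API so the file stays version-control friendly.

```xml
<!-- Managed schema, locked against API edits; hand-edit the
     managed-schema file and reload the core instead. -->
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">false</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
```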


Also, the file format of managed-schema is XML even if it does not have the
appropriate extension.

Finally, since you are trying to really tweak the schema and general
configuration right from the start, you may find some of my presentations
useful, as they show the minimal configuration. Not perfect for your needs,
as I do skip _version, but as an additional data point. The recent one is:
https://www.slideshare.net/arafalov/rapid-solr-schema-development-phone-directory
and the Git repo is at:
https://github.com/arafalov/solr-presentation-2018-may . This one may be
useful as well:
https://www.slideshare.net/arafalov/from-content-to-search-speeddating-apache-solr-apachecon-2018-116330553

Regards,
   Alex.

On Wed, Jan 23, 2019, 5:50 AM Aleksandar Dimitrov <
a.dimit...@seidemann-web.com wrote:

> Hi Alex,
>
> thanks for your answer. I took the lines directly from the
> managed-schema, deleted the managed-schema, and pasted those lines
> into
> my schema.xml.
>
> If I have other errors in the schema.xml (such as a missing field
> type),
> solr complains about those until I fix them. So I would guess that
> the
> schema is at least *read*, but unsure if it is in fact used. I've
> not
> used solr before.
>
> I cannot use the admin UI, at least not while the core with the
> faulty
> schema is used.
>
> I wanted to use schema.xml because it allows for version control,
> and
> because it's easier for me to just use xml to define my schema. Is
> there
> a preferred approach? I don't (want to) use solr cloud, as for our
> use
> case a single instance of solr is more than enough.
>
> Thanks for your help,
> Aleks
>
> Alexandre Rafalovitch  writes:
>
> > What do you mean schema.xml from managed-schema? schema.xml is
> > old
> > non-managed approach. If you have both, schema.xml will be
> > ignored.
> >
> > I suspect you are not running with the schema you think you do.
> > You can
> > check that with API or in Admin UI if you get that far.
> >
> > Regards,
> > Alex
> >
> > On Tue, Jan 22, 2019, 11:39 AM Aleksandar Dimitrov <
> > a.dimit...@seidemann-web.com wrote:
> >
> >> Hi,
> >>
> >> I'm using solr 7.5, in my schema.xml I have this, which I took
> >> from the
> >> managed-schema:
> >>
> >>   
> >>   
> >>>>   stored="false" />
> >>>>   docValues="true" />
> >>
> >> However, on startup, solr complains:
> >>
> >>  Caused by: org.apache.solr.common.SolrException: _version_
> >>  field
> >>  must exist in schema and be searchable (indexed or docValues)
> >>  and
> >>  retrievable(stored or docValues) and not multiValued
> >>  (_version_
> >>  does not exist)
> >>   at
> >>
> >>
> org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69)
> >>
> >>   ~[solr-core-7.5.0.jar:7.5.0
> >>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
> >>   2018-09-18 13:07:55]
> >>   at
> >>   org.apache.solr.update.VersionInfo.(VersionInfo.java:95)
> >>   ~[solr-core-7.5.0.jar:7.5.0
> >>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
> >>   2018-09-18 13:07:55]
> >>   at
> >>   org.apache.solr.update.UpdateLog.init(UpdateLog.java:404)
> >>   ~[solr-core-7.5.0.jar:7.5.0
> >>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
> >>   2018-09-18 13:07:55]
> >>   at
> >>
>  org.apache.solr.update.UpdateHandler.(UpdateHandler.java:161)
> >>   ~[solr-core-7.5.0.jar:7.5.0
> >>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
> >>   2018-09-18 13:07:55]
> >>   at
> >>
>  org.apache.solr.update.UpdateHandler.(UpdateHandler.java:116)
> >>   ~[solr-core-7.5.0.jar:7.5.0
> >>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
> >>   2018-09-18 13:07:55]
> >>   at
> >>
> >>
> org.apache.solr.update.DirectUpdateHandler2.(DirectUpdateHandler2.java:119)
> >>
> >>   ~[solr-core-7.5.0.jar:7.5.0
> >>   b5bf70b7e32d7ddd9742cc821d471c

Re: _version_ field missing in schema?

2019-01-23 Thread Aleksandar Dimitrov

Hi Alex,

thanks for your answer. I took the lines directly from the
managed-schema, deleted the managed-schema, and pasted those lines into
my schema.xml.

If I have other errors in the schema.xml (such as a missing field type),
solr complains about those until I fix them. So I would guess that the
schema is at least *read*, but unsure if it is in fact used. I've not
used solr before.

I cannot use the admin UI, at least not while the core with the faulty
schema is used.

I wanted to use schema.xml because it allows for version control, and
because it's easier for me to just use xml to define my schema. Is there
a preferred approach? I don't (want to) use solr cloud, as for our use
case a single instance of solr is more than enough.

Thanks for your help,
Aleks

Alexandre Rafalovitch  writes:

What do you mean schema.xml from managed-schema? schema.xml is 
old
non-managed approach. If you have both, schema.xml will be 
ignored.


I suspect you are not running with the schema you think you do. 
You can

check that with API or in Admin UI if you get that far.

Regards,
Alex

On Tue, Jan 22, 2019, 11:39 AM Aleksandar Dimitrov <
a.dimit...@seidemann-web.com wrote:


Hi,

I'm using solr 7.5, in my schema.xml I have this, which I took
from the
managed-schema:

  
  
  
  

However, on startup, solr complains:

 Caused by: org.apache.solr.common.SolrException: _version_ 
 field
 must exist in schema and be searchable (indexed or docValues) 
 and
 retrievable(stored or docValues) and not multiValued 
 (_version_

 does not exist)
  at

org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69)

  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at
  org.apache.solr.update.VersionInfo.(VersionInfo.java:95)
  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at 
  org.apache.solr.update.UpdateLog.init(UpdateLog.java:404)

  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at
  org.apache.solr.update.UpdateHandler.(UpdateHandler.java:161)
  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at
  org.apache.solr.update.UpdateHandler.(UpdateHandler.java:116)
  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at

org.apache.solr.update.DirectUpdateHandler2.(DirectUpdateHandler2.java:119)

  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at

jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
  Method) ~[?:?]
  at

jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

  ~[?:?]
  at

jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

  ~[?:?]
  at
  java.lang.reflect.Constructor.newInstance(Constructor.java:488)
  ~[?:?]
  at
  org.apache.solr.core.SolrCore.createInstance(SolrCore.java:799)
  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at
  org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:861)
  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at
  org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1114)
  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at 
  org.apache.solr.core.SolrCore.(SolrCore.java:984)

  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at 
  org.apache.solr.core.SolrCore.(SolrCore.java:869)

  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  at

org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138)

  ~[solr-core-7.5.0.jar:7.5.0
  b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
  2018-09-18 13:07:55]
  ... 7 more

Anyone know what I'm doing wrong?
I've tried having the _version_ field be string, and indexed 
and

stored,
but that didn't help.

Thanks!

Aleks







Re: _version_ field missing in schema?

2019-01-22 Thread Alexandre Rafalovitch
What do you mean schema.xml from managed-schema? schema.xml is old
non-managed approach. If you have both, schema.xml will be ignored.

I suspect you are not running with the schema you think you do. You can
check that with API or in Admin UI if you get that far.

Regards,
Alex

On Tue, Jan 22, 2019, 11:39 AM Aleksandar Dimitrov <
a.dimit...@seidemann-web.com wrote:

> Hi,
>
> I'm using solr 7.5, in my schema.xml I have this, which I took
> from the
> managed-schema:
>
>   
>   
>  stored="false" />
>  docValues="true" />
>
> However, on startup, solr complains:
>
>  Caused by: org.apache.solr.common.SolrException: _version_ field
>  must exist in schema and be searchable (indexed or docValues) and
>  retrievable(stored or docValues) and not multiValued (_version_
>  does not exist)
>   at
>
> org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69)
>
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>   org.apache.solr.update.VersionInfo.(VersionInfo.java:95)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at org.apache.solr.update.UpdateLog.init(UpdateLog.java:404)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>   org.apache.solr.update.UpdateHandler.(UpdateHandler.java:161)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>   org.apache.solr.update.UpdateHandler.(UpdateHandler.java:116)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>
> org.apache.solr.update.DirectUpdateHandler2.(DirectUpdateHandler2.java:119)
>
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>   Method) ~[?:?]
>   at
>
> jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>
>   ~[?:?]
>   at
>
> jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>
>   ~[?:?]
>   at
>   java.lang.reflect.Constructor.newInstance(Constructor.java:488)
>   ~[?:?]
>   at
>   org.apache.solr.core.SolrCore.createInstance(SolrCore.java:799)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>   org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:861)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>   org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1114)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at org.apache.solr.core.SolrCore.(SolrCore.java:984)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at org.apache.solr.core.SolrCore.(SolrCore.java:869)
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   at
>
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138)
>
>   ~[solr-core-7.5.0.jar:7.5.0
>   b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi -
>   2018-09-18 13:07:55]
>   ... 7 more
>
> Anyone know what I'm doing wrong?
> I've tried having the _version_ field be string, and indexed and
> stored,
> but that didn't help.
>
> Thanks!
>
> Aleks
>
>


_version_ field missing in schema?

2019-01-22 Thread Aleksandar Dimitrov

Hi,

I'm using solr 7.5, in my schema.xml I have this, which I took 
from the

managed-schema:

 
 
  stored="false" />
  docValues="true" />


However, on startup, solr complains:

Caused by: org.apache.solr.common.SolrException: _version_ field must exist in schema and be searchable (indexed or docValues) and retrievable (stored or docValues) and not multiValued (_version_ does not exist)
  at org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:69) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:95) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateLog.init(UpdateLog.java:404) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:161) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:116) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:119) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
  at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
  at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
  at java.lang.reflect.Constructor.newInstance(Constructor.java:488) ~[?:?]
  at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:799) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:861) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1114) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:984) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:869) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]
  at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1138) ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:07:55]

  ... 7 more

Anyone know what I'm doing wrong?
I've tried having the _version_ field be a string, and indexed and stored,
but that didn't help.

Thanks!

Aleks
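
For reference, a _version_ definition that satisfies the check in the exception
(searchable via indexed or docValues, retrievable via stored or docValues, not
multiValued) looks like the one in the stock Solr 7.x managed-schema. This is a
sketch to compare against, not the poster's actual config:

```xml
<!-- From the stock Solr 7.x managed-schema (verify against your own config).
     plong has docValues enabled by default; it is spelled out here. -->
<field name="_version_" type="plong" indexed="false" stored="false" docValues="true"/>
```

If the field is missing, misnamed, or multiValued, core initialization fails
with exactly the SolrException shown above.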



Re: Solr schema to parse and search source code

2018-12-16 Thread Alexandre Rafalovitch
https://microsoft.github.io/language-server-protocol/ is probably your best bet.

But also, perhaps you can just use or start from SourceGraph?
https://sourcegraph.com/start

Regards,
   Alex.
On Sat, 15 Dec 2018 at 20:26, Steven White  wrote:
>
> Hi everyone,
>
> I'm in need of providing a search engine to search source code such as
> Java, C++, PERL, Visual Basic, etc.
>
> My questions are:
> 1) Are there existing open-source source code parsers I can use that will
> scan source code and return it as elements such as comments, function
> names, class names, variables, etc., and
> 2) What should my Solr schema configuration be like?
>
> If you have done this and can share your solution, that would be great.
>
> I will be using the latest version of Solr for this project.
>
> Thanks
>
> Steve


Solr schema to parse and search source code

2018-12-15 Thread Steven White
Hi everyone,

I'm in need of providing a search engine to search source code such as
Java, C++, PERL, Visual Basic, etc.

My questions are:
1) Are there existing open-source source code parsers I can use that will
scan source code and return it as elements such as comments, function
names, class names, variables, etc., and
2) What should my Solr schema configuration be like?

If you have done this and can share your solution, that would be great.

I will be using the latest version of Solr for this project.

Thanks

Steve


Using streaming expressions with a pre-existing schema (aka migrating a standalone instance to SolrCloud)

2018-12-10 Thread Guillaume Rossolini
Hi there,

This is about undocumented restrictions on using streaming expressions
(in the sense that I haven't found the right documentation).

** Setup
I just followed the documentation to start SolrCloud on my local machine,
and I made it so it would replace the previous standalone server I have
been using for the last several years. Had some trouble understanding how
to use my current solrconfig and schema, but I got it working and I could
index all the documents without any trouble.
After indexing both collections, the Admin > Cloud > Nodes reports the
expected number of documents in each.

_Before I go into too many details, it may be useful to include a warning
in the chapter called "the well-configured Solr instance" so that new users
are aware that streaming expressions won't work in standalone mode. I know
it is explained elsewhere, but that page would be a good fit for such a
warning: SolrCloud is not only about scaling, as the page would suggest._

** The issue
Now, about my question:
I went the SolrCloud route because I wanted to try streaming expressions
and eventually the classifiers.
To that end, I loaded the web admin in Firefox, went to the right
collection, then Stream, and pasted the adapted examples.

I have been trying the query examples from this page:
https://lucene.apache.org/solr/guide/7_5/stream-source-reference.html

A couple things worth mentioning about the admin UI:

   - the URL that is presented doesn't work:
  - 
http://127.0.0.1:8983products/stream?explain=true=search(products,q=*:*,fl="ref,createdAt",
  sort="createdAt desc",qt="catalogFr")
  - it is missing part of the path, in my case it should start with
  http://127.0.0.1:8983/solr/products/stream
   - that box's UI could be improved by making the URL clickable or at
   least selectable
   - the output misses a lot of metadata, as I can't see much of the
   "explanation" that is available when requesting the actual URL
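
A small sketch of the path fix described above: the stream handler lives under
/solr/<collection>/stream, which the admin UI's generated URL omits. The host,
collection, and qt names below are taken from the example, not from any real
deployment.

```python
from urllib.parse import urlencode

def stream_url(host, collection, expr, **params):
    # The /stream handler is mounted under /solr/<collection>/stream;
    # the admin UI's URL above is missing the "/solr/<collection>" part.
    query = urlencode({"expr": expr, **params})
    return f"http://{host}/solr/{collection}/stream?{query}"

url = stream_url(
    "127.0.0.1:8983", "products",
    'search(products,q="*:*",fl="ref,createdAt",sort="createdAt desc",qt="catalogFr")',
)
print(url)
```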


Most notably, I was only able to get a response from search(),
significantTerms() and stats(). The other sources failed with an exception.

Here is a log from features() which ends with a
java.util.concurrent.ExecutionException:
https://paste.linux.community/view/6676fc3c

It seems that the collection is looking for the 192.168.56.1 IP and I have
no idea why it would switch to another network (that Solr may not be bound
to) or how to change this.
At setup, I only used "localhost", so I would expect all URLs to reflect
this, not even 127.0.0.1.

Here is another, this one about timeseries() and ends with a
java.lang.NullPointerException:
https://paste.linux.community/view/f62bebf6


Are there configuration considerations I haven't accounted for, like maybe
field attributes that should be present and that are documented elsewhere?

Best regards,


Need a way to get notification when a new field is added into managed schema

2018-12-06 Thread Michael Hu
Environment: Solr 7.4


I use a mutable managed schema. I need a way to get notified when a new
field is added to the schema.


First, I try to extend "org.apache.solr.schema.ManagedIndexSchema". 
Unfortunately, it is defined as a final class, so I am not able to extend it.


Then, I try to implement my own IndexSchemaFactory and IndexSchema by extending 
"org.apache.solr.schema.IndexSchema" and wrapping an instance of 
ManagedIndexSchema, and delegating all methods to the wrapped instance. 
However, when I test the implementation, I find out that 
"com.vmware.ops.data.solr.processor.AddSchemaFieldsUpdateProcessor" casts 
IndexSchema to ManagedIndexSchema at 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/core/src/java/org/apache/solr/update/processor/AddSchemaFieldsUpdateProcessorFactory.java#L456
. So it does not work with the update processor. (NOTE: the cast happens in
Solr 7.5 as well.)


Can you suggest a way so that I can get notified when a new field is added
to the schema?


Thank you for your help!


--Michael
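
Since Solr doesn't expose a listener hook for schema changes outside the
update-processor internals discussed above, one workaround (a sketch, not an
official API) is to poll the Schema API from a client and diff the field
lists. The endpoint and field names below are illustrative.

```python
def added_fields(before, after):
    """Names of fields present in 'after' but not in 'before'.
    Each argument is the "fields" list returned by
    GET /solr/<core>/schema/fields (polled periodically)."""
    seen = {f["name"] for f in before}
    return [f["name"] for f in after if f["name"] not in seen]

# Simulated snapshots; a real client would fetch these over HTTP.
old = [{"name": "id"}, {"name": "_version_"}]
new = [{"name": "id"}, {"name": "_version_"}, {"name": "title_txt"}]
print(added_fields(old, new))  # -> ['title_txt']
```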







Re: Is reload necessary for updates to files referenced in schema, like synonyms, protwords, etc?

2018-11-28 Thread Shawn Heisey

On 11/28/2018 6:37 AM, Vincenzo D'Amore wrote:

Very likely I'm late to this party :) not sure with solr standalone, but
with solrcloud (7.3.1) you have to reload the core every time synonyms
referenced by a schema are changed.


I have a 7.5.0 download on my workstation, so I fired that up, created a 
core, and tried it out.  I did learn that a reload is required when 
changing files referenced by analysis components in the schema.  That's 
what I had thought was probably the case, now I know for sure.


Thanks,
Shawn
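
Since a reload is required after changing referenced files, the follow-up step
is the CoreAdmin RELOAD action (or the Collections API RELOAD under
SolrCloud). A sketch of building that request URL; the host and core name are
examples:

```python
from urllib.parse import urlencode

def reload_url(base, core):
    # CoreAdmin RELOAD for standalone Solr; under SolrCloud the equivalent
    # is /solr/admin/collections?action=RELOAD&name=<collection>.
    return f"{base}/solr/admin/cores?{urlencode({'action': 'RELOAD', 'core': core})}"

print(reload_url("http://localhost:8983", "mycore"))
```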



Re: Is reload necessary for updates to files referenced in schema, like synonyms, protwords, etc?

2018-11-28 Thread Vincenzo D'Amore
Very likely I'm late to this party :) not sure with solr standalone, but
with solrcloud (7.3.1) you have to reload the core every time synonyms
referenced by a schema are changed.

On Mon, Nov 26, 2018 at 8:51 PM Walter Underwood 
wrote:

> Should be easy to check with the analysis UI. Add a synonym and see if it
> is used.
>
> I seem to remember some work on reloading synonyms on the fly without a
> core reload. These seem related...
>
> https://issues.apache.org/jira/browse/SOLR-5200
> https://issues.apache.org/jira/browse/SOLR-5234
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Nov 26, 2018, at 11:43 AM, Shawn Heisey  wrote:
> >
> > I know that changes to the schema require a reload.  But do changes to
> files referenced by a schema also require a reload?  So if for instance I
> were to change the contents of a synonym file, would I need to reload the
> core before Solr would use the new file?  Synonyms in this case are at
> query time, but other files like protwords are used at index time.
> >
> > I *THINK* that a reload is required, but I can't be sure without
> checking the code, and it would probably take me more than a couple of
> hours to unravel the code enough to answer the question myself.
> >
> > It is not SolrCloud, so there's no ZK to worry about.
> >
> > Thanks,
> > Shawn
> >
>
>

-- 
Vincenzo D'Amore


Re: Is reload necessary for updates to files referenced in schema, like synonyms, protwords, etc?

2018-11-26 Thread Walter Underwood
Should be easy to check with the analysis UI. Add a synonym and see if it is 
used.

I seem to remember some work on reloading synonyms on the fly without a core 
reload. These seem related...

https://issues.apache.org/jira/browse/SOLR-5200
https://issues.apache.org/jira/browse/SOLR-5234

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Nov 26, 2018, at 11:43 AM, Shawn Heisey  wrote:
> 
> I know that changes to the schema require a reload.  But do changes to files 
> referenced by a schema also require a reload?  So if for instance I were to 
> change the contents of a synonym file, would I need to reload the core before 
> Solr would use the new file?  Synonyms in this case are at query time, but 
> other files like protwords are used at index time.
> 
> I *THINK* that a reload is required, but I can't be sure without checking the 
> code, and it would probably take me more than a couple of hours to unravel 
> the code enough to answer the question myself.
> 
> It is not SolrCloud, so there's no ZK to worry about.
> 
> Thanks,
> Shawn
> 



Is reload necessary for updates to files referenced in schema, like synonyms, protwords, etc?

2018-11-26 Thread Shawn Heisey
I know that changes to the schema require a reload.  But do changes to 
files referenced by a schema also require a reload?  So if for instance 
I were to change the contents of a synonym file, would I need to reload 
the core before Solr would use the new file?  Synonyms in this case are 
at query time, but other files like protwords are used at index time.


I *THINK* that a reload is required, but I can't be sure without 
checking the code, and it would probably take me more than a couple of 
hours to unravel the code enough to answer the question myself.


It is not SolrCloud, so there's no ZK to worry about.

Thanks,
Shawn



Re: Exporting results and schema design

2018-11-15 Thread Erick Erickson
NP, having something in the manual is A Good Thing, but it's very,
very easy to not find a paragraph in a 1,000+ page doc!

Oh, and I recommend downloading the PDF version of the Solr ref guide
for your version of solr for locally-searchable reference FWIW.

Best,
Erick
On Thu, Nov 15, 2018 at 1:26 AM Dwane Hall  wrote:
>
>
> Thanks Erick that's great advice as always it's very much appreciated. I've 
> never seen an example of that pattern used before 
> (stored=false,indexed=false,useDocValuesAsStored=true) on any of the 
> fantastic solr blogs I've read and I've read lot of them many times (all of 
> your excellent Lucidworks posts, Yonik's personal blog, Rafal's sematext 
> posts, Tote from the RDL, and most of the Lucene/Solr revolution youtube 
> clips  etc.).  I found the relevant section in the solr doco it's really a 
> little gem of a pattern I did not know docValues could provide that feature 
> and I've always believed I needed stored=true if I required the fields 
> returned.
>
> Many thanks once again for taking the time to respond,
>
> Dwane
>
> 
> From: Erick Erickson 
> Sent: Thursday, 15 November 2018 1:55 PM
> To: solr-user
> Subject: Re: Exporting results and schema design
>
> Well, docValues doesn't necessarily waste much index space if you
> don't store the field and useDocValuesAsStored. It also won't beat up
> your machine as badly if you fetch all your fields from DV fields. To
> fetch a stored field, you need to
>
> > seek to the stored data on disk
> > decompress a 16K block minimum
> > fetch the stored fields.
>
> So using docvalues rather than stored for "1000s" of rows will avoid that 
> cycle.
>
> You can use the cursorMark to page efficiently, your middleware would
> have to be in charge of that.
>
> Best,
> Erick
> On Wed, Nov 14, 2018 at 6:35 PM Dwane Hall  wrote:
> >
> > Good afternoon Solr community,
> >
> > I have a situation where I require the following solr features.
> >
> > 1.   Highlighting must be available for the matched search results
> >
> > 2.   After a user performs a regular solr search (/select, rows=10) I 
> > require a drill down which has the potential to export a large number of 
> > results (1000s +).
> >
> >
> > Here is how I’ve approached each of the two problems above
> >
> >
> > 1.Highlighting must be available for the matched search results
> >
> > My implementation is in pattern with the recommend approach.  A 
> > stored=false indexed=true copy field with the individual fields to 
> > highlight analysed, stored=true, indexed=false.
> >
> >  > multiValued="true"/>
> >
> >  > multiValued="true"/>
> >
> > 
> >
> > 
> >
> >  > multiValued="true"/>
> >
> >
> > 2.After a user performs a regular solr search (/select, rows=10) I require 
> > a drill down which has the potential to export a large number of results 
> > (1000s+ with no searching required over the fields)
> >
> >
> > From all the documentation the recommended approach for returning large 
> > result sets is using the /export request handler.  As none of my fields 
> qualify for using the /export handler (i.e. docValues=true), is my only 
> > option to have additional duplicated fields mapped as strings so they can 
> > be used in the export process?
> >
> > i.e. using my example above now managed-schema now becomes
> >
> >  > multiValued="true"/>
> >
> >  > multiValued="true"/>
> >
> > 
> >
> > 
> >
> >  > multiValued="true"/>
> >
> >
> > 
> >
> > 
> >
> >  > multiValued="true"/>
> >
> >  > multiValued="true"/>
> >
> >
> > If I did not require highlighting I could change the initial mapped fields 
> > (First_Names, Last_Names) from type=text_general to type=string and save 
> > the additional storage in the index but in my current situation I can’t see 
> > a way around having to duplicate all the fields required for the /export 
> > handler as strings. Is this how people typically handle this problem or am 
> > I completely off the mark with my design?
> >
> >
> > Any advice would be greatly appreciated,
> >
> >
> > Thanks
> >
> >
> > Dwane


Re: Exporting results and schema design

2018-11-15 Thread Dwane Hall

Thanks Erick, that's great advice as always; it's very much appreciated. I've 
never seen an example of that pattern 
(stored=false, indexed=false, useDocValuesAsStored=true) on any of the fantastic 
Solr blogs I've read, and I've read a lot of them many times (all of your 
excellent Lucidworks posts, Yonik's personal blog, Rafal's Sematext posts, Tote 
from the RDL, and most of the Lucene/Solr Revolution YouTube clips, etc.). I 
found the relevant section in the Solr docs; it's really a little gem of a 
pattern. I did not know docValues could provide that feature, and I've always 
believed I needed stored=true if I required the fields returned.

Many thanks once again for taking the time to respond,

Dwane


From: Erick Erickson 
Sent: Thursday, 15 November 2018 1:55 PM
To: solr-user
Subject: Re: Exporting results and schema design

Well, docValues doesn't necessarily waste much index space if you
don't store the field and useDocValuesAsStored. It also won't beat up
your machine as badly if you fetch all your fields from DV fields. To
fetch a stored field, you need to

> seek to the stored data on disk
> decompress a 16K block minimum
> fetch the stored fields.

So using docvalues rather than stored for "1000s" of rows will avoid that cycle.

You can use the cursorMark to page efficiently, your middleware would
have to be in charge of that.

Best,
Erick
On Wed, Nov 14, 2018 at 6:35 PM Dwane Hall  wrote:
>
> Good afternoon Solr community,
>
> I have a situation where I require the following solr features.
>
> 1.   Highlighting must be available for the matched search results
>
> 2.   After a user performs a regular solr search (/select, rows=10) I 
> require a drill down which has the potential to export a large number of 
> results (1000s +).
>
>
> Here is how I’ve approached each of the two problems above
>
>
> 1.Highlighting must be available for the matched search results
>
> My implementation is in pattern with the recommend approach.  A stored=false 
> indexed=true copy field with the individual fields to highlight analysed, 
> stored=true, indexed=false.
>
>  multiValued="true"/>
>
>  multiValued="true"/>
>
> 
>
> 
>
>  multiValued="true"/>
>
>
> 2.After a user performs a regular solr search (/select, rows=10) I require a 
> drill down which has the potential to export a large number of results 
> (1000s+ with no searching required over the fields)
>
>
> From all the documentation the recommended approach for returning large 
> result sets is using the /export request handler.  As none of my fields 
> qualify for using the /export handler (i.e. docValues=true), is my only 
> option to have additional duplicated fields mapped as strings so they can be 
> used in the export process?
>
> i.e. using my example above now managed-schema now becomes
>
>  multiValued="true"/>
>
>  multiValued="true"/>
>
> 
>
> 
>
>  multiValued="true"/>
>
>
> 
>
> 
>
>  multiValued="true"/>
>
>  multiValued="true"/>
>
>
> If I did not require highlighting I could change the initial mapped fields 
> (First_Names, Last_Names) from type=text_general to type=string and save the 
> additional storage in the index but in my current situation I can’t see a way 
> around having to duplicate all the fields required for the /export handler as 
> strings. Is this how people typically handle this problem or am I completely 
> off the mark with my design?
>
>
> Any advice would be greatly appreciated,
>
>
> Thanks
>
>
> Dwane


Re: Exporting results and schema design

2018-11-14 Thread Erick Erickson
Well, docValues doesn't necessarily waste much index space if you
don't store the field and useDocValuesAsStored. It also won't beat up
your machine as badly if you fetch all your fields from DV fields. To
fetch a stored field, you need to

> seek to the stored data on disk
> decompress a 16K block minimum
> fetch the stored fields.

So using docvalues rather than stored for "1000s" of rows will avoid that cycle.

You can use the cursorMark to page efficiently, your middleware would
have to be in charge of that.

Best,
Erick
On Wed, Nov 14, 2018 at 6:35 PM Dwane Hall  wrote:
>
> Good afternoon Solr community,
>
> I have a situation where I require the following solr features.
>
> 1.   Highlighting must be available for the matched search results
>
> 2.   After a user performs a regular solr search (/select, rows=10) I 
> require a drill down which has the potential to export a large number of 
> results (1000s +).
>
>
> Here is how I’ve approached each of the two problems above
>
>
> 1.Highlighting must be available for the matched search results
>
> My implementation is in pattern with the recommend approach.  A stored=false 
> indexed=true copy field with the individual fields to highlight analysed, 
> stored=true, indexed=false.
>
>  multiValued="true"/>
>
>  multiValued="true"/>
>
> 
>
> 
>
>  multiValued="true"/>
>
>
> 2.After a user performs a regular solr search (/select, rows=10) I require a 
> drill down which has the potential to export a large number of results 
> (1000s+ with no searching required over the fields)
>
>
> From all the documentation the recommended approach for returning large 
> result sets is using the /export request handler.  As none of my fields 
> qualify for using the /export handler (i.e. docValues=true), is my only 
> option to have additional duplicated fields mapped as strings so they can be 
> used in the export process?
>
> i.e. using my example above now managed-schema now becomes
>
>  multiValued="true"/>
>
>  multiValued="true"/>
>
> 
>
> 
>
>  multiValued="true"/>
>
>
> 
>
> 
>
>  multiValued="true"/>
>
>  multiValued="true"/>
>
>
> If I did not require highlighting I could change the initial mapped fields 
> (First_Names, Last_Names) from type=text_general to type=string and save the 
> additional storage in the index but in my current situation I can’t see a way 
> around having to duplicate all the fields required for the /export handler as 
> strings. Is this how people typically handle this problem or am I completely 
> off the mark with my design?
>
>
> Any advice would be greatly appreciated,
>
>
> Thanks
>
>
> Dwane
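
The cursorMark loop Erick mentions can be sketched like this. The HTTP call is
stubbed out; in real use 'search' would hit /select with a sort that includes
the uniqueKey, which cursorMark requires.

```python
def cursor_pages(search, rows=2):
    """Page through all results with Solr's cursorMark protocol: start at
    '*', pass the returned nextCursorMark into the next request, and stop
    when the cursor stops changing (the signal that paging is done)."""
    cursor = "*"
    while True:
        resp = search(cursor=cursor, rows=rows)
        yield from resp["docs"]
        if resp["nextCursorMark"] == cursor:  # no more pages
            break
        cursor = resp["nextCursorMark"]

# Stub for illustration: five docs served two at a time.
DOCS = [{"id": i} for i in range(5)]
def fake_search(cursor, rows):
    start = 0 if cursor == "*" else int(cursor)
    page = DOCS[start:start + rows]
    nxt = str(start + len(page)) if page else cursor
    return {"docs": page, "nextCursorMark": nxt}

print(list(cursor_pages(fake_search)))  # all five docs, in order
```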


Exporting results and schema design

2018-11-14 Thread Dwane Hall
Good afternoon Solr community,

I have a situation where I require the following solr features.

1.   Highlighting must be available for the matched search results

2.   After a user performs a regular solr search (/select, rows=10) I 
require a drill down which has the potential to export a large number of 
results (1000s +).


Here is how I’ve approached each of the two problems above


1. Highlighting must be available for the matched search results

My implementation follows the recommended approach: a stored=false,
indexed=true copy field, with the individual fields to highlight analysed,
stored=true, indexed=false.












2. After a user performs a regular Solr search (/select, rows=10) I require a
drill-down which has the potential to export a large number of results
(1000s+, with no searching required over the fields)


From all the documentation, the recommended approach for returning large
result sets is the /export request handler.  As none of my fields qualify for
the /export handler (i.e. docValues=true), is my only option to have
additional duplicated fields mapped as strings so they can be used in the
export process?

i.e. using my example above, managed-schema now becomes





















If I did not require highlighting I could change the initial mapped fields 
(First_Names, Last_Names) from type=text_general to type=string and save the 
additional storage in the index but in my current situation I can’t see a way 
around having to duplicate all the fields required for the /export handler as 
strings. Is this how people typically handle this problem or am I completely 
off the mark with my design?


Any advice would be greatly appreciated,


Thanks


Dwane
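
The field definitions in this message were stripped in archiving. Below is a
sketch of the pattern the prose describes; the names and attributes are
inferred from the text (First_Names, Last_Names, text_general vs. string), not
recovered from the original:

```xml
<!-- Sketch only; names and attributes inferred from the description above. -->
<!-- Highlighting: store the individual fields, search the copy field -->
<field name="First_Names" type="text_general" indexed="false" stored="true" multiValued="true"/>
<field name="Last_Names" type="text_general" indexed="false" stored="true" multiValued="true"/>
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="First_Names" dest="text"/>
<copyField source="Last_Names" dest="text"/>

<!-- /export: duplicated string fields carrying docValues -->
<field name="First_Names_str" type="string" indexed="false" stored="false"
       docValues="true" useDocValuesAsStored="true" multiValued="true"/>
<field name="Last_Names_str" type="string" indexed="false" stored="false"
       docValues="true" useDocValuesAsStored="true" multiValued="true"/>
```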


Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-29 Thread Chris Hostetter

:  Hi Erick, thanks for your reply. No, we aren't using schemaless
:  mode. schemaFactory is not explicitly declared in
:  our solrconfig.xml. Also we have only one replica and one shard.

ManagedIndexSchemaFactory has been the default since 6.0 unless an 
explicit schemaFactory is defined...

https://lucene.apache.org/solr/guide/7_5/major-changes-from-solr-5-to-solr-6.html

https://lucene.apache.org/solr/guide/7_5/schema-factory-definition-in-solrconfig.html


-Hoss 

http://www.lucidworks.com/

Re: Casting from schemaless to classic schema

2018-10-24 Thread Zahra Aminolroaya
Thanks Alexandre and Shawn. 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Casting from schemaless to classic schema

2018-10-17 Thread Alexandre Rafalovitch
It is not clear what you are trying to achieve.
1) Disable schemaless mode to be more explicit about new fields? New
Solr has a switch for that. Older one, you can just disable the chain
invocation in solrconfig.xml
2) Stop schema being editing at all, whether schemaless or via API?
There is a flag in solrconfig.xml to make it immutable.
https://lucene.apache.org/solr/guide/7_5/schema-factory-definition-in-solrconfig.html
- this also has instructions that seem even more relevant to your
specific question.

Regards,
   Alex.

On Wed, 17 Oct 2018 at 07:36, Zahra Aminolroaya  wrote:
>
> I want to change my Solr from schemaless to classic schema.
>
> I read
> https://stackoverflow.com/questions/29819854/how-does-solrs-schema-less-feature-work-how-to-revert-it-to-classic-schema
> <https://stackoverflow.com/questions/29819854/how-does-solrs-schema-less-feature-work-how-to-revert-it-to-classic-schema>
> .
>
>
> What would be the challenges that I will confront with as my schemaless
> collection has some indexed documents in it?
>
> In solr 7, is commenting add-unknown-fields-to-the-schema the only
> difference between schemaless and classic schema?
>
>
> Best,
>
> Zahra
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Casting from schemaless to classic schema

2018-10-17 Thread Shawn Heisey

On 10/17/2018 5:36 AM, Zahra Aminolroaya wrote:

What would be the challenges that I will confront with as my schemaless
collection has some indexed documents in it?


If the schema itself (the file named managed-schema that you might be 
renaming to schema.xml) hasn't changed, then the existing index will 
continue to work just fine, regardless of whether you're using the 
classic schema factory or not.



In solr 7, is commenting add-unknown-fields-to-the-schema the only
difference between schemaless and classic schema?


The add-unknown-fields update chain is not part of the schema.  It's in 
solrconfig.xml.  But it does *require* the managed schema in order to 
work.  The classic schema is immutable; it can only be changed externally.


My personal opinion -- let Solr continue to use its default of the 
managed schema, but remove and disable the add-unknown-fields update 
chain.  It is the update chain that creates schemaless mode.


Thanks,
Shawn
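
The two options Shawn contrasts map to solrconfig.xml roughly like this. This
is a sketch: the chain name matches the default configset, but check your own
config before editing.

```xml
<!-- Option A (recommended above): keep the managed schema, but remove or
     comment out the update chain that creates schemaless behaviour -->
<!--
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  ...
</updateRequestProcessorChain>
-->

<!-- Option B: switch to the classic, hand-edited schema.xml -->
<schemaFactory class="ClassicIndexSchemaFactory"/>
```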



  1   2   3   4   5   6   7   8   9   10   >