RE: Solr 8.5.2: DataImportHandler failed to instantiate org.apache.solr.request.SolrRequestHandler

2020-06-26 Thread Peter van de Kerk
Ok, thanks. I deduped the jar files and now have only
solr-dataimporthandler-8.5.2.jar in the server\lib folder.
The errors are now gone on the admin page.
But it also states "No cores available", and when I try to create a new core 
`mytest` (whose files are already on my disk) I get the error:

Error CREATEing SolrCore 'mytest': Unable to create core [mytest] Caused by: 
org.apache.solr.util.plugin.SolrCoreAware

name:mytest
instanceDir:mytest
dataDir:data
config:solrconfig.xml
schema:schema.xml
instanceDir and dataDir need to exist before you can create the core

Even though my folders are there:

server\solr\mytest\
   conf\
      data-config.xml
      managed-schema
      solrconfig.xml
   data\

I don't see any further detailed logging (no "i" icon either)

Thanks!
Pete

From: Shawn Heisey
Sent: Friday, June 26, 2020 19:57
To: solr-user@lucene.apache.org
Subject: Re: Solr 8.5.2: DataImportHandler failed to instantiate 
org.apache.solr.request.SolrRequestHandler

On 6/24/2020 1:59 PM, Peter van de Kerk wrote:
> So I copied files from C:\solr-8.5.2\dist to C:\solr-8.5.2\server\lib
>
> But then I get error
>
>> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: 
>> Error Instantiating requestHandler, 
>> org.apache.solr.handler.dataimport.DataImportHandler failed to instantiate 
>> org.apache.solr.request.SolrRequestHandler

All of the errors will be MUCH longer than what you have included here,
and we need that detail to diagnose anything.  If you are seeing these
in the admin UI "logging" tab, you can click on the little "i" icon to
expand them.  But be aware that you'll have to read/copy the expanded
data very quickly - the admin UI will quickly close the expansion.  It's
much better to go to the actual logfile than use the admin UI.

There are better locations than server/lib for jars, but I don't think
that's causing the problem.  You should definitely NOT copy ALL of the
jars in the dist directory -- this places an additional copy of the main
Solr jars on the classpath, and having the same jar accessible from two
places is a VERY bad thing for Java software.  It causes some really
weird problems, and I can see this issue being a result of that.  For
most DIH uses, you only need the "solr-dataimporthandler-X.Y.Z.jar"
file.  For some DIH use cases (but not most of them) you might also need
the extras jar.

Thanks,
Shawn
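[Editor's note] A sketch of the jar-loading advice above: instead of copying jars into server\lib, solrconfig.xml can pull in just the DIH jar with a lib directive. The path below is the pattern used in Solr's sample configs; adjust it to your layout:

```xml
<!-- in solrconfig.xml: load only the DataImportHandler jar from dist/,
     never the whole directory (which would duplicate the core Solr jars) -->
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />
```

The regex keeps the match narrow, so a second copy of solr-core or solr-solrj never lands on the classpath.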



Re: Developing update processor/Query Parser

2020-06-26 Thread Vincenzo D'Amore
Sharing a static object between a URP and a QParser is easy. But when the
core is reloaded, this shared static object is rewritten from scratch.
An old QParser that still references that object could have serious problems:
inconsistencies, concurrent modification exceptions, etc.
On the other hand, trying to solve this problem using synchronization,
mutexes or semaphores will lead to a poorly performing solution.
Another real problem I have is that it is not clear what happens
internally.

On Fri, Jun 26, 2020 at 9:19 PM Vincenzo D'Amore  wrote:

> Hi Gus, thanks for the thorough explanation.
> In fact, I was concerned about how to hold the shared information (between
> URP and QParser), for example when a core is reloaded.
> What happens when a core is reloaded? Very likely I have a couple of new
> URP/QParser but an older QParser may still be serving a request.
> If my assumption is true, now I have a clearer idea.
> So the question is, how to share an object (CustomConfig) between URP and
> QParser.
> And this CustomConfig object is created by URP every time the core is
> reloaded.
>
> On Fri, Jun 26, 2020 at 8:47 PM Gus Heck  wrote:
>
>> During the request, the parser plugin is retrieved from a PluginBag on the
>> SolrCore object, so it should be reloaded at the same time as the update
>> component (which comes from another PluginBag on SolrCore). If the
>> components are deployed with consistent configuration in solrconfig.xml,
>> any given SolrCore instance should have a consistent set of both. If you
>> want to avoid repeating the information, one possibility is to use a
>> system
>> property
>>
>> https://lucene.apache.org/solr/guide/8_4/configuring-solrconfig-xml.html#jvm-system-properties
>> though
>> the suitability of that may depend on the size of your cluster and your
>> deployment infrastructure.
>>
>> On Thu, Jun 25, 2020 at 2:47 PM Mikhail Khludnev  wrote:
>>
>> > Hello, Vincenzo.
>> > Please find above about a dedicated component doing nothing, but just
>> > holding a config.
>> > Also you may extract config into a file and load it by
>> > SolrResourceLoaderAware.
>> >
>> > On Thu, Jun 25, 2020 at 2:06 PM Vincenzo D'Amore 
>> > wrote:
>> >
>> > > Hi Mikhail, yup, I was trying to avoid putting logic in Solr.
>> > > Just to be a little bit more specific, consider that if the update
>> > factory
>> > > writes a field that has a size of 50.
>> > > The QParser should be aware of the current size when writing a query.
>> > >
>> > > Is it possible to have in solrconfig.xml file a shared configuration?
>> > >
>> > > I mean a snippet of configuration shared between update processor
>> factory
>> > > and QParser.
>> > >
>> > >
>> > > On Wed, Jun 24, 2020 at 10:33 PM Mikhail Khludnev 
>> > wrote:
>> > >
>> > > > Hello, Vincenzo.
>> > > > Presumably you can introduce a component which just holds a config
>> > data,
>> > > > and then this component might be lookedup from QParser and
>> > UpdateFactory.
>> > > > Overall, it seems like embedding logic into Solr core, which rarely
>> > works
>> > > > well.
>> > > >
>> > > > On Wed, Jun 24, 2020 at 8:00 PM Vincenzo D'Amore <
>> v.dam...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > I've started to work on a couple of components very tight
>> together.
>> > > > > An update processor that writes few fields in the solr index and a
>> > > Query
>> > > > > Parser that, well, then reads such fields from the index.
>> > > > >
>> > > > > Such components share few configuration parameters together, I'm
>> > asking
>> > > > if
>> > > > > there is a pattern, a draft, a sample, some guidelines or best
>> > > practices
>> > > > > that explains how to properly save configuration parameters.
>> > > > >
>> > > > > The configuration is written into the solrconfig.xml file, for
>> > example:
>> > > > >
>> > > > >
>> > > > >  
>> > > > >x1
>> > > > >x2
>> > > > >  
>> > > > >
>> > > > >
>> > > > > And then query parser :
>> > > > >
>> > > > > > > > > > class="com.example.query.MyCustomQueryParserPlugin" />
>> > > > >
>> > > > > I'm struggling because the change of configuration on the updated
>> > > > processor
>> > > > > has an impact on the query parser.
>> > > > > For example the configuration info shared between those two
>> > components
>> > > > can
>> > > > > be overwritten during a core reload.
>> > > > > Basically, during an update or a core reload, there is a query
>> parser
>> > > > that
>> > > > > is serving requests while some other component is updating the
>> index.
>> > > > > So I suppose there should be a pattern, an approach, a common
>> > solution
>> > > > when
>> > > > > a piece of configuration has to be loaded at boot, or when the
>> core
>> > is
>> > > > > loaded.
>> > > > > Or when, after an update a new searcher is created and a new query
>> > > parser
>> > > > > is created.
>> > > > >
>> > > > > Any suggestion is really appreciated.
>> > > > >
>> > > > > Best regards,
>> > > > > Vincenzo
>> > > > >
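[Editor's note] The system-property suggestion quoted above could look roughly like this in solrconfig.xml. The property name `myplugin.fieldSize` and the update-factory class name are illustrative assumptions; only `com.example.query.MyCustomQueryParserPlugin` appears in the thread:

```xml
<!-- start Solr with e.g.: bin/solr start -Dmyplugin.fieldSize=50
     Both plugins read the same property, so the URP and the QParser
     in any given SolrCore see one consistent value. -->
<updateRequestProcessorChain name="custom-chain">
  <processor class="com.example.update.MyCustomUpdateProcessorFactory">
    <int name="fieldSize">${myplugin.fieldSize:50}</int>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<queryParser name="mycustom" class="com.example.query.MyCustomQueryParserPlugin">
  <int name="fieldSize">${myplugin.fieldSize:50}</int>
</queryParser>
```

Solr substitutes `${prop:default}` anywhere in solrconfig.xml, so the value is defined once per JVM rather than repeated per component.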


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-06-26 Thread Odysci
Thanks.
The heap dump indicated that most of the space was occupied by the caches
(filterCache and documentCache in my case).
I followed your suggestion of removing the maxRAMMB limit on filterCache
and documentCache and decreasing the number of entries allowed.
It did have a significant impact on the used heap size. So I guess I have
to find the sweet spot between hit ratio and size.
Still, the old generation does not seem to shrink significantly even if I
force a full GC (using jvisualvm).

Any other suggestions are welcome!
Thanks

Reinaldo

On Fri, Jun 26, 2020 at 5:05 AM Zisis T.  wrote:

> I have faced similar issues and the culprit was filterCache when using
> maxRAMMB. More specifically on a sharded Solr cluster with lots of faceting
> during search (which makes use of the filterCache in a distributed setting)
> I noticed that maxRAMMB value was not respected. I had a value of 300MB set
> but I witnessed an instance sized a couple of GBs in a heap dump at some
> point. What I found was that because the keys of the map (BooleanQuery or
> something, if I recall correctly) do not implement the Accountable
> interface, they were NOT taken into account when calculating the
> cache's size. But all that was on a 7.5 cluster using FastLRUCache.
>
> There's also https://issues.apache.org/jira/browse/SOLR-12743 on caches
> memory leak which does not seem to have been fixed yet although the trigger
> points of this memory leak are not clear. I've witnessed this as well on a
> 7.5 cluster with multiple (>10) filter cache objects for a single core each
> holding from a few MBs to GBs.
>
> Try to get a heap dump from your cluster, the truth is almost always hidden
> there.
>
> One workaround which seems to alleviate the problem is to check your running
> Solr cluster, see how many cache entries actually give you a
> good hit ratio, and get rid of the maxRAMMB attribute. Play only with the
> size.
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>
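[Editor's note] That workaround could look roughly like this in solrconfig.xml. The entry counts are assumptions to be tuned against your observed hit ratio; the point is the absence of any maxRAMMB attribute:

```xml
<!-- bounded by entry count only, no maxRAMMB -->
<filterCache class="solr.FastLRUCache"
             size="1024" initialSize="512" autowarmCount="128"/>
<documentCache class="solr.LRUCache"
               size="4096" initialSize="1024" autowarmCount="0"/>
```

Shrinking `size` trades hit ratio for a hard cap on the number of cached entries, which is what actually bounds the heap when per-entry RAM accounting is unreliable.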


SolrCloud with custom package in dataimport

2020-06-26 Thread stefan
Hey,

Is it possible to reference a custom Java class during the dataimport? The
dataimport configuration looks something like this:

```
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>
```

Sadly I was unable to find any information on this topic.

Thanks for your help!


Re: Prevent Re-indexing if Doc Fields are Same

2020-06-26 Thread Walter Underwood
If you don’t want to buy disk space for deleted docs, you should not be 
using Solr. That is an essential part of a reliable Solr installation.

To avoid reindexing unchanged documents, use a bookkeeping RDBMS
table. In that table, put the document ID and the most recent successful
update to Solr. You can check if the fields are the same with a checksum
of the data. MD5 is fine for that. Check that database before sending the
document and update it after new documents are indexed.

You may also want to record deletes in the database.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
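[Editor's note] The bookkeeping-table idea above can be sketched in plain Java. The field names and values here are hypothetical, and the checksum uses the JDK's own MessageDigest for MD5:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DocChecksum {
    /** MD5 over all field values, with a separator byte so that
        ("ab","c") and ("a","bc") produce different checksums. */
    static String checksum(String... fields) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String f : fields) {
                md.update(f.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // field separator
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always present in the JDK
        }
    }

    public static void main(String[] args) {
        // Before sending a document, compare against the checksum stored in
        // the bookkeeping table; index only when the data actually changed.
        String stored  = checksum("doc-42", "title A", "body text");
        String current = checksum("doc-42", "title A", "body text");
        System.out.println(current.equals(stored) ? "skip reindex" : "send to Solr");
        // prints "skip reindex"
    }
}
```

In a real pipeline, `stored` would come from the RDBMS row for that document ID and would be updated only after Solr acknowledges the index request.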

> On Jun 26, 2020, at 1:12 AM, Anshuman Singh  wrote:
> 
> I was reading about in-place updates
> https://lucene.apache.org/solr/guide/7_4/updating-parts-of-documents.html,
> In my use case I have to update the field "LASTUPDATETIME", all other
> fields are same. Updates are very frequent and I can't bear the cost of
> deleted docs.
> 
> If I provide all the fields, it deletes the document and re-index it. But
> if I just "set" the "LASTUPDATETIME" field (non-indexed, non-stored,
> docValue field), it does an in-place update without deletion. But the
> problem is I don't know if the document is present or I'm indexing it the
> first time.
> 
> Is there a way to prevent re-indexing if other fields are the same?
> 
> *P.S. I'm looking for a solution that doesn't require looking up if doc is
> present in the Collection or not.*
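[Editor's note] For reference, the in-place "set" described above (the field must be non-indexed, non-stored, docValues-only) is sent as an atomic update. The document id and timestamp below are made up:

```
[ { "id": "doc-1",
    "LASTUPDATETIME": { "set": "2020-06-26T12:00:00Z" } } ]
```

POSTed to the core's /update endpoint with Content-Type application/json; because only a docValues-only field changes, Solr rewrites the docValues rather than deleting and re-indexing the document.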



Re: Solr Upgrade Issue

2020-06-26 Thread Jan Høydahl
Hi,

There is not enough information in your email for any of us to help you.
It sounds like your company has created some custom integrations and perhaps 
servlet filters.
We do not know anything about your custom «createcore» functionality, so we 
cannot know why it does not work.
I’d recommend you ask the architects in your company who set up these custom 
components.
Or, if you believe there is a Solr issue and you have a log message to share, we 
might be able to help.

Jan

> 26. jun. 2020 kl. 10:13 skrev Ashok Mahendran 
> :
> 
> Hi Team,
> 
> We are upgrading from Solr 7.5.0 version to 8.5.2 version.
> 
> We are doing custom /createcore functionality from our web application. In 
> the 7.5.0 version we configured that Filter in web.xml for core creation and it 
> is working fine.
> 
> In 8.5.2 that customized create-core filter is not being called. Is there anything 
> we are restricting in the 8.5.2 version?
> 
> Please confirm
> 
> Regards,
> Ashokkumar M
> 
> [Aspire Systems]
> 



Re: Solr basic authentication and authorization Document

2020-06-26 Thread Jan Høydahl
Hi,

There is little context in your question. We don’t know how you deploy Solr 
(via the solr-operator or manually), we don’t know if you deploy in ZooKeeper or 
standalone mode, and we don’t know if you already tried enabling basic auth 
using the documentation at 
https://lucene.apache.org/solr/guide/8_5/basic-authentication-plugin.html 
and, if that failed, what error messages you are seeing.

If you give some more context I’m sure there are many who can help.

Jan

> 26. jun. 2020 kl. 08:03 skrev Roshan Naik :
> 
> Hello Team ,
> 
> We are deploying the Solr cluster on GCP Kubernetes.
> We are not getting a clear idea from the documentation of how we can
> implement Solr basic authentication and authorization in a Kubernetes
> cluster.
> 
> 
> Could you please provide documentation for the same?
> 
> 
> 
> -- 
> Thanks & Regards,
> Roshan Naik
> Cloud Engineer
> Email : rosh...@mactores.com
> Contact : +91 22 61123015
> 
> -- 
> This email and any files transmitted with it are confidential and intended 
> solely for the use of the individual or entity to whom they are addressed. 
> If you have received this email in error please notify the system manager. 
> This message contains confidential information and is intended only for the 
> individual named. If you are not the named addressee you should not 
> disseminate, distribute or copy this e-mail. Please notify the sender 
> immediately by e-mail if you have received this e-mail by mistake and 
> delete this e-mail from your system. If you are not the intended recipient 
> you are notified that disclosing, copying, distributing or taking any 
> action in reliance on the contents of this information is strictly 
> prohibited. WARNING!! : Computer viruses can be transmitted via email. The 
> recipient should check this email and any attachments for the presence of 
> viruses. The company accepts no liability for any damage caused by any 
> virus transmitted by this email. E-mail transmission cannot be guaranteed 
> to be secure or error-free as information could be intercepted, corrupted, 
> lost, destroyed, arrive late or incomplete, or contain viruses. The sender 
> therefore does not accept liability for any errors or omissions in the 
> contents of this message, which arise as a result of e-mail transmission. 
> Warning!!: Although the company has taken reasonable precautions to ensure 
> no viruses are present in this email, the company cannot accept 
> responsibility for any loss or damage arising from the use of this email or 
> attachments. Negligent misstatement: Our company accepts no liability for 
> the content of this email, or for the consequences of any actions taken on 
> the basis of the information provided, unless that information is 
> subsequently confirmed in writing. If you are not the intended recipient 
> you are notified that disclosing, copying, distributing or taking any 
> action in reliance on the contents of this information is strictly 
> prohibited.





RE: Unexpected results using Block Join Parent Query Parser

2020-06-26 Thread Tor-Magne Stien Hagen
Alright, that solved the problem. Thank you very much!

Tor-Magne Stien Hagen

-Original Message-
From: Mikhail Khludnev  
Sent: Thursday, June 25, 2020 12:13 PM
To: solr-user 
Subject: Re: Unexpected results using Block Join Parent Query Parser

Ok. My fault. Old sport, you know. When retrieving intermediate scopes, the 
parents bitmask should include all enclosing scopes as well. It's a dark side of 
the BJQ. Use {!parent which='class:(section OR composition)'}. I'm not 
sure what you are trying to achieve by specifying grandchildren as a parent bitmask. 
Note, the algorithm assumes that the parents' bitmask has the last doc in the 
segment set, i.e. the 'which' query supplied at runtime should strictly correspond 
to the block structure indexed before.

On Thu, Jun 25, 2020 at 12:05 PM Tor-Magne Stien Hagen  wrote:

> If I modify the query like this:
>
> {!parent which='class:instruction'}class:observation
>
> It still returns a result for the instruction document, even though 
> the document with class instruction does not have any children...
>
> Tor-Magne Stien Hagen
>
> -Original Message-
> From: Mikhail Khludnev 
> Sent: Wednesday, June 24, 2020 2:14 PM
> To: solr-user 
> Subject: Re: Unexpected results using Block Join Parent Query Parser
>
> Jan, thanks for the clarification.
> Sure you can use {!parent which=class:section} for return children, 
> which has a garndchildren matching subordinate query.
> Note: there's something about named scopes, which I didn't get into 
> yet, but it might be relevant to the problem.
>
> On Wed, Jun 24, 2020 at 1:43 PM Jan Høydahl  wrote:
>
> > I guess the key question here is whether «parent» in BlockJoin is 
> > strictly top-level parent/root, i.e. class:composition for the 
> > example in this tread? Or can {!parent} parser also be used to 
> > select the «child» level in a child/grandchild relationship inside a block?
> >
> > Jan
> >
> > > 24. jun. 2020 kl. 11:36 skrev Tor-Magne Stien Hagen :
> > >
> > > Thanks for your answer,
> > >
> > > What kind of rules exist for the which clause? In other words, how can
> > > you identify parents without using some sort of filtering?
> > >
> > > Tor-Magne Stien Hagen
> > >
> > > -Original Message-
> > > From: Mikhail Khludnev 
> > > Sent: Wednesday, June 24, 2020 10:01 AM
> > > To: solr-user 
> > > Subject: Re: Unexpected results using Block Join Parent Query 
> > > Parser
> > >
> > > Hello,
> > >
> > > Please check the warning box titled "Using which" at
> > > https://lucene.apache.org/solr/guide/8_5/other-parsers.html#block-join-parent-query-parser
> > >
> > > On Wed, Jun 24, 2020 at 10:01 AM Tor-Magne Stien Hagen 
> > > 
> > wrote:
> > >
> > >> Hi,
> > >>
> > >> I have indexed the following nested document in Solr:
> > >>
> > >> {
> > >>"id": "1",
> > >>"class": "composition",
> > >>"children": [
> > >>{
> > >>"id": "2",
> > >>"class": "section",
> > >>"children": [
> > >>{
> > >>"id": "3",
> > >>"class": "observation"
> > >>}
> > >>]
> > >>},
> > >>{
> > >>"id": "4",
> > >>"class": "section",
> > >>"children": [
> > >>{
> > >>"id": "5",
> > >>"class": "instruction"
> > >>}
> > >>]
> > >>}
> > >>]
> > >> }
> > >>
> > >> Given the following query:
> > >>
> > >> {!parent which='id:4'}id:3
> > >>
> > >> I expect the result to be empty as document 3 is not a child 
> > >> document of document 4.
> > >>
> > >> To reproduce, use the docker container provided here:
> > >> https://github.com/tormsh/Solr-Example
> > >>
> > >> Have I misunderstood something regarding the Block Join Parent 
> > >> Query Parser?
> > >>
> > >> Tor-Magne Stien Hagen
> > >>
> > >>
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> >
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
>


--
Sincerely yours
Mikhail Khludnev


Prevent Re-indexing if Doc Fields are Same

2020-06-26 Thread Anshuman Singh
I was reading about in-place updates
(https://lucene.apache.org/solr/guide/7_4/updating-parts-of-documents.html).
In my use case I have to update the field "LASTUPDATETIME"; all other
fields stay the same. Updates are very frequent and I can't bear the cost of
deleted docs.

If I provide all the fields, it deletes the document and re-indexes it. But
if I just "set" the "LASTUPDATETIME" field (a non-indexed, non-stored,
docValues field), it does an in-place update without deletion. But the
problem is I don't know whether the document is already present or I'm
indexing it for the first time.

Is there a way to prevent re-indexing if other fields are the same?

*P.S. I'm looking for a solution that doesn't require looking up if doc is
present in the Collection or not.*
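For reference, the in-place variant being described is an atomic update that only "set"s the docValues field; a minimal sketch of the JSON payload (the document id and timestamp below are placeholders):

```python
import json

# Atomic "set" on a docValues-only field is what triggers the in-place
# update path; sending the full document would instead delete + re-add.
update = [
    {
        "id": "doc-1",
        "LASTUPDATETIME": {"set": "2020-06-26T00:00:00Z"},
    }
]
payload = json.dumps(update)
```

The payload would be POSTed to the collection's /update handler.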


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-06-26 Thread Zisis T.
I have faced similar issues, and the culprit was the filterCache when using
maxRAMMB. More specifically, on a sharded Solr cluster with lots of faceting
during search (which makes use of the filterCache in a distributed setting),
I noticed that the maxRAMMB value was not respected: I had a value of 300MB
set, but witnessed an instance sized at a couple of GBs in a heap dump at
some point. The thing I found was that, because the keys of the Map
(BooleanQuery or something similar, if I recall correctly) did not implement
the Accountable interface, they were NOT taken into account when calculating
the cache's size. But all that was on a 7.5 cluster using FastLRUCache. 

There's also https://issues.apache.org/jira/browse/SOLR-12743 on a caches
memory leak, which does not seem to have been fixed yet, although the trigger
points of this memory leak are not clear. I've witnessed this as well on a
7.5 cluster, with multiple (>10) filterCache objects for a single core, each
holding from a few MBs to GBs. 

Try to get a heap dump from your cluster; the truth is almost always hidden
there. 

One workaround which seems to alleviate the problem is to check your running
Solr cluster, see how many cache entries actually give you a good hit ratio,
get rid of the maxRAMMB attribute, and play only with the size. 
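The workaround described above could look like this in solrconfig.xml (a sketch; the size value is an assumption to be tuned against your observed hit ratio):

```xml
<!-- Bound the filterCache by entry count only; no maxRAMMB attribute -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>
```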



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Parallel SQL join on multivalue fields

2020-06-26 Thread Piero Scrima
Hi,

Although there is no trace of join functionality in the official Solr
documentation
(https://lucene.apache.org/solr/guide/7_4/parallel-sql-interface.html),
joining in Parallel SQL works in practice. It only works, however, if the
field is not a multivalued field. For my project it would be fantastic if it
also worked with multivalued fields.
Is there any way to do it? Working with streaming expressions, I managed
to do it with the following expression:

innerJoin(
  sort(
    cartesianProduct(
      search(census_defence_system, q="*:*",
             fl="id,defence_system,description,supplier",
             sort="id asc", qt="/select", rows="1000"),
      supplier
    ),
    by="supplier asc"
  ),
  sort(
    cartesianProduct(
      search(census_components, q="*:*",
             fl="id,compoenent_name,supplier",
             sort="id asc", qt="/select", rows="1"),
      supplier
    ),
    by="supplier asc"
  ),
  on="supplier"
)

supplier, of course, is a multivalued field.
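The expression above works because cartesianProduct() emits one tuple per value of the multivalued field before the join runs. A small Python sketch of that flatten-then-join idea (the sample data is invented for illustration):

```python
def cartesian_product(docs, field):
    """Emit one tuple per value of a multivalued field, like Solr's cartesianProduct()."""
    for doc in docs:
        for value in doc[field]:
            yield {**doc, field: value}

def inner_join(left, right, on):
    """Hash join on the now single-valued key, like Solr's innerJoin()."""
    index = {}
    for doc in right:
        index.setdefault(doc[on], []).append(doc)
    for l in left:
        for r in index.get(l[on], []):
            yield {**l, **r}

defences = [{"defence_id": "d1", "supplier": ["acme", "globex"]}]
components = [{"component_id": "c1", "supplier": ["acme"]}]

joined = list(
    inner_join(
        cartesian_product(defences, "supplier"),
        cartesian_product(components, "supplier"),
        on="supplier",
    )
)
```

Only the "acme" tuples match, so one joined row comes out.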

Is there a way to do this with Parallel SQL, and if not, can we plan a new
feature to add it? I could also work on it.

(version 7.4)

Thank you


Re: SOLR CDCR fails with JWT authorization configuration

2020-06-26 Thread Jan Høydahl
I found this in the documentation at
https://lucene.apache.org/solr/guide/8_5/cdcr-architecture.html#cdcr-limitations:

CDCR doesn’t support Basic Authentication features across clusters.

The JIRA for adding this capability is
https://issues.apache.org/jira/browse/SOLR-11959, but it went stale in 2019.
You may add a comment there and hope for some traction, but don’t hold your 
breath...

Jan
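For reference, a minimal security.json sketch enabling the JWT plugin as asked for downthread. The issuer and JWKS URL are placeholders, and the key names follow the 8.x Ref Guide; verify them against your exact Solr version:

```json
{
  "authentication": {
    "class": "solr.JWTAuthPlugin",
    "blockUnknown": true,
    "jwksUrl": "https://idp.example.com/.well-known/jwks.json",
    "iss": "https://idp.example.com"
  }
}
```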

> 26. jun. 2020 kl. 06:34 skrev Phatkar, Swapnil (Contractor) 
> :
> 
> Hi,
> 
> CDCR might be deprecated really soon now --> In that case, will there be 
> any alternative to it?
> 
> However, if this turns out to be not supported or a bug, then we can file a 
> JIRA issue. --> It would be great if you could raise the JIRA ticket for it, 
> so that it becomes clearer how Solr responds to this scenario: CDCR over 
> https with JWT authentication, and the necessary settings for it, including 
> security.json.
> 
> 
> Thanks 
> Swapnil 
> 
> 
> 
> -Original Message-
> From: Jan Høydahl  
> Sent: Thursday, June 25, 2020 6:50 PM
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR CDCR fails with JWT authorization configuration
> 
> EXTERNAL SENDER:   Exercise caution with links and attachments.
> 
> I’m mostly trying to identify whether what you are trying to do is a 
> supported option at all, or if perhaps CDCR is only tested without 
> authentication in place.
> You would also be interested in the fact that CDCR might be deprecated really 
> soon now, see 
> https://issues.apache.org/jira/browse/SOLR-11718
> CDCR is complex. JWT is complex. Combining the two might be too much 
> unknown territory for beginners.
> 
> However, if this turns out to be not supported or a bug, then we can file a 
> JIRA issue. For now I hope that someone else running CDCR can give JWT a 
> try and reproduce what you are seeing.
> 
> Jan
> 
>> 25. jun. 2020 kl. 15:06 skrev Phatkar, Swapnil (Contractor) 
>> :
>> 
>> Hi,
>> 
>> 
>> 1. Solr is relying on PKI for the request (one cluster sends PKI 
>> header to the node in the other cluster)
>> --> I have not configured anything explicitly, just followed the steps 
>> mentioned at https://lucene.apache.org/solr/guide/8_4/cdcr-config.html. 
>> Is there any additional step?
>> 
>> 2. That fails since the sending node is unknown to the receiving node 
>> since it is in another cluster
>> --> I think that's obvious, because the Source and Target clusters 
>> --> are different. What I know is that once we configure the zkHost of 
>> --> the Target cluster in the Source cluster's solrconfig.xml, it 
>> --> establishes a connection. But I would
>> like to know: is there any other setting?
>> 
>> 3. Have you tried BasicAuth and do you have the same issue then?
>> --> Nope, we were using "class":"solr.JWTAuthPlugin". Do I need to add an 
>> authorization section as well to get past JWT authorization?
>> 
>> 
>> Can you please guide me, considering me a newbie :) It would also be 
>> good to get a sample security.json.
>> 
>> Thanks
>> 
>> -Original Message-
>> From: Jan Høydahl 
>> Sent: Thursday, June 25, 2020 5:25 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: SOLR CDCR fails with JWT authorization configuration
>> 
>> EXTERNAL SENDER:   Exercise caution with links and attachments.
>> 
>> Sorry, there is no forwardCredentials parameter for JWT, it is implicit. 
>> 
>> But from the response we can see two things:
>> 
>> 1. Solr is relying on PKI for the request (one cluster sends PKI 
>> header to the node in the other cluster) 2. That fails since the 
>> sending node is unknown to the receiving node since it is in another 
>> cluster
>> 
>> I’m not familiar with the CDCR code used here. Have you tried BasicAuth and 
>> do you have the same issue then?
>> 
>> Jan
>> 
>> 
>>> 25. jun. 2020 kl. 13:20 skrev Phatkar, Swapnil (Contractor) 
>>> :
>>> 
>>> 
>>> 
>>> Whoever is sending calls to /solr/express_shard1_replica_n3/cdcr will have 
>>> to make sure to forward the JWT -- How do I forward the JWT from the 
>>> source to the target server?
>>> You could try 'forwardCredentials:true' in security.json -- How can I try  
>>> 

Nested grouping

2020-06-26 Thread Srinivas Kashyap
Hi All,

I have below requirement for my business:

select?fl=*&q=MODIFY_TS:[2020-06-23T18:30:00Z TO *]&fq=PHY_KEY2:"HQ010699" OR 
PHY_KEY2:"HQ010377" OR PHY_KEY2:"HQ010396" OR PHY_KEY2:"HQ010399" OR 
PHY_KEY2:"HQ010404" OR PHY_KEY2:"HQ010419" OR PHY_KEY2:"HQ010426" OR 
PHY_KEY2:"HQ010452" OR PHY_KEY2:"HQ010463" OR PHY_KEY2:"HQ010466" OR 
PHY_KEY2:"HQ010469" OR PHY_KEY2:"HQ010476" OR PHY_KEY2:"HQ010480" OR 
PHY_KEY2:"HQ010481" OR PHY_KEY2:"HQ010496" OR PHY_KEY2:"HQ010500" OR 
PHY_KEY2:"HQ010501" OR PHY_KEY2:"HQ010502" OR PHY_KEY2:"HQ010503" OR 
PHY_KEY2:"HQ010504"

Above query lists all the changes mentioned for 20 documents.

If I add below to query:

group=true&group.field=PHY_KEY2&group.ngroups=true

Below is the response:

"grouped":{
"PHY_KEY2":{
  "matches":23,
  "ngroups":4,
  "groups":[{
  "groupValue":"HQ010500",
  "doclist":{"numFound":3,"start":0,"docs":[
  {
"PHY_KEY2":"HQ010500"}]
  }},
{
  "groupValue":"HQ010399",
  "doclist":{"numFound":4,"start":0,"docs":[
  {
"PHY_KEY2":"HQ010399"}]
  }},
{
  "groupValue":"HQ010377",
  "doclist":{"numFound":8,"start":0,"docs":[
  {
"PHY_KEY2":"HQ010377"}]
  }},
{
  "groupValue":"HQ010699",
  "doclist":{"numFound":8,"start":0,"docs":[
  {
"PHY_KEY2":"HQ010699"}]
  }}]}}}

Take the case of the last entry, HQ010699. It says numFound=8 (8 docs), but 
all of these 8 documents have the same value for the field TRACK_ID. So can I 
group again on TRACK_ID to get the count as 1?
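Grouping cannot nest, but the JSON Facet API can express "distinct TRACK_IDs per PHY_KEY2" with a nested unique() aggregation. A sketch of building the json.facet parameter in Python (field names taken from the post, the limit is illustrative):

```python
import json

# One terms bucket per PHY_KEY2 value; inside each bucket, unique(TRACK_ID)
# reports the number of distinct TRACK_ID values rather than the raw doc count.
json_facet = {
    "by_key": {
        "type": "terms",
        "field": "PHY_KEY2",
        "limit": 20,
        "facet": {"distinct_tracks": "unique(TRACK_ID)"},
    }
}
param = "json.facet=" + json.dumps(json_facet)
```

The resulting param string is added to the same /select request alongside q and fq.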


Thanks and Regards,
Srinivas Kashyap

