Re: Clustering error in Solr 8.2.0

2019-08-08 Thread Kevin Risden
According to the stack trace:

java.lang.NoClassDefFoundError: org/apache/commons/lang/ObjectUtils
at lingo3g.s.hashCode(Unknown Source)

It looks like the missing class is needed by lingo3g. lingo3g isn't on Maven
Central and appears to require a license to download, so you would have to
contact the vendor to see whether it still uses commons-lang. You could also
copy in the commons-lang dependency yourself.

Kevin Risden
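
A quick way to check whether any jar on the Solr classpath still provides the
class named in the stack trace is to scan the jar files for it. The sketch
below is only an illustration: the function name and the directory you point
it at are assumptions, not part of Solr or lingo3g.

```python
import zipfile
from pathlib import Path

# Class named in the stack trace, written as a jar entry path.
MISSING_CLASS = "org/apache/commons/lang/ObjectUtils.class"


def jars_containing(lib_dir, entry=MISSING_CLASS):
    """Return the names of jars under lib_dir that contain the given entry."""
    hits = []
    for jar in sorted(Path(lib_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits
```

An empty result for the directories Solr loads its clustering jars from would
be consistent with the NoClassDefFoundError above.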


On Thu, Aug 8, 2019 at 10:23 PM Zheng Lin Edwin Yeo 
wrote:

> Hi Erick,
>
> Thanks for your reply.
>
> My clustering code is taken as it is from the Solr package, only the codes
> related to lingo3g is taken from previous version.
>
> Below are the 3 files that I have taken from previous version:
> - lingo3g-1.15.0
> - morfologik-fsa-2.1.1
> - morfologik-stemming-2.1.1
>
> Does anyone of these could have caused the error?
>
> Regards,
> Edwin
>
> On Thu, 8 Aug 2019 at 19:56, Erick Erickson 
> wrote:
>
> > This dependency was removed as part of
> > https://issues.apache.org/jira/browse/SOLR-9079, so my guess is you’re
> > pointing to an old version of the clustering code.
> >
> > Best,
> > Erick
> >
> > > On Aug 8, 2019, at 4:22 AM, Zheng Lin Edwin Yeo 
> > wrote:
> > >
> > > ObjectUtils
> >
> >
>


Re: Clustering error in Solr 8.2.0

2019-08-08 Thread Zheng Lin Edwin Yeo
Hi Erick,

Thanks for your reply.

My clustering code is taken as-is from the Solr package; only the code
related to lingo3g is taken from the previous version.

Below are the 3 files that I have taken from the previous version:
- lingo3g-1.15.0
- morfologik-fsa-2.1.1
- morfologik-stemming-2.1.1

Could any of these have caused the error?

Regards,
Edwin

On Thu, 8 Aug 2019 at 19:56, Erick Erickson  wrote:

> This dependency was removed as part of
> https://issues.apache.org/jira/browse/SOLR-9079, so my guess is you’re
> pointing to an old version of the clustering code.
>
> Best,
> Erick
>
> > On Aug 8, 2019, at 4:22 AM, Zheng Lin Edwin Yeo 
> wrote:
> >
> > ObjectUtils
>
>


Re: Indexed Data Size

2019-08-08 Thread Shawn Heisey

On 8/8/2019 3:17 PM, Moyer, Brett wrote:

In our data/solr//data/index on the filesystem, we have files 
that go back 1 year. I don’t understand why and I doubt they are in use. Files with 
extensions like fdx,cfe,doc,pos,tip,dvm etc. Some of these are very large and running 
us out of server space. Our search indexes themselves are not large, in total we 
might have 50k documents.  How can I reduce this /data/solr space? Is this what the 
Solr Optimize command is for? Thanks!


+1 to everything Erick said.

Another piece of information that could be helpful is a screenshot of 
the core overview in the admin UI.  It would look something like this:


https://www.dropbox.com/s/mbh6ll1v8ghloko/solr-core-overview.png?dl=0

To get that, just go to the admin UI and choose one of the big cores 
from the core dropdown.  That should put you on the overview tab for the 
core.  Then grab a screenshot and use a file sharing site to share it.


Thanks,
Shawn


Re: modify query response plugin

2019-08-08 Thread Maria Muslea
I am not able to use the highlighter because in my example the text
contains "LAX", while the search is for "airport" in a different field
from the text field.

I was able to use a SearchComponent to add my own highlighting section.

Thank you,
Maria
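
The transformation described in the original question (quoted below) can be
sketched independently of Solr. This is not the SearchComponent itself, which
would be written in Java; it only illustrates the reshaping step. It assumes
the stored string has the form concept;offset;length, which fits the example
(offset 11, length 3 in "flights at LAX" is "LAX") but is not confirmed in the
thread. The function name and the "match" key are hypothetical.

```python
def expand_highlight(doc):
    """Return a copy of doc with the packed highlight string expanded."""
    # Assumed layout: concept;offset;length
    concept, offset, length = doc["highlight"].split(";")
    start = int(offset)
    text = doc["description"]
    out = dict(doc)
    out["highlight"] = {
        "concept": concept,
        "description": text,
        # Hypothetical extra key: the span the offsets point at.
        "match": text[start:start + int(length)],
    }
    return out
```

For example, `expand_highlight({"description": "flights at LAX",
"highlight": "airport;11;3"})` yields a highlight section with concept
"airport" and match "LAX".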


On Thu, Aug 8, 2019 at 2:20 PM Moyer, Brett  wrote:

> Highlight? What about using the Highlighter?
> https://lucene.apache.org/solr/guide/6_6/highlighting.html
>
> Brett Moyer
> Manager, Sr. Technical Lead | TFS Technology
>   Public Production Support
>   Digital Search & Discovery
>
> 8625 Andrew Carnegie Blvd | 4th floor
> Charlotte, NC 28263
> Tel: 704.988.4508
> Fax: 704.988.4907
> bmo...@tiaa.org
>
>
> -Original Message-
> From: Maria Muslea 
> Sent: Thursday, August 8, 2019 1:28 PM
> To: solr-user@lucene.apache.org
> Subject: Re: modify query response plugin
>
> Thank you for your response. I believe that the Tagger is used for NER,
> which is different than what I am trying to do.
> It is also available only with Solr 7 and I would need this to work with
> version 6.5.0.
>
> I am trying to manipulate the data that I already have in the response,
> and I can't find a good example of a plugin that does something similar, so
> I can see how I can access the response and construct a new one.
>
> Your help is greatly appreciated.
>
> Thank you,
> Maria
>
> On Tue, Aug 6, 2019 at 3:19 PM Erik Hatcher 
> wrote:
>
> > I think you’re looking for the Solr Tagger, described here:
> > https://lucidworks.com/post/solr-tagger-improving-relevancy/
> >
> > > On Aug 6, 2019, at 16:04, Maria Muslea  wrote:
> > >
> > > Hi,
> > >
> > > I am trying to implement a plugin that will modify my query
> > > response. For example, I would like to execute a query that will
> return something like:
> > >
> > > {...
> > > "description":"flights at LAX",
> > > "highlight":"airport;11;3"
> > > ...}
> > > This is information that I have in my document, so I can return it.
> > >
> > > Now, I would like the plugin to intercept the result, do some
> > > processing
> > on
> > > it, and return something like:
> > >
> > > {...
> > > "description":"flights at LAX",
> > > "highlight":{
> > >   "concept":"airport",
> > >   "description":"flights at LAX"
> > > ...}
> > >
> > > I looked at some RequestHandler implementations, but I can't find
> > > any sample code that would help me with this. Would this type of
> > > plugin be handled by a RequestHandler? Could you maybe point me to a
> > > sample plugin that does something similar?
> > >
> > > I would really appreciate your help.
> > >
> > > Thank you,
> > > Maria
> >
> *
> This e-mail may contain confidential or privileged information.
> If you are not the intended recipient, please notify the sender
> immediately and then delete it.
>
> TIAA
> *
>


Re: Indexed Data Size

2019-08-08 Thread Erick Erickson
On the surface, this makes no sense at all, so there’s something I don’t 
understand here ;). 

How often do you update your index? Having files from a long time ago is 
perfectly reasonable if you’re not updating regularly.

But your statement that some of these are huge for just a 50K document index is 
odd unless they’re _huge_ documents.

I wouldn’t optimize, unless you’re on Solr 7.5+ as that’ll create a single 
segment, see: 
https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/
and
https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/

The extensions you mentioned are perfectly reasonable. Each segment is made up 
of multiple files. .fdt for instance contains stored data. See: 
https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/codecs/lucene62/package-summary.html

Can you give us a long listing of one of your index directories?
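
If the listing is long, a per-extension size summary may also help show which
kind of segment file is taking the space. The following is a minimal sketch
using only the Python standard library; the function name is an assumption,
and you would pass it the path of one of your own index directories.

```python
from collections import defaultdict
from pathlib import Path


def size_by_extension(index_dir):
    """Sum file sizes in bytes per extension for files directly in index_dir."""
    totals = defaultdict(int)
    for path in Path(index_dir).iterdir():
        if path.is_file():
            # Files without an extension (e.g. segments_N) are keyed by name.
            totals[path.suffix or path.name] += path.stat().st_size
    # Largest total first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
```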

Best,
Erick

> On Aug 8, 2019, at 5:17 PM, Moyer, Brett  wrote:
> 
> In our data/solr//data/index on the filesystem, we have files 
> that go back 1 year. I don’t understand why and I doubt they are in use. 
> Files with extensions like fdx,cfe,doc,pos,tip,dvm etc. Some of these are 
> very large and running us out of server space. Our search indexes themselves 
> are not large, in total we might have 50k documents.  How can I reduce this 
> /data/solr space? Is this what the Solr Optimize command is for? Thanks!
> 
> Brett
> 



Re: modify query response plugin

2019-08-08 Thread Andrea Gazzarini
I'm not really sure I got what you want to achieve, but after reading

" I am trying to manipulate the data that I already have in the response,
...
I can see how I can access the response and construct a new one."

an option to consider could be a query response writer [1].

Cheers,
Andrea

[1] https://lucene.apache.org/solr/guide/6_6/response-writers.html


On Thu, 8 Aug 2019, 23:34 Atita Arora,  wrote:

> Isn't it resolved by simply adding the desired pre/post tags in highlighter
> request?
>
> On Thu, Aug 8, 2019 at 11:20 PM Moyer, Brett  wrote:
>
> > Highlight? What about using the Highlighter?
> > https://lucene.apache.org/solr/guide/6_6/highlighting.html
> >
> > Brett Moyer
> > Manager, Sr. Technical Lead | TFS Technology
> >   Public Production Support
> >   Digital Search & Discovery
> >
> > 8625 Andrew Carnegie Blvd | 4th floor
> > Charlotte, NC 28263
> > Tel: 704.988.4508
> > Fax: 704.988.4907
> > bmo...@tiaa.org
> >
> >
> > -Original Message-
> > From: Maria Muslea 
> > Sent: Thursday, August 8, 2019 1:28 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: modify query response plugin
> >
> > Thank you for your response. I believe that the Tagger is used for NER,
> > which is different than what I am trying to do.
> > It is also available only with Solr 7 and I would need this to work with
> > version 6.5.0.
> >
> > I am trying to manipulate the data that I already have in the response,
> > and I can't find a good example of a plugin that does something similar,
> so
> > I can see how I can access the response and construct a new one.
> >
> > Your help is greatly appreciated.
> >
> > Thank you,
> > Maria
> >
> > On Tue, Aug 6, 2019 at 3:19 PM Erik Hatcher 
> > wrote:
> >
> > > I think you’re looking for the Solr Tagger, described here:
> > > https://lucidworks.com/post/solr-tagger-improving-relevancy/
> > >
> > > > On Aug 6, 2019, at 16:04, Maria Muslea 
> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I am trying to implement a plugin that will modify my query
> > > > response. For example, I would like to execute a query that will
> > return something like:
> > > >
> > > > {...
> > > > "description":"flights at LAX",
> > > > "highlight":"airport;11;3"
> > > > ...}
> > > > This is information that I have in my document, so I can return it.
> > > >
> > > > Now, I would like the plugin to intercept the result, do some
> > > > processing
> > > on
> > > > it, and return something like:
> > > >
> > > > {...
> > > > "description":"flights at LAX",
> > > > "highlight":{
> > > >   "concept":"airport",
> > > >   "description":"flights at LAX"
> > > > ...}
> > > >
> > > > I looked at some RequestHandler implementations, but I can't find
> > > > any sample code that would help me with this. Would this type of
> > > > plugin be handled by a RequestHandler? Could you maybe point me to a
> > > > sample plugin that does something similar?
> > > >
> > > > I would really appreciate your help.
> > > >
> > > > Thank you,
> > > > Maria
> > >
> >
>


Re: modify query response plugin

2019-08-08 Thread Atita Arora
Isn't this resolved by simply adding the desired pre/post tags to the highlighter
request?

On Thu, Aug 8, 2019 at 11:20 PM Moyer, Brett  wrote:

> Highlight? What about using the Highlighter?
> https://lucene.apache.org/solr/guide/6_6/highlighting.html
>
> Brett Moyer
> Manager, Sr. Technical Lead | TFS Technology
>   Public Production Support
>   Digital Search & Discovery
>
> 8625 Andrew Carnegie Blvd | 4th floor
> Charlotte, NC 28263
> Tel: 704.988.4508
> Fax: 704.988.4907
> bmo...@tiaa.org
>
>
> -Original Message-
> From: Maria Muslea 
> Sent: Thursday, August 8, 2019 1:28 PM
> To: solr-user@lucene.apache.org
> Subject: Re: modify query response plugin
>
> Thank you for your response. I believe that the Tagger is used for NER,
> which is different than what I am trying to do.
> It is also available only with Solr 7 and I would need this to work with
> version 6.5.0.
>
> I am trying to manipulate the data that I already have in the response,
> and I can't find a good example of a plugin that does something similar, so
> I can see how I can access the response and construct a new one.
>
> Your help is greatly appreciated.
>
> Thank you,
> Maria
>
> On Tue, Aug 6, 2019 at 3:19 PM Erik Hatcher 
> wrote:
>
> > I think you’re looking for the Solr Tagger, described here:
> > https://lucidworks.com/post/solr-tagger-improving-relevancy/
> >
> > > On Aug 6, 2019, at 16:04, Maria Muslea  wrote:
> > >
> > > Hi,
> > >
> > > I am trying to implement a plugin that will modify my query
> > > response. For example, I would like to execute a query that will
> return something like:
> > >
> > > {...
> > > "description":"flights at LAX",
> > > "highlight":"airport;11;3"
> > > ...}
> > > This is information that I have in my document, so I can return it.
> > >
> > > Now, I would like the plugin to intercept the result, do some
> > > processing
> > on
> > > it, and return something like:
> > >
> > > {...
> > > "description":"flights at LAX",
> > > "highlight":{
> > >   "concept":"airport",
> > >   "description":"flights at LAX"
> > > ...}
> > >
> > > I looked at some RequestHandler implementations, but I can't find
> > > any sample code that would help me with this. Would this type of
> > > plugin be handled by a RequestHandler? Could you maybe point me to a
> > > sample plugin that does something similar?
> > >
> > > I would really appreciate your help.
> > >
> > > Thank you,
> > > Maria
> >
>


RE: modify query response plugin

2019-08-08 Thread Moyer, Brett
Highlight? What about using the Highlighter? 
https://lucene.apache.org/solr/guide/6_6/highlighting.html

Brett Moyer
Manager, Sr. Technical Lead | TFS Technology
  Public Production Support
  Digital Search & Discovery

8625 Andrew Carnegie Blvd | 4th floor
Charlotte, NC 28263
Tel: 704.988.4508
Fax: 704.988.4907
bmo...@tiaa.org


-Original Message-
From: Maria Muslea  
Sent: Thursday, August 8, 2019 1:28 PM
To: solr-user@lucene.apache.org
Subject: Re: modify query response plugin

Thank you for your response. I believe that the Tagger is used for NER, which 
is different than what I am trying to do.
It is also available only with Solr 7 and I would need this to work with 
version 6.5.0.

I am trying to manipulate the data that I already have in the response, and I 
can't find a good example of a plugin that does something similar, so I can see 
how I can access the response and construct a new one.

Your help is greatly appreciated.

Thank you,
Maria

On Tue, Aug 6, 2019 at 3:19 PM Erik Hatcher  wrote:

> I think you’re looking for the Solr Tagger, described here:
> https://lucidworks.com/post/solr-tagger-improving-relevancy/
>
> > On Aug 6, 2019, at 16:04, Maria Muslea  wrote:
> >
> > Hi,
> >
> > I am trying to implement a plugin that will modify my query 
> > response. For example, I would like to execute a query that will return 
> > something like:
> >
> > {...
> > "description":"flights at LAX",
> > "highlight":"airport;11;3"
> > ...}
> > This is information that I have in my document, so I can return it.
> >
> > Now, I would like the plugin to intercept the result, do some 
> > processing
> on
> > it, and return something like:
> >
> > {...
> > "description":"flights at LAX",
> > "highlight":{
> >   "concept":"airport",
> >   "description":"flights at LAX"
> > ...}
> >
> > I looked at some RequestHandler implementations, but I can't find 
> > any sample code that would help me with this. Would this type of 
> > plugin be handled by a RequestHandler? Could you maybe point me to a 
> > sample plugin that does something similar?
> >
> > I would really appreciate your help.
> >
> > Thank you,
> > Maria
>


Indexed Data Size

2019-08-08 Thread Moyer, Brett
In our data/solr//data/index on the filesystem, we have files 
that go back 1 year. I don’t understand why, and I doubt they are in use. The files 
have extensions like fdx, cfe, doc, pos, tip, dvm, etc. Some of these are very large 
and are running us out of server space. Our search indexes themselves are not 
large; in total we might have 50k documents. How can I reduce this /data/solr 
space? Is this what the Solr Optimize command is for? Thanks!

Brett



Re: modify query response plugin

2019-08-08 Thread Maria Muslea
Thank you for your response. I believe that the Tagger is used for NER,
which is different than what I am trying to do.
It is also available only with Solr 7 and I would need this to work with
version 6.5.0.

I am trying to manipulate the data that I already have in the response, and
I can't find a good example of a plugin that does
something similar, so I can see how I can access the response and construct
a new one.

Your help is greatly appreciated.

Thank you,
Maria

On Tue, Aug 6, 2019 at 3:19 PM Erik Hatcher  wrote:

> I think you’re looking for the Solr Tagger, described here:
> https://lucidworks.com/post/solr-tagger-improving-relevancy/
>
> > On Aug 6, 2019, at 16:04, Maria Muslea  wrote:
> >
> > Hi,
> >
> > I am trying to implement a plugin that will modify my query response. For
> > example, I would like to execute a query that will return something like:
> >
> > {...
> > "description":"flights at LAX",
> > "highlight":"airport;11;3"
> > ...}
> > This is information that I have in my document, so I can return it.
> >
> > Now, I would like the plugin to intercept the result, do some processing
> on
> > it, and return something like:
> >
> > {...
> > "description":"flights at LAX",
> > "highlight":{
> >   "concept":"airport",
> >   "description":"flights at LAX"
> > ...}
> >
> > I looked at some RequestHandler implementations, but I can't find any
> > sample code that would help me with this. Would this type of plugin be
> > handled by a RequestHandler? Could you maybe point me to a sample plugin
> > that does something similar?
> >
> > I would really appreciate your help.
> >
> > Thank you,
> > Maria
>


SQL equality predicate escaping single quotes

2019-08-08 Thread Kyle Lilly
Hi,

When using the SQL handler is there any way to escape single quotes in
boolean predicates? A query like:

SELECT title FROM books WHERE author_lastname = 'O''Reilly'

Will return no results for authors with the last name "O'Reilly" but will
return hits for books with a last name of "OReilly". I can perform a
standard Solr term search using "lastname:O'Reilly" and get back the
expected results. Looking through the code it appears all single quotes are
stripped from term values in the SQL handler -
https://github.com/apache/lucene-solr/blame/1d85cd783863f75cea133fb9c452302214165a4d/solr/core/src/java/org/apache/solr/handler/sql/SolrFilter.java#L136.
If this is by design is there any way to use single quotes in a term
predicate with SQL?
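
To make the difference concrete, here is a small model of the two behaviours.
It is not Solr's code: `strip_quotes` only mimics the effect described above
(every single quote removed), and `unescape_sql_literal` shows the standard
SQL reading of a doubled quote. Both function names are made up for this
illustration.

```python
def strip_quotes(literal):
    """Mimic the behaviour described above: drop every single quote."""
    return literal.replace("'", "")


def unescape_sql_literal(literal):
    """Standard SQL unescaping: remove the delimiters, then collapse '' to '."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a quoted SQL string literal")
    return literal[1:-1].replace("''", "'")
```

With the literal from the query, the first function gives OReilly and the
second gives O'Reilly, which matches the results I am seeing versus the
results I expected.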

Thanks.

- Kyle


Re: Using custom scoring formula

2019-08-08 Thread Chee Yee Lim
Hi Arnold,

One way to approach this is to store the topic vector you calculated for
each Solr document in a pseudo-vector field (i.e. a formatted string field),
then parse that string field into an actual vector when you need it for the
calculation. This plugin does something similar:
https://github.com/saaay71/solr-vector-scoring. Note, however, that the plugin
will not work out of the box with the latest Solr version.

Best wishes,
Chee Yee
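
As a rough illustration of the idea (outside Solr, and not how the plugin is
implemented), the string-to-vector parsing and the normalized dot product
could look like the sketch below. The comma-separated field format and all
function names are assumptions made for this example.

```python
import math


def parse_vector(field_value):
    """Parse a comma-separated string field such as "0.1,0.7,0.2" into floats."""
    return [float(x) for x in field_value.split(",")]


def cosine_similarity(a, b):
    """Normalized dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


def most_similar(query_value, docs, top_n=3):
    """Rank (doc_id, field_value) pairs by similarity to the query vector."""
    q = parse_vector(query_value)
    scored = [(doc_id, cosine_similarity(q, parse_vector(v))) for doc_id, v in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Scores close to 1 indicate documents whose topic distributions point in
nearly the same direction.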

On Thu, 8 Aug 2019 at 01:07, Arnold Bronley  wrote:

> Hi,
>
> I have a topic vector calculated for each Solr document in a
> collection. The topic vector is calculated using LDA (
> https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation).  Now I want to
> return the documents similar to a given document from this collection. I can
> simply use the normalized dot product between the given vector and all other
> vectors to see which ones have a product of ~1. That will tell me that those
> are very similar documents. Is there a way to achieve this using Solr?
>


Re: Solr index

2019-08-08 Thread Dario Rigolin
Did you know that your Solr is open to the internet? It would be better to
filter the port, or at least not to post the full address here...

On Thu, 8 Aug 2019 at 15:58, HTMLServices.it <
i...@htmlservices.it> wrote:

> Hi everyone
> I installed Solr on a test server (centos 7) to get the fastest searches
> on dovecot, Solr and new for me and I think I didn't understand how it
> works perfectly.
> I installed following the official guide on the dovecot wiki:
> https://wiki2.dovecot.org/Plugins/FTS/Solr
> but I can't get it to work properly.
>
> This is my installation that I made public provisionally without a
> password:
> http://5.39.2.59:8987/solr/#/
> (I changed port because the default one was busy)
>
> I believe that the index is not created, should it be created
> automatically? or did I do something wrong?
>
> if I run one of these two commands as a guide
> curl http://5.39.2.59:8987/solr/dovecot/update?optimize=true
> curl http://5.39.2.59:8987/solr/dovecot/update?commit=true
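> I get
>
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader">
>   <int name="status">0</int>
>   <int name="QTime">2</int>
> </lst>
> </response>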
>
> this is right? have I forgotten or am I wrong?
>
> Excuse the stupid questions but I'm seeing Solr for the first time
> thank you all
>


-- 

Dario Rigolin
Comperio srl - CTO
Mobile: +39 347 7232652 - Office: +39 0425 471482
Skype: dario.rigolin


Solr index

2019-08-08 Thread HTMLServices.it

Hi everyone
I installed Solr on a test server (CentOS 7) to get faster searches 
in Dovecot. Solr is new to me, and I don't think I have fully understood how it 
works.
I installed following the official guide on the dovecot wiki: 
https://wiki2.dovecot.org/Plugins/FTS/Solr

but I can't get it to work properly.

This is my installation, which I have temporarily made public without a password:
http://5.39.2.59:8987/solr/#/
(I changed the port because the default one was busy)

I believe that the index is not being created. Should it be created 
automatically, or did I do something wrong?


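If I run one of these two commands from the guide
curl http://5.39.2.59:8987/solr/dovecot/update?optimize=true
curl http://5.39.2.59:8987/solr/dovecot/update?commit=true
I get

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">2</int>
</lst>
</response>

Is this right? Have I forgotten something, or have I done something wrong?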

Excuse the stupid questions but I'm seeing Solr for the first time
thank you all
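
For reference, Solr's XML update response normally carries a responseHeader
with a status and a QTime, and a status of 0 means the request was accepted
without error. A script could check this as in the sketch below; the function
name is made up for this example, and a successful commit says nothing about
whether Dovecot has actually sent any documents to index.

```python
import xml.etree.ElementTree as ET


def update_succeeded(response_xml):
    """Return True when the responseHeader reports status 0."""
    root = ET.fromstring(response_xml)
    status = root.find("./lst[@name='responseHeader']/int[@name='status']")
    return status is not None and status.text.strip() == "0"
```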


Re: Lower case "or" is being treated as operator OR?

2019-08-08 Thread Steven White
Hi Chris,

I was able to fix the issue by adding a "lowercaseOperators" parameter set to
"false" to my request handler. Here is what my request handler looks like:

  

  explicit
  edismax
  *:*
  100
  true
  CC_UNIQUE_FIELD,CC_FILE_PATH,score
  CC_ALL_FIELDS_DATA
  xml
  false

  

So I am all set.  However, earlier you said  "lowercaseOperators" is set to
"false" by default for 8.1.  Looks like that's not the case.

Thanks.

Steven
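
To see which value of lowercaseOperators is actually in effect, the
echoParams=all and debug=true parameters suggested in the quoted reply below
can be added to the request. Here is a minimal sketch of building such a URL;
the host, core name and function name are placeholders, not values from this
thread.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query, **extra):
    """Build a select URL that echoes all params and returns debug output."""
    params = {"q": query, "echoParams": "all", "debug": "true"}
    # Extra request parameters override or extend the defaults above.
    params.update(extra)
    return f"{base_url}?{urlencode(params)}"


# Hypothetical host and core name.
url = debug_query_url(
    "http://localhost:8983/solr/mycore/select",
    "cat or dog",
    lowercaseOperators="false",
)
```

The responseHeader of the reply then lists every parameter Solr used, and the
debug section shows how the query string was parsed.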



On Wed, Aug 7, 2019 at 8:26 PM Chris Hostetter 
wrote:

>
> : I think by "what query parser" you mean this:
>
> no, that's the fieldType -- what i was refering to is that you are in fact
> using "edismax", but with solr 8.1 lowercaseOperators should default to
> "false", so my initial guess is probably wrong.
>
> : By "request parameter" I think you are asking what I'm sending to Solr?
> if
> : sow I'm sending it the raw text of "or" or "OR".  In case you mean my
> : request-handler, it is this:
>
> i mean all of it -- including any other request params your client may be
> sending to solr that overrides those defaults you just posted.
>
> the best thing to do to make sense of this is add
> "echoParams=all" and "debug=true" to your request, and show us the
> full response, along with some details of what docs in that result you
> don't expect to match, so we can look at:
>
> 1) what params come back in the responseHeader, so we can sanity check
> exactly what query string(s) are getting sent to solr, and that
> nothing is overriding lowercaseOperators, etc...
>
> 2) what comes back in the query debug section, so we can sanity check how
> your query strings are getting parsed
>
> 2) what the "explain" output looks like for those docs you are getting
> that you don't expect, so we can see why they matched.
>
>
> FWIW: you mentioned "My default operator is AND" ... but that's not
> visible in the requestHandler defaults you posted -- so where is it being
> set?  (maybe it's not being set like you think it is?)
>
>
>
> -Hoss
> http://www.lucidworks.com/
>


Re: NRT for new items in index

2019-08-08 Thread Updates Profimedia
Thank you for the interesting reply.

You confirmed our assumptions. Using two or more collections, 
as Jörn Franke said, is more complicated to develop for. So for now we will 
only try to split the collection across more shards and servers, and try to reduce commit 
times too.

I think that NRT times of about one minute are acceptable.

Thank you
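
For the commitWithin part of the advice quoted below, an indexing request
could be built roughly as follows. This is only a sketch: the collection URL
and function name are placeholders, and the default of 30000 ms corresponds
to the 30 seconds mentioned as a starting point. The request is built but not
sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(collection_url, docs, commit_within_ms=30000):
    """Build (but do not send) a JSON update request with commitWithin set."""
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{collection_url}/update?{query}"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; with a
one-minute target, `commit_within_ms=60000` would be the matching value.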


On 2019/08/06 19:59:49, Shawn Heisey  wrote: 
> On 7/31/2019 6:47 AM, profiuser wrote:
> > we have something about 400 000 000 items in a solr collection.
> > We have set up auto commit property for this collection to 15 minutes.
> > Is a big collection and we using some caches etc. Therefore we have big
> > autocommit value.
> 
> I would set autoCommit to 60 seconds (a value of 60000) with 
> openSearcher set to false.  This will not affect change visibility in 
> any way, but it will keep your transaction logs from becoming huge. 
> Commits that do NOT open a new searcher are very fast.
> 
> Then I would use autoSoftCommit as a failsafe on change visibility. 
> Start with a value between two and five minutes.
> 
> > This have disadvantage that we haven't NRT searches.
> > 
> > We would like to have NRT at least for searching for the newly added items.
> > 
> > We read about the new "Category Routed Aliases" functionality in Solr version
> > 8.1.
> > 
> > And we got an idea, that we could add to our collection schema field for
> > routing.
> > And at the time of indexing we check if item is new and to routing field we
> > set up value "new", or the item is older than some time period we set up
> > value to "old".
> > And we will have one category routed alias routedCollection, and there will
> > be 2 collections old and new.
> > 
> > If we index new item, router choose new collection and this item is inserted
> > to it. After some period we reindex item and we decide that this item is old
> > and to routing field we set up value "old". Router decide to update (insert)
> > item to collection old. But we expect that solr automatically check
> > uniqueness in all routed collections. And if solr found item in other
> > collection, than will be automatically deleted. But not !!!
> > 
> > Is this expected behaviour?
> 
> I know very little about the new routed collection capability, but in 
> general, I would not expect Solr to check more than one collection for 
> an existing ID value when it is indexing.  I don't think there's 
> anything happening at that level that even knows about other 
> collections.  If you want to split your index into hot and cold pieces, 
> you're probably going to need to have your indexing software be aware of 
> that and either figure out where to send deletes, or just send deletes 
> to all parts of the index.
> 
> What kind of lag time do you think about when you imagine near real time 
> indexing?  Note that extremely short NRT times may not be achievable, 
> especially with the large index you're using.  A good starting point in 
> my opinion is 30000, which is 30 seconds.
> 
> What I would do is use the autoCommit and autoSoftCommit settings that I 
> mentioned above, and include a "commitWithin" parameter on all indexing 
> requests.  The commitWithin would be for NRT.
> 
> Thanks,
> Shawn
> 


Re: Clustering error in Solr 8.2.0

2019-08-08 Thread Erick Erickson
This dependency was removed as part of 
https://issues.apache.org/jira/browse/SOLR-9079, so my guess is you’re pointing 
to an old version of the clustering code.

Best,
Erick

> On Aug 8, 2019, at 4:22 AM, Zheng Lin Edwin Yeo  wrote:
> 
> ObjectUtils



Clustering error in Solr 8.2.0

2019-08-08 Thread Zheng Lin Edwin Yeo
Hi,

I am currently working on the upgrade from Solr 7.7.1 to Solr 8.2.0.

For the clustering, I am using lingo3g. There was no error in the earlier
version, but I am getting this error in Solr 8.2.0, even though the
configurations are the same.

This is my configuration for clustering in solrconfig.xml:

  

  lingo
 com.carrotsearch.lingo3g.Lingo3GClusteringAlgorithm

  clustering/carrot2




  stc
  org.carrot2.clustering.stc.STCClusteringAlgorithm




  kmeans
  org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm

  

  


   none
  150
   json
   true
  text
  null

  true
  true
 default

 id
  
 subject
  
  resourcename
  
 content
  
  true

 200

 
 3
  
  20
  
  true
 20



  clustering

  


This is the error I am getting:

HTTP ERROR 500

Problem accessing /edm/calls/clustering. Reason:

Server Error

Caused by:

java.lang.NoClassDefFoundError: org/apache/commons/lang/ObjectUtils
at lingo3g.s.hashCode(Unknown Source)
at org.apache.commons.lang3.builder.HashCodeBuilder.append(HashCodeBuilder.java:848)
at org.apache.commons.lang3.builder.HashCodeBuilder.append(HashCodeBuilder.java:901)
at org.apache.commons.lang3.builder.HashCodeBuilder.appendArray(HashCodeBuilder.java:883)
at org.apache.commons.lang3.builder.HashCodeBuilder.append(HashCodeBuilder.java:846)
at org.apache.commons.lang3.ArrayUtils.hashCode(ArrayUtils.java:192)
at org.carrot2.util.resource.ResourceLookup.hashCode(ResourceLookup.java:201)
at java.util.HashMap.hash(Unknown Source)
at java.util.HashMap.get(Unknown Source)
at com.carrotsearch.lingo3g.Lingo3GClusteringAlgorithm.a(Unknown Source)
at com.carrotsearch.lingo3g.Lingo3GClusteringAlgorithm.init(Unknown Source)
at org.carrot2.core.ControllerUtils.init(ControllerUtils.java:52)
at org.carrot2.core.PoolingProcessingComponentManager$ComponentInstantiationListener.objectInstantiated(PoolingProcessingComponentManager.java:189)
at org.carrot2.core.PoolingProcessingComponentManager$ComponentInstantiationListener.objectInstantiated(PoolingProcessingComponentManager.java:170)
at org.carrot2.util.pool.SoftUnboundedPool.borrowObject(SoftUnboundedPool.java:83)
at org.carrot2.core.PoolingProcessingComponentManager.prepare(PoolingProcessingComponentManager.java:129)
at org.carrot2.core.Controller.process(Controller.java:342)
at org.carrot2.core.Controller.process(Controller.java:247)
at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.lambda$cluster$2(CarrotClusteringEngine.java:241)
at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.withContextClassLoader(CarrotClusteringEngine.java:557)
at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:240)
at org.apache.solr.handler.clustering.ClusteringComponent.process(ClusteringComponent.java:237)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:305)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2578)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:780)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:566)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:423)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:350)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1678)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at