Re: Dependency log4j-slf4j-impl for solr-core:7.5.0 causing a number of build problems

2020-01-16 Thread David Smiley
Ultimately if you deduce the problem, file a JIRA issue and share it with
me; I will look into it.  I care about this matter too; I hate having to
exclude logging dependencies on the consuming end.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Jan 15, 2020 at 9:03 PM Wolf, Chris (ELS-CON) 
wrote:

> I am having several issues due to the slf4j implementation dependency
> “log4j-slf4j-impl” being declared as a dependency of solr-core:7.5.0.   The
> first issue observed when starting the app is this:
>
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/Users/ma-wolf2/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.7/log4j-slf4j-impl-2.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/Users/ma-wolf2/.m2/repository/ch/qos/logback/logback-classic/1.1.3/logback-classic-1.1.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type
> [org.apache.logging.slf4j.Log4jLoggerFactory]
>
> I first got wind that this might not be just myself from this thread:
>
> https://lucene.472066.n3.nabble.com/log4j-slf4j-impl-dependency-in-solr-core-td4449635.html#a4449891
>
>
>   *   If there are any users that integrate solr-core into their own code,
> it's currently a bit of a land-mine situation to change logging
> implementations.  If there's a way we can include log4j jars at build
> time, but remove the log4j dependency on the published solr-core
> artifact, that might work well.  We should do our best to make it so
> people can use EmbeddedSolrServer without log4j jars.
>
> There are two dimensions to this dependency problem:
>
>   *   Building a war file (this runs with a warning)
>   *   Building a spring-boot executable JAR with embedded servlet
> container (doesn’t run)
>
> When building a WAR and deploying, I get the “multiple SLF4J bindings”
> warning, but the app works. However, I want the convenience of a
> spring-boot executable JAR with embedded servlet container, but in that
> case, I get that warning followed by a fatal NoClassDefFoundError/
> ClassNotFoundException – which is a show-stopper.  If I hack the built
> spring-boot FAT jar and remove “log4j-slf4j-impl.jar” then the app works.
>
> For the WAR build, the proper version of log4j-slf4j-impl.jar was included
> (2.11.0), but for some reason, when building the spring-boot fat (uber) jar,
> it was built with log4j-slf4j-impl:2.7, so of course it croaks.
>
> There are several issues:
>
>   1.  I don’t want log4j-slf4j-impl at all
>   2.  Somehow the version of “log4j-slf4j-impl” being used for the build
> is 2.7 rather than the expected 2.11.0
>   3.  Due to the version issue, the app croaks with
> ClassNotFoundException: org.apache.logging.log4j.util.ReflectionUtil
>
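> For issue #1, I tried:
>   <dependency>
>     <groupId>org.apache.solr</groupId>
>     <artifactId>solr-core</artifactId>
>     <version>7.5.0</version>
>     <exclusions>
>       <exclusion>
>         <groupId>org.apache.logging.log4j</groupId>
>         <artifactId>log4j-slf4j-impl</artifactId>
>       </exclusion>
>     </exclusions>
>   </dependency>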
>
> All to no avail, as that dependency ends up in the packaged build. For the
> WAR it’s version 2.11.0, so even though it’s a bad build, the app runs.
> When building a spring-boot executable JAR with embedded webserver, for
> some reason log4j-slf4j-impl switches from version 2.11.0 to 2.7
> (2.11.0 works, but should not even be there).
>
> I also tried this:
>
> https://docs.spring.io/spring-boot/docs/current/maven-plugin/examples/exclude-dependency.html
>
> …that didn’t work either.
>
> I’m thinking that solr-core should have declared “log4j-slf4j-impl” with a
> scope of “provided”, but that’s only conjecture about a possible solution
> going forward. Does anyone know how I can exclude “log4j-slf4j-impl” from a
> spring-boot build?
>
>
>
>
>
>


Re: Failed to connect to server

2020-01-16 Thread David Hastings
>  'Error: Solr core is loading'

do you have any suggesters or anything configured that would get rebuilt?



On Thu, Jan 16, 2020 at 3:41 PM rhys J  wrote:

> On Thu, Jan 16, 2020 at 3:27 PM Edward Ribeiro 
> wrote:
>
> > A regular update is a delete followed by an indexing of the document. So
> > technically both are indexes. :) If there's an atomic update (
> > https://lucene.apache.org/solr/guide/8_4/updating-parts-of-documents.html
> > ), Solr would throw some sort of version conflict exception like
> >
> >
> These would have been atomic updates running at the same time I was
> importing a csv file into another core.
>
> After the connection errors, I noticed in the log that there was an error
> from a curl statement that said 'Error: Solr core is loading'
>
> > The connection refused exception does not seem related to the indexing by
> > itself. Maybe it has to do with you hitting the maximum connection requests
> > allowed per host. See in the link below the maxConnectionsPerHost and
> > maxConnections parameters of your Solr version:
> >
> > https://lucene.apache.org/solr/guide/6_6/format-of-solr-xml.html#Formatofsolr.xml-The%3CshardHandlerFactory%3Eelement
> >
> >
> Thank you for this. This was helpful. I have increased the number of
> maxConnections to see if this fixes the problem.
>
> Rhys
>


Re: Error while updating: java.lang.NumberFormatException: empty String

2020-01-16 Thread rhys J
On Thu, Jan 16, 2020 at 3:10 PM Edward Ribeiro 
wrote:

> Hi,
>
> The JSON snippet sends status_code as a string containing a single space.
> Maybe the field is an integer?
>
> Best,
> Edward
>
>
Oh wow, yes you are right. When I adjusted the status_code to not be a
space, it fixed everything.

I had forgotten that status_code was an integer.

It turned out that a database update had an error, and the status_code was
not entered. So my script is now handling whether the status_code is empty,
and adjusting accordingly.
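
For anyone hitting the same error, the check described above can be sketched roughly as follows. This is a minimal Python sketch, not the actual script: the helper name `atomic_update` and the `INT_FIELDS` set are assumptions made for illustration. It relies on the atomic-update rule that `set` with `null` removes a field's value, so a blank integer is sent as `null` instead of `' '`.

```python
import json

# Assumption: the integer-typed fields in the schema; only status_code is
# known from this thread.
INT_FIELDS = {"status_code"}

def atomic_update(doc_id, values, int_fields=INT_FIELDS):
    """Build one atomic-update document; blank integer values become null."""
    doc = {"id": doc_id}
    for field, value in values.items():
        if field in int_fields and (value is None or str(value).strip() == ""):
            value = None  # serialized as JSON null, which clears the field
        doc[field] = {"set": value}
    return doc

payload = json.dumps([atomic_update("636628-242",
                                    {"status_code": " ", "debt_type": "COM"})])
print(payload)
```

The resulting string can be passed to curl with `-d` in place of the hand-built payload.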

Thanks,

Rhys


Re: Failed to connect to server

2020-01-16 Thread rhys J
On Thu, Jan 16, 2020 at 3:27 PM Edward Ribeiro 
wrote:

> A regular update is a delete followed by an indexing of the document. So
> technically both are indexes. :) If there's an atomic update (
> https://lucene.apache.org/solr/guide/8_4/updating-parts-of-documents.html
> ), Solr would throw some sort of version conflict exception like
>
>
These would have been atomic updates running at the same time I was
importing a csv file into another core.

After the connection errors, I noticed in the log that there was an error
from a curl statement that said 'Error: Solr core is loading'

> The connection refused exception does not seem related to the indexing by
> itself. Maybe it has to do with you hitting the maximum connection requests
> allowed per host. See in the link below the maxConnectionsPerHost and
> maxConnections parameters of your Solr version:
>
>
> https://lucene.apache.org/solr/guide/6_6/format-of-solr-xml.html#Formatofsolr.xml-The%3CshardHandlerFactory%3Eelement
>
>
Thank you for this. This was helpful. I have increased the number of
maxConnections to see if this fixes the problem.

Rhys


Re: SolrCloud upgrade concern

2020-01-16 Thread David Hastings
Ha, I'm on that thread; I didn't know they got stored on a site. That's good to
know!

I stand by what I said in there, so I have nothing more to add.

On Thu, Jan 16, 2020 at 3:29 PM Arnold Bronley 
wrote:

> Hi,
>
> I am trying to upgrade my system from Solr master-slave architecture to
> SolrCloud architecture.
> Meanwhile, I stumbled upon this very negative post about SolrCloud.
>
>
> https://lucene.472066.n3.nabble.com/A-Last-Message-to-the-Solr-Users-td4452980.html
>
>
> Given that it is from one of the initial authors of SolrCloud
> functionality, I am having second thoughts about the upgrade and I am
> somewhat concerned.
>
> I will greatly appreciate any advice/feedback on this from Solr community.
>


SolrCloud upgrade concern

2020-01-16 Thread Arnold Bronley
Hi,

I am trying to upgrade my system from Solr master-slave architecture to
SolrCloud architecture.
Meanwhile, I stumbled upon this very negative post about SolrCloud.

https://lucene.472066.n3.nabble.com/A-Last-Message-to-the-Solr-Users-td4452980.html


Given that it is from one of the initial authors of SolrCloud
functionality, I am having second thoughts about the upgrade and I am
somewhat concerned.

I will greatly appreciate any advice/feedback on this from Solr community.


Re: Failed to connect to server

2020-01-16 Thread Edward Ribeiro
A regular update is a delete followed by an indexing of the document. So
technically both are indexes. :) If there's an atomic update (
https://lucene.apache.org/solr/guide/8_4/updating-parts-of-documents.html
), Solr would throw some sort of version conflict exception like

{"error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","org.apache.solr.common.SolrException"],
"msg":"version conflict for aaa expected=99 actual=1632740120218042368",
"code":409}}
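
A script that drives the updates could tell this case apart from other failures by looking at the error code. A minimal Python sketch, assuming the response body has the shape shown above; the function name `is_version_conflict` is made up for illustration:

```python
import json

def is_version_conflict(body):
    """Return True if a Solr JSON error body carries code 409."""
    try:
        return json.loads(body)["error"]["code"] == 409
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not shaped like the error response shown above.
        return False

sample = ('{"error":{"msg":"version conflict for aaa expected=99 '
          'actual=1632740120218042368","code":409}}')
print(is_version_conflict(sample))
```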

The connection refused exception does not seem related to the indexing by
itself. Maybe it has to do with you hitting the maximum connection requests
allowed per host. See in the link below the maxConnectionsPerHost and
maxConnections parameters of your Solr version:

https://lucene.apache.org/solr/guide/6_6/format-of-solr-xml.html#Formatofsolr.xml-The%3CshardHandlerFactory%3Eelement

Other than that, it can be related to connection issues with the VM,
containers, etc, I guess.
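
If the refusals turn out to be transient, one client-side workaround is to retry with a growing pause. This is only a sketch of that idea in Python, not something Solr provides: the helper name `retry` and its defaults are assumptions. It catches `OSError`, which covers `ConnectionRefusedError` and the `URLError` raised by `urllib.request`, so it will also retry on HTTP error responses.

```python
import time

def retry(action, attempts=5, first_delay=0.5, sleep=time.sleep):
    """Call action() until it succeeds, doubling the wait after each OSError."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2
```

`action` would be whatever function sends the update request; the `sleep` parameter exists only so the pause can be replaced in tests.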

Edward

On Thu, Jan 16, 2020 at 1:45 PM rhys J  wrote:
>
> I have noticed that if I am using curl to index a csv file *and* using curl
> thru a script to update the Solr cores, that I get the following error:
>
> curl: (7) Failed to connect to 10.40.10.14 port 8983: Connection refused
>
> Can I only index *or* update, but not do both?
>
> I am not running shards or collections, just a standalone set of cores.
>
> Thanks,
>
> Rhys


Re: Error while updating: java.lang.NumberFormatException: empty String

2020-01-16 Thread Edward Ribeiro
Hi,

The JSON snippet sends status_code as a string containing a single space.
Maybe the field is an integer?

Best,
Edward


On Thu, Jan 16, 2020 at 2:06 PM rhys J  wrote:

> While updating my Solr core, I ran into a problem with this curl statement.
>
> When I looked up the error, the only reference I could find was that maybe
> a float was being added as null. So I changed all the float fields from
> 'null' to '0.00'. But I still get the error.
>
> Float fields as per the schema:
>
> adjust_int
>
> adjust_princ
>
> cur_bal_original_currency
>
> manual_orig_balance
>
> orig_int_amt
>
> orig_princ_amt
>
> princ_paid
>
> Curl statement:
>
> curl http://localhost:8983/solr/debt/update?commit=true -d "[{ 'id':
> '636628-242', 'adjust_int': {'set': '0.00'},'adjust_princ': {'set':
> '0.00'},'clt_id': {'set': '3017'},'clt_ref_no': {'set':
> '1057-43261-9/128694'},'comments': {'set': ' '},'contract_number': {'set':
> '1057-43261-9'},'cur_bal_original_currency': {'set':
> '0.00'},'currency_conv': {'set': '0.00'},'debt_descr': {'set': 'PO/XREF:
> 994042088'},'debt_id': {'set': '636628'},'debt_no': {'set':
> '242'},'debt_type': {'set': 'COM'},'delq_date': {'set':
> '2020-01-30T00:00:00Z'},'internal_adjustment': {'set':
> '0'},'invoice_currency': {'set': null},'last_spreadsheet_date': {'set':
> null},'list_date': {'set': '2019-12-31T00:00:00Z'},'manual_change': {'set':
> null},'manual_orig_balance': {'set': '0.00'},'orig_clt': {'set':
> '3017'},'orig_int_amt': {'set': '0.00'},'orig_princ_amt': {'set':
> '480.00'},'original_invoice': {'set': null},'potential_bad_debt': {'set':
> '0'},'primary_debtor_id': {'set': null},'princ_paid': {'set':
> '0.00'},'reference_no': {'set':
> 'invoice:1057-43261-9/128694'},'reg_number': {'set': null},'salesperson':
> {'set': 'Bob Drummond'},'serv_date': {'set':
> '2019-12-31T00:00:00Z'},'shipper_name': {'set': null},'status_code':
> {'set': ' '},'status_date': {'set':
> '2020-01-16T00:00:00Z'},'storage_account': {'set': '0'},'time_stamp':
> {'set': '2019-12-31T23:35:00Z'},}]"
>
>
> Thanks,
>
>
> Rhys
>


Re: Coming back to search after some time... SOLR or Elastic for text search?

2020-01-16 Thread Nicolas Paris
> We have implemented the content ingestion and processing pipelines already
> in python and SPARK, so most of the data will be pushed in using APIs.

I use the spark-solr library in production and have looked at the ES
equivalent. The solr connector looks much more advanced for both
loading and fetching data. In particular, the fetching part uses the solr
export handler, which makes things incredibly fast. Also, spark-solr uses
the dataframe API while ES looks to be stuck with the RDD API, AFAIK.

A good connector to Spark opens up a lot of possibilities in terms of data
transformation and advanced machine learning features within the search
engine.

On Tue, Jan 14, 2020 at 11:02:17PM -0500, Dc Tech wrote:
> I am a SOLR fan and had implemented it in our company over 10 years ago.
> I moved away from that role and the new search team in the meanwhile
> implemented a proprietary (and expensive) nosql style search engine. That
> project did not go well, and now I am back on the project and reviewing the
> technology stack.
> 
> Some of the team think that ElasticSearch could be a good option,
> especially since we can easily get hosted versions with AWS where we have
> all the contractual stuff sorted out.
> 
> While SOLR definitely seems more advanced (LTR, streaming expressions,
> graph, and all the knobs and dials for relevancy tuning), Elastic may be
> sufficient for our needs. It does not seem to have LTR out of the box but
> the relevancy tuning knobs and dials seem to be similar to what SOLR has.
> 
> The corpus size is not a challenge - we have about one million documents,
> of which about 1/2 have full text, while the rest are simpler (e.g. company
> directory etc.).
> The query volumes are also quite low (max 5/second at peak).
> We have implemented the content ingestion and processing pipelines already
> in python and SPARK, so most of the data will be pushed in using APIs.
> 
> I would really appreciate any guidance from the community !!

-- 
nicolas


Re: Solr issue with Sitecore 9.0.1

2020-01-16 Thread Jan Høydahl
You have to provide a lot more information in order to get help.

Java version, solr version, solr config (env vars), much more logs than what 
you posted (something is crashing)

Jan Høydahl

> On 16 Jan 2020, at 12:22, Lakshmana Gudivada (AU) wrote:
> 
> Hi Team,
> 
> Recently I installed Solr 6.6.2 on my local desktop along with Sitecore 
> 9.0.1. I installed the https version of Solr using nssm 2.24 and am trying to 
> browse https://localhost:8983/Solr
> 
> Unfortunately the web page throws the error below, and the solr.log file 
> has the same error.
> 
> Can someone please help with this error? It needs to be fixed asap so 
> we can get our local setup up and running.
> 
> 2020-01-16 10:10:08.508 ERROR (qtp1330278544-27) [   ] o.a.s.s.SolrDispatchFilter Error processing the request. CoreContainer is either not initialized or shutting down.
> 2020-01-16 10:10:08.508 WARN  (qtp1330278544-27) [   ] o.e.j.s.ServletHandler
> javax.servlet.UnavailableException: Error processing the request. CoreContainer is either not initialized or shutting down.
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:321)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>     at org.eclipse.jetty.server.Server.handle(Server.java:534)
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>     at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:202)
>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>     at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>     at java.lang.Thread.run(Unknown Source)
> 
> 
> Much appreciate your help in resolving this.
> 
> Thanks
> 
> Lakshmana Prasanna Kumar Gudivada
> Senior Consultant
> Digital Business
> NTT
> 
> T: +61 (3) 70253006
> M: +91 9032443591
> E: lakshmana.gudiv...@global.ntt
> hello.global.ntt
> 
> This email and all contents are subject to the following disclaimer:
> https://hello.global.ntt/en-us/email-disclaimer
> 
> 
> 
> 
> 


Error while updating: java.lang.NumberFormatException: empty String

2020-01-16 Thread rhys J
While updating my Solr core, I ran into a problem with this curl statement.

When I looked up the error, the only reference I could find was that maybe
a float was being added as null. So I changed all the float fields from
'null' to '0.00'. But I still get the error.

Float fields as per the schema:

adjust_int

adjust_princ

cur_bal_original_currency

manual_orig_balance

orig_int_amt

orig_princ_amt

princ_paid

Curl statement:

curl http://localhost:8983/solr/debt/update?commit=true -d "[{ 'id':
'636628-242', 'adjust_int': {'set': '0.00'},'adjust_princ': {'set':
'0.00'},'clt_id': {'set': '3017'},'clt_ref_no': {'set':
'1057-43261-9/128694'},'comments': {'set': ' '},'contract_number': {'set':
'1057-43261-9'},'cur_bal_original_currency': {'set':
'0.00'},'currency_conv': {'set': '0.00'},'debt_descr': {'set': 'PO/XREF:
994042088'},'debt_id': {'set': '636628'},'debt_no': {'set':
'242'},'debt_type': {'set': 'COM'},'delq_date': {'set':
'2020-01-30T00:00:00Z'},'internal_adjustment': {'set':
'0'},'invoice_currency': {'set': null},'last_spreadsheet_date': {'set':
null},'list_date': {'set': '2019-12-31T00:00:00Z'},'manual_change': {'set':
null},'manual_orig_balance': {'set': '0.00'},'orig_clt': {'set':
'3017'},'orig_int_amt': {'set': '0.00'},'orig_princ_amt': {'set':
'480.00'},'original_invoice': {'set': null},'potential_bad_debt': {'set':
'0'},'primary_debtor_id': {'set': null},'princ_paid': {'set':
'0.00'},'reference_no': {'set':
'invoice:1057-43261-9/128694'},'reg_number': {'set': null},'salesperson':
{'set': 'Bob Drummond'},'serv_date': {'set':
'2019-12-31T00:00:00Z'},'shipper_name': {'set': null},'status_code':
{'set': ' '},'status_date': {'set':
'2020-01-16T00:00:00Z'},'storage_account': {'set': '0'},'time_stamp':
{'set': '2019-12-31T23:35:00Z'},}]"


Thanks,


Rhys


Failed to connect to server

2020-01-16 Thread rhys J
I have noticed that if I am using curl to index a csv file *and* using curl
thru a script to update the Solr cores, that I get the following error:

curl: (7) Failed to connect to 10.40.10.14 port 8983: Connection refused

Can I only index *or* update, but not do both?

I am not running shards or collections, just a standalone set of cores.

Thanks,

Rhys


Re: Dependency log4j-slf4j-impl for solr-core:7.5.0 causing a number of build problems

2020-01-16 Thread Wolf, Chris (ELS-CON)
--- original message ---
It looks to me as though solr-core is not the only artifact with that
dependency.  The first thing I would do is examine the output of 'mvn
dependency:tree' to see what has dragged log4j-slf4j-impl in even when
it is excluded from solr-core. 
--- end of original message ---

Hi, that's the first thing I did and *only* solr-core is pulling in 
log4j-slf4j-impl, but there is more weirdness to this.  When I build as a WAR 
project, version 2.11.0 of log4j-slf4j-impl is pulled in, which results 
in the "multiple implementations" warning and is non-fatal.  

However, when building as a spring-boot executable jar, for some reason, it 
pulls in version 2.7 rather than 2.11.0, resulting in a fatal 
"ClassNotFoundException: org.apache.logging.log4j.util.ReflectionUtil"
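
To confirm which binding jars actually end up in the packaged artifact, the archive can be scanned for the class that the SLF4J warning reports. A minimal Python sketch, assuming the fat jar or WAR stores its libraries as nested `.jar` entries; the function name `find_slf4j_bindings` is made up for illustration:

```python
import io
import zipfile

# The class SLF4J 1.x looks for; the "multiple bindings" warning lists every
# jar that contains it.
BINDER = "org/slf4j/impl/StaticLoggerBinder.class"

def find_slf4j_bindings(archive):
    """Return the nested jars inside `archive` that ship an SLF4J binding."""
    hits = []
    with zipfile.ZipFile(archive) as outer:
        for name in outer.namelist():
            if not name.endswith(".jar"):
                continue
            # Nested jars are read into memory and opened as zip files.
            with zipfile.ZipFile(io.BytesIO(outer.read(name))) as inner:
                if BINDER in inner.namelist():
                    hits.append(name)
    return sorted(hits)
```

`archive` can be a path to the built jar or WAR, or any file-like object. More than one entry in the result means the "multiple SLF4J bindings" situation is present in the package itself.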

Thanks.

Here is the dependency tree:

com.elsevier::jar:1.1.7
+- org.springframework.boot:spring-boot-starter-web:jar:1.5.6.RELEASE:compile
|  +- org.springframework.boot:spring-boot-starter:jar:1.5.6.RELEASE:compile
|  |  +- org.springframework.boot:spring-boot:jar:1.5.6.RELEASE:compile
|  |  +- org.springframework.boot:spring-boot-autoconfigure:jar:1.5.6.RELEASE:compile
|  |  +- org.springframework.boot:spring-boot-starter-logging:jar:1.5.6.RELEASE:compile
|  |  +- org.springframework:spring-core:jar:4.3.10.RELEASE:compile
|  |  \- org.yaml:snakeyaml:jar:1.17:runtime
|  +- org.springframework.boot:spring-boot-starter-tomcat:jar:1.5.6.RELEASE:compile
|  |  +- org.apache.tomcat.embed:tomcat-embed-core:jar:8.5.16:compile
|  |  +- org.apache.tomcat.embed:tomcat-embed-el:jar:8.5.16:compile
|  |  \- org.apache.tomcat.embed:tomcat-embed-websocket:jar:8.5.16:compile
|  +- org.hibernate:hibernate-validator:jar:5.3.5.Final:compile
|  |  +- javax.validation:validation-api:jar:1.1.0.Final:compile
|  |  +- org.jboss.logging:jboss-logging:jar:3.3.1.Final:compile
|  |  \- com.fasterxml:classmate:jar:1.3.3:compile
|  +- com.fasterxml.jackson.core:jackson-databind:jar:2.8.9:compile
|  |  +- com.fasterxml.jackson.core:jackson-annotations:jar:2.8.0:compile
|  |  \- com.fasterxml.jackson.core:jackson-core:jar:2.8.9:compile
|  +- org.springframework:spring-web:jar:4.3.10.RELEASE:compile
|  |  +- org.springframework:spring-aop:jar:4.3.10.RELEASE:compile
|  |  +- org.springframework:spring-beans:jar:4.3.10.RELEASE:compile
|  |  \- org.springframework:spring-context:jar:4.3.10.RELEASE:compile
|  \- org.springframework:spring-webmvc:jar:4.3.10.RELEASE:compile
|     \- org.springframework:spring-expression:jar:4.3.10.RELEASE:compile
+- cglib:cglib-nodep:jar:3.0:runtime
+- org.apache.commons:commons-lang3:jar:3.1:compile
+- org.jdom:jdom2:jar:2.0.6:compile
+- com.example.somelibjar:jar:1.1.9:compile
|  +- org.apache.lucene:lucene-core:jar:7.5.0:compile
|  +- com.healthline:qpe:jar:1.1.5:compile
|  |  +- org.springframework:spring-context-support:jar:4.3.10.RELEASE:compile
|  |  +- org.springframework:spring-jdbc:jar:4.3.10.RELEASE:compile
|  |  |  \- org.springframework:spring-tx:jar:4.3.10.RELEASE:compile
|  |  \- org.apache.logging.log4j:log4j-core:jar:2.7:compile
|  +- com.exampe:somelib2:jar:1.1.4:compile
|  |  +- org.apache.solr:solr-core:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-analyzers-common:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-analyzers-kuromoji:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-analyzers-nori:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-analyzers-phonetic:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-backward-codecs:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-classification:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-codecs:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-expressions:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-grouping:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-highlighter:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-join:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-memory:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-misc:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-queries:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-queryparser:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-sandbox:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-spatial-extras:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-spatial3d:jar:7.5.0:compile
|  |  |  +- org.apache.lucene:lucene-suggest:jar:7.5.0:compile
|  |  |  +- com.carrotsearch:hppc:jar:0.8.1:compile
|  |  |  +- com.fasterxml.jackson.dataformat:jackson-dataformat-smile:jar:2.8.9:compile
|  |  |  +- com.github.ben-manes.caffeine:caffeine:jar:2.3.5:compile
|  |  |  +- com.google.guava:guava:jar:14.0.1:compile
|  |  |  +- com.google.protobuf:protobuf-java:jar:3.1.0:compile
|  |  |  +- com.lmax:disruptor:jar:3.4.0:compile
|  |  |  +- com.tdunning:t-digest:jar:3.1:compile
|  |  |  +- commons-codec:commons-codec:jar:1.10:compile
|  |  |  +- 

solr-diagnostics: utility for collecting info from the Solr installation

2020-01-16 Thread Radu Gheorghe
Hello Solr users :)

We just published a small tool that collects diagnostics information:
configs, logs, metrics API output, etc as well as system info (dmesg,
netstat, top...). I thought others might find it interesting, so
here's a short blog post that describes it:
https://sematext.com/blog/solr-diagnostics/

Oh, and by "just published", I mean about two years ago :) It needs
more love, for example it doesn't work on Windows yet (contributions
welcome!), but we've already used it on N clusters and found it
useful.

Please let me know if you have any questions or feedback. Or even
better, please open an issue or submit a PR :)

Thanks and best regards,
Radu
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


Re: How do I add multiple values for same field with DIH script?

2020-01-16 Thread O. Klein
Yes, field is multivalued.

I managed to add an array to the content_text field, and also comma-separated
values (e.g. "foo,bar"), but not a " list" like you normally see with a
multivalued field.



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: How do I add multiple values for same field with DIH script?

2020-01-16 Thread Edward Ribeiro
Hi,

Are you sure content_text is a multivalued field (i.e., field definition has
multiValued="true" in managed-schema)?

Edward


On Thu, Jan 16, 2020 at 08:42, O. Klein wrote:

> row.put('content_text', "hello");
> row.put('content_text', "this is a test");
> return row;
>
> will only return "this is a test"
>
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Dependency log4j-slf4j-impl for solr-core:7.5.0 causing a number of build problems

2020-01-16 Thread Mark H. Wood
On Thu, Jan 16, 2020 at 02:03:06AM +, Wolf, Chris (ELS-CON) wrote:
[snip]
> There are several issues:
> 
>   1.  I don’t want log4j-slf4j-impl at all
>   2.  Somehow the version of “log4j-slf4j-impl” being used for the build is 
> 2.7 rather than the expected 2.11.0
>   3.  Due to the version issue, the app croaks with ClassNotFoundException: 
> org.apache.logging.log4j.util.ReflectionUtil
> 
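> For issue #1, I tried:
>   <dependency>
>     <groupId>org.apache.solr</groupId>
>     <artifactId>solr-core</artifactId>
>     <version>7.5.0</version>
>     <exclusions>
>       <exclusion>
>         <groupId>org.apache.logging.log4j</groupId>
>         <artifactId>log4j-slf4j-impl</artifactId>
>       </exclusion>
>     </exclusions>
>   </dependency>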
> 
> All to no avail, as that dependency ends up in the packaged build. For the WAR 
> it’s version 2.11.0, so even though it’s a bad build, the app runs. When 
> building a spring-boot executable JAR with embedded webserver, for some 
> reason log4j-slf4j-impl switches from version 2.11.0 to 2.7 (2.11.0 
> works, but should not even be there).
> 
> I also tried this:
> https://docs.spring.io/spring-boot/docs/current/maven-plugin/examples/exclude-dependency.html
> 
> …that didn’t work either.
> 
> I’m thinking that solr-core should have declared “log4j-slf4j-impl” with a 
> scope of “provided”, but that’s only conjecture about a possible solution 
> going forward. Does anyone know how I can exclude “log4j-slf4j-impl” from a 
> spring-boot build?

It looks to me as though solr-core is not the only artifact with that
dependency.  The first thing I would do is examine the output of 'mvn
dependency:tree' to see what has dragged log4j-slf4j-impl in even when
it is excluded from solr-core.
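
Reading a long tree by eye is error-prone, so a small script can list every chain of artifacts that leads to the unwanted one. A minimal Python sketch, assuming the plain-text layout that 'mvn dependency:tree' prints (three-character indent cells, then "+- " or "\- "); the function name `paths_to` and the sample coordinates are made up for illustration and are not taken from the real build:

```python
import re

# One tree row: zero or more indent cells ("|  " or "   "), a branch marker
# ("+- " or "\- "), then the Maven coordinates.
NODE = re.compile(r"^((?:[| ]  )*)[+\\]- (\S+)$")

def paths_to(tree_text, artifact_id):
    """Return every root-to-node chain whose last node has the given artifactId."""
    stack, found = [], []
    for raw in tree_text.splitlines():
        line = raw.rstrip()
        if line.startswith("[INFO] "):
            line = line[len("[INFO] "):]
        match = NODE.match(line)
        if match:
            depth = len(match.group(1)) // 3 + 1
            coords = match.group(2)
        elif not stack and line.count(":") >= 3 and " " not in line:
            depth, coords = 0, line   # the project itself
        else:
            continue                  # unrelated log output or a wrapped line
        del stack[depth:]             # close branches deeper than this node
        stack.append(coords)
        parts = coords.split(":")
        if len(parts) > 1 and parts[1] == artifact_id:
            found.append(list(stack))
    return found

# Hypothetical sample, shaped like the output of 'mvn dependency:tree'.
SAMPLE = r"""
com.example:app:jar:1.0
+- com.example:somelib:jar:1.0:compile
|  +- org.apache.solr:solr-core:jar:7.5.0:compile
|  |  \- org.apache.logging.log4j:log4j-slf4j-impl:jar:2.7:compile
|  \- org.jdom:jdom2:jar:2.0.6:compile
\- org.apache.logging.log4j:log4j-core:jar:2.7:compile
"""

chains = paths_to(SAMPLE, "log4j-slf4j-impl")
for chain in chains:
    print(" -> ".join(chain))
```

Each printed chain shows where an exclusion would have to be declared: on the direct dependency at the start of the chain, not necessarily on solr-core itself.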

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu



Re: How do I add multiple values for same field with DIH script?

2020-01-16 Thread Mikhail Khludnev
Hello.
What about putting Arrays.asList("foo", "bar") ?
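
The reason only the last value survives is ordinary map behaviour: a second put under the same key replaces the first. The snippet below is only an analogy in Python, using a dict in place of the row map (assuming the row handed to the script behaves like a plain key/value map); it is not DIH script code.

```python
row = {}
row["content_text"] = "hello"
row["content_text"] = "this is a test"   # same key: replaces "hello"
overwritten = row["content_text"]

# One assignment carrying a list keeps both values under the key.
row["content_text"] = ["hello", "this is a test"]
multi = row["content_text"]

print(overwritten)
print(multi)
```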

On Thu, Jan 16, 2020 at 2:42 PM O. Klein  wrote:

> row.put('content_text', "hello");
> row.put('content_text', "this is a test");
> return row;
>
> will only return "this is a test"
>
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Coming back to search after some time... SOLR or Elastic for text search?

2020-01-16 Thread Emir Arnautović
Hi Jan,
Here is a blog post related to this topic: 
https://sematext.com/blog/solr-vs-elasticsearch-differences/ 

It also contains links to other resources that might help you make a decision.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 15 Jan 2020, at 05:02, Dc Tech  wrote:
> 
> I am a SOLR fan and had implemented it in our company over 10 years ago.
> I moved away from that role and the new search team in the meanwhile
> implemented a proprietary (and expensive) nosql style search engine. That
> project did not go well, and now I am back on the project and reviewing the
> technology stack.
> 
> Some of the team think that ElasticSearch could be a good option,
> especially since we can easily get hosted versions with AWS where we have
> all the contractual stuff sorted out.
> 
> While SOLR definitely seems more advanced (LTR, streaming expressions,
> graph, and all the knobs and dials for relevancy tuning), Elastic may be
> sufficient for our needs. It does not seem to have LTR out of the box but
> the relevancy tuning knobs and dials seem to be similar to what SOLR has.
> 
> The corpus size is not a challenge - we have about one million documents,
> of which about 1/2 have full text, while the rest are simpler (e.g. company
> directory etc.).
> The query volumes are also quite low (max 5/second at peak).
> We have implemented the content ingestion and processing pipelines already
> in python and SPARK, so most of the data will be pushed in using APIs.
> 
> I would really appreciate any guidance from the community !!



How do I add multiple values for same field with DIH script?

2020-01-16 Thread O. Klein
row.put('content_text', "hello");
row.put('content_text', "this is a test");
return row;

will only return "this is a test"
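
One possible explanation, offered as an assumption rather than something confirmed in this thread: the row behaves like a map, so the second put on the same key replaces the first. A workaround often suggested for multiValued fields is to put a single list value (in the DIH script, for example a java.util.ArrayList created inside the script). The sketch below only models the map semantics in Python, with a plain dict standing in for the DIH row; it is not DIH code.

```python
# A plain dict stands in for the DIH row (hypothetical model, not DIH itself).

# Two puts on the same key: the second value replaces the first.
row = {}
row["content_text"] = "hello"
row["content_text"] = "this is a test"
overwritten = dict(row)

# One put with a list value: both values survive under the same field name.
row = {}
values = []
values.append("hello")
values.append("this is a test")
row["content_text"] = values
multi_valued = dict(row)

print(overwritten)
print(multi_valued)
```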




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Solr issue with Sitecore 9.0.1

2020-01-16 Thread Lakshmana Gudivada (AU)
Hi Team,

I recently installed Solr 6.6.2 on my local desktop along with Sitecore 
9.0.1. I installed the HTTPS version of Solr using nssm 2.24 and am trying to 
browse https://localhost:8983/Solr

Unfortunately the web page shows the error below, and the solr.log file 
contains the same error.

Can someone please help with this error? It needs to be fixed as soon as 
possible so that we can get our local setup up and running.

2020-01-16 10:10:08.508 ERROR (qtp1330278544-27) [   ] o.a.s.s.SolrDispatchFilter Error processing the request. CoreContainer is either not initialized or shutting down.
2020-01-16 10:10:08.508 WARN  (qtp1330278544-27) [   ] o.e.j.s.ServletHandler
javax.servlet.UnavailableException: Error processing the request. CoreContainer is either not initialized or shutting down.
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:321)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:534)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:202)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.lang.Thread.run(Unknown Source)


I would much appreciate your help in resolving this.

Thanks

Lakshmana Prasanna Kumar Gudivada
Senior Consultant
Digital Business
NTT

T: +61 (3) 70253006
M: +91 9032443591
E: lakshmana.gudiv...@global.ntt
hello.global.ntt


Re: Update synonyms.txt file based on values in the database

2020-01-16 Thread Charlie Hull
Try looking into Managed Resources: 
https://lucene.apache.org/solr/guide/8_4/managed-resources.html
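
As a rough sketch of how that could be scripted: read term/synonym pairs from the database, group them, and PUT them as JSON to the managed synonyms endpoint described in that guide. The table layout (one term/synonym pair per row), the collection name "mycollection" and the resource name "english" are assumptions for illustration, not details from this thread. The request is only built here, not sent.

```python
import json
import urllib.request
from collections import defaultdict


def build_synonym_mapping(rows):
    """Group (term, synonym) rows into {term: [synonym, ...]}, skipping repeats."""
    mapping = defaultdict(list)
    for term, synonym in rows:
        if synonym not in mapping[term]:
            mapping[term].append(synonym)
    return dict(mapping)


def build_update_request(solr_url, collection, resource, mapping):
    """Prepare (but do not send) a PUT request for a managed synonym resource."""
    url = f"{solr_url}/{collection}/schema/analysis/synonyms/{resource}"
    body = json.dumps(mapping).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical rows, standing in for the result of a PostgreSQL query.
rows = [("mad", "angry"), ("mad", "upset"), ("tv", "television")]
mapping = build_synonym_mapping(rows)
request = build_update_request(
    "http://localhost:8983/solr", "mycollection", "english", mapping
)
# urllib.request.urlopen(request) would send the update to a running Solr.
```

For this to have any effect, the field type has to use the managed synonym filter bound to that resource name, and the collection needs a reload after the update.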


Charlie

On 15/01/2020 10:35, seeteshh wrote:

How do I update the synonyms.txt file if the data is fetched from a
database, say PostgreSQL? I won't be able to update the synonyms.txt file
manually every time, and the data lives in a table that is not known to
Solr.

I am using Apache Solr 8.4.

Regards,

Seetesh hindlekar



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html



--
Charlie Hull
OpenSource Connections, previously Flax

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com



Re: Coming back to search after some time... SOLR or Elastic for text search?

2020-01-16 Thread Charlie Hull

On 15/01/2020 11:42, Dc Tech wrote:

Thank you Jan and Charlie.

I should say that in terms of posting to the community regarding Elastic vs 
Solr - this is probably the most civil and helpful community that I have been a 
part of - and your answers have only reinforced that notion!

Thank you for your responses. I am glad to hear that both can do most of it, 
which was my gut feeling as well.

Charlie, to your point - the team probably feels that Elastic is easier to get 
started with, hence the preference, as well as the hosting options (with the 
caveats you noted). I agree with you completely that tech is not the real issue.

Jan, I agree with the points you made on team skills. On our previous 
proprietary engine, that was in fact the biggest issue - the engine was 
powerful enough and had good references. However, we were not able to exploit 
it to good effect.


Hi again,

The dirty secret that few will voice is that...most search engines are 
basically the same. Once you've worked on a search project you can apply 
those skills to any future search engine. This is why I'm currently 
focused on supporting the search team, not the search tech - how do you 
learn and improve those relevance tuning skills, considering it's 
really, really hard to recruit people with existing high-level search 
skills (and if you can find them you probably can't afford them).


Cheers

Charlie



Thank you again.


On Jan 15, 2020, at 5:10 AM, Jan Høydahl  wrote:

Hi,

Choosing the Solr community mailing list to ask for advice on whether to choose 
ES - you already know what to expect, don't you?
More often than not the choice comes down to policy, standardization, what 
skills you have in house etc. rather than ticking off feature checkboxes.
Sometimes company values may also drive a choice, e.g. Solr is 100% Apache and 
not open core, which may matter if you plan to get involved in the community 
and contribute features or patches.

However, if I were in your shoes as an architect evaluating the tech stack, and 
there was no clear choice based on the above, I'd do what projects normally do: 
ask what you really need from the engine. Maybe you have some features in your 
requirements list that make one a much better choice than the other. Or 
maybe after that exercise you are still wondering what to choose, in which case 
you just follow your gut feeling and make a choice :)

Jan


15. jan. 2020 kl. 10:07 skrev Charlie Hull :


On 15/01/2020 04:02, Dc Tech wrote:
I am a SOLR fan and implemented it in our company over 10 years ago.
I moved away from that role, and in the meantime the new search team
implemented a proprietary (and expensive) NoSQL-style search engine. That
project did not go well, and now I am back on the project and reviewing the
technology stack.

Some of the team think that ElasticSearch could be a good option,
especially since we can easily get hosted versions with AWS where we have
all the contractual stuff sorted out.

You can, but you should be aware that:
1. Amazon's hosted Elasticsearch isn't great, often lags behind the current 
version, doesn't allow plugins etc.
2. Amazon and Elastic are currently engaged in legal battles over who is the 
most open sourcey, who allegedly copied code that was 'open' but commercially 
licensed, who would like to capture the hosted search market... not sure how 
this will pan out (Google for details)
3. You can also buy fully hosted Solr from several places.

While SOLR definitely seems more advanced (LTR, streaming expressions,
graph, and all the knobs and dials for relevancy tuning), Elastic may be
sufficient for our needs. It does not seem to have LTR out of the box but
the relevancy tuning knobs and dials seem to be similar to what SOLR has.

Yes, they're basically the same under the hood (unsurprising as they're both 
based on Lucene). If you need LTR there's an ES plugin for that (disclaimer, my 
new employer built and maintains it: 
https://github.com/o19s/elasticsearch-learning-to-rank). I've lost track of the 
number of times I've been asked 'Elasticsearch or Solr, which should I choose?' 
and my current thoughts are:

1. Don't switch from one to the other for the sake of it.  Switching search 
engines rarely addresses underlying issues (content quality, team skills, 
relevance tuning methodology)
2. Elasticsearch is easier to get started with, but at some point you'll need 
to learn how it all works
3. Solr is harder to get started with, but you'll know more about how it all 
works earlier
4. Both can be used for most search projects, most features are the same, both 
can scale.
5. Lots of Elasticsearch projects (and developers) are focused on logs, which 
is often not really a 'search' project.


The corpus size is not a challenge - we have about one million documents,
of which about 1/2 have full text, while the rest are simpler (e.g. company
directory).
The query volumes are also quite low (max 5/second at peak).
We have