KeeperErrorCode = NoNode for /collections/my-valid-collection/state.json

2018-07-27 Thread S G
Hi,


The following error is very commonly seen in Solr.

Does anybody know why that is so?

And does it mean the user should do something about it?


org.apache.solr.common.SolrException:
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
= NoNode for /collections/my-valid-collection/state.json
at 
org.apache.solr.handler.admin.ZookeeperInfoHandler$ZKPrinter.writeError(ZookeeperInfoHandler.java:544)
at 
org.apache.solr.handler.admin.ZookeeperInfoHandler$ZKPrinter.printZnode(ZookeeperInfoHandler.java:812)
at 
org.apache.solr.handler.admin.ZookeeperInfoHandler$ZKPrinter.print(ZookeeperInfoHandler.java:526)
at 
org.apache.solr.handler.admin.ZookeeperInfoHandler.handleRequestBody(ZookeeperInfoHandler.java:414)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
at 
org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)
at 
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:497)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:745)



Thanks

SG


Re: sharding and placement of replicas

2018-07-27 Thread Erick Erickson
bq. Could SolrCloud avoid putting multiple replicas of the same shard
on the same host when there are multiple nodes per host?

Yes, with some fiddling via "placement rules"; start here:
https://lucene.apache.org/solr/guide/6_6/rule-based-replica-placement.html

The idea (IIUC) is that you provide a "snitch" that identifies what
"rack" the Solr instance is on, and you can then define placement rules
that say "don't put more than one thingy on the same rack". "Thingy"
here is replica, shard, whatever as defined by other placement rules.
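As a rough (untested) sketch, a rule that keeps at most one replica of
any given shard on a node can be passed when the collection is created,
something like:

/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=2&rule=shard:*,replica:<2,node:*

(the collection name and counts are placeholders). Rack-style rules work
the same way once a snitch exposes a "rack" tag for each node.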

NOTE: pay particular attention to which version of Solr you're using
as I think this is changing pretty rapidly as part of the autoscaling
(7x) work.


Best,
Erick



On Fri, Jul 27, 2018 at 7:34 AM, Shawn Heisey  wrote:
> On 7/25/2018 3:49 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote:
>> I end up with four cores instead of two, as expected. The problem is that 
>> three of the four cores (col_shard1_0_replica_n5, col_shard1_0_replica0 and 
>> col_shard1_1_replica_n6) are *all on hostname1*. Only col_shard1_1_replica0 
>> was placed on hostname2.
> 
>> My question is: How can I tell Solr "avoid putting two replicas of the same 
>> shard on the same node"?
>
> Somehow I missed that there were three cores on host1 when you first
> described the problem.  Looking back, I see that you did have that
> information there.  I was more focused on the fact that host2 only had
> one core.  My apologies for not reading closely enough.
>
> Is this collection using compositeId or implicit?  I think it would have
> to be compositeId for a split to work correctly.  I wouldn't expect
> split to be supported on a collection with the implicit router.
>
> Are you running one Solr node per host?  If you have multiple Solr nodes
> (instances) on one host, Solr will have no idea that this is the case --
> the entire node identifier (including host name, port, and context path)
> is compared to distinguish nodes from each other.  The assumption in
> SolrCloud's internals is that each node is completely separate from
> every other node.  Running multiple nodes per host is only recommended
> when the heap requirements are *very* high, and in that situation,
> making sure that replicas are distributed properly will require extra
> effort.  For most installations, it is strongly recommended to only have
> one Solr node per physical host.
>
> If you are only running one Solr node per host, then the way it's
> behaving for you is certainly not the design intent, and sounds like a
> bug in SPLITSHARD.  Solr should try very hard to not place multiple
> replicas of one shard on the same *node*.
>
> A side question for devs that know about SolrCloud internals:  Could
> SolrCloud avoid putting multiple replicas of the same shard on the same
> host when there are multiple nodes per host?  It seems to me that it
> would not be supremely difficult to have SolrCloud detect a match in the
> host name and use that information to prefer nodes on different hosts
> when possible.  I am thinking about creating an issue for this enhancement.
>
> Thanks,
> Shawn
>


Re: Recent configuration change to our site causes frequent index corruption

2018-07-27 Thread Erick Erickson
bq: Error opening new searcher. exceeded limit of maxWarmingSearchers=2

Did you make sure that your indexing client isn't issuing commits all
the time? The other possible culprit (although I'd be very surprised)
is if you have your filterCache and queryResultCache autowarm settings
set extremely high. I usually start with 16 or so.
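For reference, those autowarm counts live on the cache definitions in
solrconfig.xml, roughly like this (the sizes are only an example):

<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="16"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="16"/>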

Best,
Erick

On Fri, Jul 27, 2018 at 10:02 AM, cyndefromva  wrote:
> That makes sense, the ulimit was too small and I've updated it.
>
> I'm just curious why there are still so many 503 errors being generated
> (Error - Rsolr::Error::Http - 503 Service Unavailable - retrying ...)
>
> Is it related to all the "Error opening new searcher. exceeded limit of
> maxWarmingSearchers=2, try again later" java exceptions, and if so, is there
> a way to reduce them?
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Can the export handler be used with the edismax or dismax query handler

2018-07-27 Thread Erick Erickson
What about cursorMark? That's designed to handle repeated calls with
increasing "start" parameters without bogging down.

https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html
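A rough sketch of how that would look here (untested, parameter values
are placeholders): keep the edismax query as-is, add a sort that ends on
the uniqueKey, start the cursor at *, and feed back the nextCursorMark
from each response:

q={!edismax qf='...'}European Art History
sort=score desc,id asc
fl=id
rows=1000
cursorMark=*

then repeat the request with cursorMark set to the returned
nextCursorMark until it stops changing.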

Best,
Erick

On Fri, Jul 27, 2018 at 9:47 AM, Tom Burton-West  wrote:
> Thanks Joel,
>
> My use case is that I have a complex edismax query (example below)  and the
> user wants to download the set of *all* search results (ids and some small
> metadata fields).   So they don't need the relevance ranking.  However, I
> need to somehow get the exact set that the complex edismax query matched.
>
> Should I try to write some code to rewrite  the logic of the edismax query
> with a complex boolean query or would it make sense for me to look at
> possibly modifying the export handler for my use case?
>
> Tom
>
> "q= _query_:"{!edismax
> qf='ocr^5+allfieldsProper^2+allfields^1+titleProper^50+title_topProper^30+title_restProper^15+title^10+title_top^5+title_rest^2+series^5+series2^5+author^80+author2^50+issn^1+isbn^1+oclc^1+sdrnum^1+ctrlnum^1+id^1+rptnum^1+topicProper^2+topic^1+hlb3^1+fullgeographic^1+fullgenre^1+era^1+'
> pf='title_ab^1+titleProper^1500+title_topProper^1000+title_restProper^800+series^100+series2^100+author^1600+author2^800+topicProper^200+fullgenre^200+hlb3^200+allfieldsProper^100+'
> mm='100%25' tie='0.9' } European Art History"
>
>
> On Thu, Jul 26, 2018 at 6:02 PM, Joel Bernstein  wrote:
>
>> The export handler doesn't allow sorting by score at this time. It only
>> supports sorting on fields. So the edismax qparser won't currently work
>> with the export handler.
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Thu, Jul 26, 2018 at 5:52 PM, Tom Burton-West 
>> wrote:
>>
>> > Hello all,
>> >
>> > I am completely new to the export handler.
>> >
>> > Can the export handler be used with the edismax or dismax query handler?
>> >
>> > I tried using local params:
>> >
>> > q= _query_:"{!edismax qf='ocr^5+allfields^1+titleProper^50'
>> > mm='100%25'
>> > tie='0.9' } art"
>> >
>> > which does not seem to be working.
>> >
>> > Tom
>> >
>>


Help on indexing nested documents in MongoDB

2018-07-27 Thread Wendy2
Hi fellow Solr users,

I am looking for a way to index nested documents in mongodb using Solr's
DataImportHandler. Are there any recommendations?

I googled around over the last two weeks and found the posts below. I
was able to index the top-level fields without any issue, but had trouble
indexing the nested objects.

In the past, I used mongo-connector to index simple JSON documents in
mongodb. But for deeply nested documents in MongoDB, I am looking for a more
powerful way to do it. 

Are there any recommendations or good posts on indexing nested documents in
MongoDB using Solr's DataImportHandler? Thanks!

=== tried by following these references ===

https://stackoverflow.com/questions/21450555/steps-to-connect-mongodb-and-solr-using-dataimporthandler



https://github.com/james75/SolrMongoImporter/blob/master/src/main/org/apache/solr/handler/dataimport/MongoMapperTransformer.java

https://github.com/5missions/mongoSolrImporter

https://stackoverflow.com/questions/21450555/steps-to-connect-mongodb-and-solr-using-dataimporthandler

https://mrstevenzhao.blogspot.com/2016/05/apache-solr-install-w-mongodb-indexing.html?showComment=1532376114861#c9086728334737074426

 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Recent configuration change to our site causes frequent index corruption

2018-07-27 Thread cyndefromva
That makes sense, the ulimit was too small and I've updated it. 

I'm just curious why there are still so many 503 errors being generated
(Error - Rsolr::Error::Http - 503 Service Unavailable - retrying ...)

Is it related to all the "Error opening new searcher. exceeded limit of
maxWarmingSearchers=2, try again later" java exceptions, and if so, is there
a way to reduce them?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Can the export handler be used with the edismax or dismax query handler

2018-07-27 Thread Tom Burton-West
Thanks Joel,

My use case is that I have a complex edismax query (example below)  and the
user wants to download the set of *all* search results (ids and some small
metadata fields).   So they don't need the relevance ranking.  However, I
need to somehow get the exact set that the complex edismax query matched.

Should I try to write some code to rewrite  the logic of the edismax query
with a complex boolean query or would it make sense for me to look at
possibly modifying the export handler for my use case?

Tom

"q= _query_:"{!edismax
qf='ocr^5+allfieldsProper^2+allfields^1+titleProper^50+title_topProper^30+title_restProper^15+title^10+title_top^5+title_rest^2+series^5+series2^5+author^80+author2^50+issn^1+isbn^1+oclc^1+sdrnum^1+ctrlnum^1+id^1+rptnum^1+topicProper^2+topic^1+hlb3^1+fullgeographic^1+fullgenre^1+era^1+'
pf='title_ab^1+titleProper^1500+title_topProper^1000+title_restProper^800+series^100+series2^100+author^1600+author2^800+topicProper^200+fullgenre^200+hlb3^200+allfieldsProper^100+'
mm='100%25' tie='0.9' } European Art History"


On Thu, Jul 26, 2018 at 6:02 PM, Joel Bernstein  wrote:

> The export handler doesn't allow sorting by score at this time. It only
> supports sorting on fields. So the edismax qparser won't currently work
> with the export handler.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Jul 26, 2018 at 5:52 PM, Tom Burton-West 
> wrote:
>
> > Hello all,
> >
> > I am completely new to the export handler.
> >
> > Can the export handler be used with the edismax or dismax query handler?
> >
> > I tried using local params:
> >
> > q= _query_:"{!edismax qf='ocr^5+allfields^1+titleProper^50'
> > mm='100%25'
> > tie='0.9' } art"
> >
> > which does not seem to be working.
> >
> > Tom
> >
>


Re: create collection from existing managed-schema

2018-07-27 Thread Chuming Chen
Yes, the command line with -d works.

Thanks,

Chuming


On Jul 27, 2018, at 7:49 AM, Alexandre Rafalovitch  wrote:

> For non-cloud, the schema is on the filesystem.
> 
> At least from the command line, you can specify the path to it with the -d
> flag when creating a new core. It will then be treated as a template to copy.
> 
> That is more of a trick than a production approach though.
> 
> Regards,
>Alex
> 
> 
> 
> On Wed, Jul 25, 2018, 1:04 PM Chuming Chen,  wrote:
> 
>> Hi All,
>> 
>> From Solr Admin interface, I have created a collection and added field
>> definitions. I can get its managed-schema from the Admin interface.
>> 
>> Can I use this managed-schema to create a new collection? If yes, how?
>> 
>> Thanks,
>> 
>> Chuming
>> 
>> 



Re: create collection from existing managed-schema

2018-07-27 Thread Shawn Heisey
On 7/25/2018 11:04 AM, Chuming Chen wrote:
> From Solr Admin interface, I have created a collection and added field 
> definitions. I can get its managed-schema from the Admin interface. 
>
> Can I use this managed-schema to create a new collection? If yes, how?

What Solr version?

The fact that you talk about "collection" suggests that you are running
in SolrCloud mode.  If you're not running in cloud mode, then
"collection" is not the correct terminology for an index.

If you're running in cloud mode, then your configuration for the
existing collection (which includes the schema) will be stored in
zookeeper, and you will be able to create a new collection that also
uses that same configuration.  You'll just have to figure out what the
name of the configuration is.  Keep in mind that any changes you make to
the configuration after that will affect *both* collections.
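As a sketch (the names here are placeholders), once you know the
configset name you can reference it directly when creating the new
collection:

/admin/collections?action=CREATE&name=newcollection&numShards=1&replicationFactor=1&collection.configName=myconf

The CLUSTERSTATUS action of the Collections API will show which config
name each existing collection is using.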

If you're not running in cloud mode, then creating a new core usually
involves placing a conf directory before trying to create the core - it
can't be done entirely via HTTP.

Thanks,
Shawn



Re: Configuring ZK Timeouts

2018-07-27 Thread Shawn Heisey
On 7/26/2018 8:58 AM, solrnoobie wrote:
> We are having problems with zk / solr node recovery and we are encountering
> this issue:
>
>  [   ] o.a.z.ClientCnxn Client session timed out, have not heard from server
> in 5003ms
>
> We have set the solr.xml zkClientTimeout to 30 secs.

The internal default for zkClientTimeout is 15 seconds, and recent
example configurations use a value of 30 seconds.  This value is used by
Solr to configure the zookeeper client session timeout.
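For reference, the setting normally lives in the <solrcloud> section of
solr.xml, along these lines (as in the stock examples):

<solrcloud>
  <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
</solrcloud>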

Assuming that this log entry is found in the Solr server log ... it
appears that zkClientTimeout is set to 5 seconds.  If Solr were ignoring
its configuration entirely, the session timeout would be 15 seconds, not
5 seconds ... so I believe that Solr *is* paying attention to some kind
of config.

Can you share your full solr.xml file as well as everything in "JVM
Args" on the admin UI dashboard?

What version of Solr is it?

The default value of 15 seconds is a relative eternity to a program like
Solr.  Unless system resources are inadequate for handling the load, I
would even expect 5 seconds to be plenty of time.

Thanks,
Shawn



Re: sharding and placement of replicas

2018-07-27 Thread Shawn Heisey
On 7/25/2018 3:49 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote:
> I end up with four cores instead of two, as expected. The problem is that 
> three of the four cores (col_shard1_0_replica_n5, col_shard1_0_replica0 and 
> col_shard1_1_replica_n6) are *all on hostname1*. Only col_shard1_1_replica0 
> was placed on hostname2.

> My question is: How can I tell Solr "avoid putting two replicas of the same 
> shard on the same node"?

Somehow I missed that there were three cores on host1 when you first
described the problem.  Looking back, I see that you did have that
information there.  I was more focused on the fact that host2 only had
one core.  My apologies for not reading closely enough.

Is this collection using compositeId or implicit?  I think it would have
to be compositeId for a split to work correctly.  I wouldn't expect
split to be supported on a collection with the implicit router.

Are you running one Solr node per host?  If you have multiple Solr nodes
(instances) on one host, Solr will have no idea that this is the case --
the entire node identifier (including host name, port, and context path)
is compared to distinguish nodes from each other.  The assumption in
SolrCloud's internals is that each node is completely separate from
every other node.  Running multiple nodes per host is only recommended
when the heap requirements are *very* high, and in that situation,
making sure that replicas are distributed properly will require extra
effort.  For most installations, it is strongly recommended to only have
one Solr node per physical host.

If you are only running one Solr node per host, then the way it's
behaving for you is certainly not the design intent, and sounds like a
bug in SPLITSHARD.  Solr should try very hard to not place multiple
replicas of one shard on the same *node*.

A side question for devs that know about SolrCloud internals:  Could
SolrCloud avoid putting multiple replicas of the same shard on the same
host when there are multiple nodes per host?  It seems to me that it
would not be supremely difficult to have SolrCloud detect a match in the
host name and use that information to prefer nodes on different hosts
when possible.  I am thinking about creating an issue for this enhancement.

Thanks,
Shawn



Re: Recent configuration change to our site causes frequent index corruption

2018-07-27 Thread Pure Host - Wolfgang Freudenberger

Hi,


You have to increase the open file limit for your SOLR user - you can
check it with ulimit -n (run as that user). It will likely show the
default of 1024.

To increase it, you have to raise the system limit in
/etc/security/limits.conf.

Add the following lines:

* hard nofile 102400
* soft nofile 102400
root hard nofile 102400
root soft nofile 102400
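After the Solr user logs out and back in (or the service is restarted),
running ulimit -n as that user should report the new limit.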


Cheers

Mit freundlichem Gruß / kind regards

Wolfgang Freudenberger
Pure Host IT-Services
Münsterstr. 14
48341 Altenberge
GERMANY
Tel.: (+49) 25 71 - 99 20 170
Fax: (+49) 25 71 - 99 20 171

Umsatzsteuer ID DE259181123

Informieren Sie sich über unser gesamtes Leistungsspektrum unter 
www.pure-host.de
Get our whole services at www.pure-host.de

On 27.07.2018 at 15:53, cyndefromva wrote:

I have a Rails 5 application that uses Solr to index and search our site. The
sunspot gem is used to integrate Ruby and Solr. It's a relatively small
site (no more than 100,000 records) and has moderate usage (except for the
googlebot).

Until recently we regularly received 503 errors; reloading the page
generally cleared it up, but that was not exactly the user experience we
wanted, so we added the following initializer to force a retry on failures:

Sunspot.session =
Sunspot::SessionProxy::Retry5xxSessionProxy.new(Sunspot.session)

As a result, about every third day the site locks up until we rebuild the
data directory (stop solr, move data directory to another location, start
solr, reindex).

At the point it starts failing I see a Java exception, "java.io.IOException:
Too many open files", in the Solr log file, and a SolrException (Error opening
new searcher) is returned to the user.

In the solrconfig.xml file we have autoCommit and autoSoftCommit set as
follows:

   <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
      <openSearcher>false</openSearcher>
   </autoCommit>

   <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
   </autoSoftCommit>

Which I believe means there should be a hard commit every 15 seconds.

But it appears to be calling commit more frequently. In the Solr log I see
the following commits written milliseconds apart:

   UpdateHandler start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

I also see the following written right below it:

PERFORMANCE WARNING: Overlapping onDeckSearchers=2

Note: maxWarmingSearchers is set to 2.


I would really appreciate any help I can get to resolve this issue.

Thank you!



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html










Recent configuration change to our site causes frequent index corruption

2018-07-27 Thread cyndefromva
I have a Rails 5 application that uses Solr to index and search our site. The
sunspot gem is used to integrate Ruby and Solr. It's a relatively small
site (no more than 100,000 records) and has moderate usage (except for the
googlebot).

Until recently we regularly received 503 errors; reloading the page
generally cleared it up, but that was not exactly the user experience we
wanted, so we added the following initializer to force a retry on failures:

Sunspot.session =
Sunspot::SessionProxy::Retry5xxSessionProxy.new(Sunspot.session)

As a result, about every third day the site locks up until we rebuild the
data directory (stop solr, move data directory to another location, start
solr, reindex).

At the point it starts failing I see a Java exception, "java.io.IOException:
Too many open files", in the Solr log file, and a SolrException (Error opening
new searcher) is returned to the user.

In the solrconfig.xml file we have autoCommit and autoSoftCommit set as
follows:

  <autoCommit>
     <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
     <openSearcher>false</openSearcher>
  </autoCommit>

  <autoSoftCommit>
     <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
  </autoSoftCommit>

Which I believe means there should be a hard commit every 15 seconds.

But it appears to be calling commit more frequently. In the Solr log I see
the following commits written milliseconds apart:

  UpdateHandler start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

I also see the following written right below it:

PERFORMANCE WARNING: Overlapping onDeckSearchers=2

Note: maxWarmingSearchers is set to 2.


I would really appreciate any help I can get to resolve this issue.

Thank you! 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: java.lang.OutOfMemoryError indexing xlsm and xlsx file

2018-07-27 Thread Andrea Gazzarini
Hi Mario, could you please share your settings (e.g. OS, JVM memory, 
System memory)?


Andrea

On 27/07/18 11:36, Bisonti Mario wrote:

Hello,
I get the error below when indexing an .xlsm or .xlsx file of 11 MB.

What can I do?

Thanks a lot
Mario

2018-07-27 11:08:25.634 WARN  (qtp1521083627-99) [   x:core_share] 
o.e.j.s.HttpChannel /solr/core_share/update/extract
java.lang.OutOfMemoryError
 at 
java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:188)
 at 
java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:180)
 at 
java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:147)
 at 
java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:660)
 at java.base/java.lang.StringBuilder.append(StringBuilder.java:195)
 at 
org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:302)
 at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
 at 
org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
 at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
 at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
 at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
 at 
org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
 at 
org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
 at 
org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
 at 
org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
 at 
org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
 at 
org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)
 at 
org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler.run(OOXMLTikaBodyPartHandler.java:147)
 at 
org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.handleEndOfRun(OOXMLWordAndPowerPointTextHandler.java:468)
 at 
org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.endElement(OOXMLWordAndPowerPointTextHandler.java:450)
 at 
org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
 at 
org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
 at 
java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
 at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1714)
 at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2879)
 at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
 at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
 at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:532)
 at 
java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
 at 
java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
 at 
java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
 at 
java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
 at 
java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
 at 
java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:324)
 at java.xml/javax.xml.parsers.SAXParser.parse(SAXParser.java:197)
 at 
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleGeneralTextContainingPart(AbstractOOXMLExtractor.java:506)
 at 
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processShapes(XSSFExcelExtractorDecorator.java:279)
 at 
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:185)
 at 
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135)
 at 
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:120)
 at 
org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:143)
 at 
org.apac

Re: problems with while loop in indexing - no results given when I use that

2018-07-27 Thread Chris Ulicny
The lucenenet project is a separate Apache project from Solr (and the
Lucene project as well).

You will have better luck getting helpful information on their mailing list
(https://cwiki.apache.org/confluence/display/LUCENENET/Mailing+Lists).

Best,
Chris

On Fri, Jul 27, 2018 at 8:40 AM - -  wrote:

> I use lucene.net to index the documents. My main aim was to be able to search
> and have the line number and the line of text returned in a document.
>
>
> Here's the code that indexes
>
> using (TextReader contentsReader = new StreamReader(fi.FullName))
> {
> doc.Add(new StringField("FullFileName", fi.FullName, Field.Store.YES));
> doc.Add(new StringField("LastModifiedDate", modDate, Field.Store.YES));
> //doc.Add(new TextField("Contents", contentsReader.ReadToEnd(),
> Field.Store.YES));
>
> int lineCount = 1;
> string line = String.Empty;
> while ((line = contentsReader.ReadLine()) != null)
> {
> doc.Add(new Int32Field("LineNo", lineCount, Field.Store.YES));
> doc.Add(new TextField("Contents", line, Field.Store.YES));
> lineCount++;
> }
>
> Console.ForegroundColor = ConsoleColor.Blue;
> Console.WriteLine("adding " + fi.Name);
> Console.ResetColor();
> writer.AddDocument(doc);
> }
>
>
> As you can see I add the filename, modified date, then I loop through all
> the lines in the file and add a TextField for each line.
>
>
> This is how I search:
>
> Lucene.Net.Store.Directory directory =
> Lucene.Net.Store.FSDirectory.Open(new System.IO.DirectoryInfo(indexDir));
> Lucene.Net.Search.IndexSearcher searcher = new
> Lucene.Net.Search.IndexSearcher(Lucene.Net.Index.DirectoryReader.Open(directory));
> TopScoreDocCollector collector = TopScoreDocCollector.Create(100, true);
> searcher.Search(query, collector);
> ScoreDoc[] hits1 = collector.GetTopDocs().ScoreDocs;
> for (int i = 0; i < hits1.Length; i++)
> {
> int docId = hits1[i].Doc;
> float score = hits1[i].Score;
>
> Lucene.Net.Documents.Document doc = searcher.Doc(docId);
>
> string result = "FileName: " + doc.Get("FullFileName") + "\n"+
> " Line No: " + doc.Get("LineNo") + "\n"+
> " Contents: " + doc.Get("Contents");
> }
>
>
> Yet my search results return 0 hits, whereas if I simply comment out
> that while loop and uncomment the commented line above I get results.
>
>
> What could be the problem?
>


java.lang.OutOfMemoryError indexing xlsm and xlsx file

2018-07-27 Thread Bisonti Mario
Hello,
I get the error below when indexing an .xlsm or .xlsx file of 11 MB.

What can I do?

Thanks a lot
Mario

2018-07-27 11:08:25.634 WARN  (qtp1521083627-99) [   x:core_share] 
o.e.j.s.HttpChannel /solr/core_share/update/extract
java.lang.OutOfMemoryError
at 
java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:188)
at 
java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:180)
at 
java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:147)
at 
java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:660)
at java.base/java.lang.StringBuilder.append(StringBuilder.java:195)
at 
org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:302)
at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
at 
org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
at 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
at 
org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
at 
org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
at 
org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
at 
org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
at 
org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
at 
org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)
at 
org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler.run(OOXMLTikaBodyPartHandler.java:147)
at 
org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.handleEndOfRun(OOXMLWordAndPowerPointTextHandler.java:468)
at 
org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.endElement(OOXMLWordAndPowerPointTextHandler.java:450)
at 
org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
at 
org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
at 
java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1714)
at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2879)
at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at 
java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:532)
at 
java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
at 
java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
at 
java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at 
java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at 
java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
at 
java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:324)
at java.xml/javax.xml.parsers.SAXParser.parse(SAXParser.java:197)
at 
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleGeneralTextContainingPart(AbstractOOXMLExtractor.java:506)
at 
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processShapes(XSSFExcelExtractorDecorator.java:279)
at 
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:185)
at 
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135)
at 
org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:120)
at 
org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:143)
at 
org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106)
at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at 
org.apache.ti

problems with while loop in indexing - no results given when I use that

2018-07-27 Thread - -
I use lucene.net to index the documents. My main aim was to be able to search and
have the line number and the line of text returned in a document.


Here's the code that indexes

using (TextReader contentsReader = new StreamReader(fi.FullName))
{
doc.Add(new StringField("FullFileName", fi.FullName, Field.Store.YES));
doc.Add(new StringField("LastModifiedDate", modDate, Field.Store.YES));
//doc.Add(new TextField("Contents", contentsReader.ReadToEnd(), 
Field.Store.YES));

int lineCount = 1;
string line = String.Empty;
while ((line = contentsReader.ReadLine()) != null)
{
doc.Add(new Int32Field("LineNo", lineCount, Field.Store.YES));
doc.Add(new TextField("Contents", line, Field.Store.YES));
lineCount++;
}

Console.ForegroundColor = ConsoleColor.Blue;
Console.WriteLine("adding " + fi.Name);
Console.ResetColor();
writer.AddDocument(doc);
}


As you can see I add the filename, modified date, then I loop through all the 
lines in the file and add a TextField for each line.


This is how I search:

Lucene.Net.Store.Directory directory = Lucene.Net.Store.FSDirectory.Open(new 
System.IO.DirectoryInfo(indexDir));
Lucene.Net.Search.IndexSearcher searcher = new 
Lucene.Net.Search.IndexSearcher(Lucene.Net.Index.DirectoryReader.Open(directory));
TopScoreDocCollector collector = TopScoreDocCollector.Create(100, true);
searcher.Search(query, collector);
ScoreDoc[] hits1 = collector.GetTopDocs().ScoreDocs;
for (int i = 0; i < hits1.Length; i++)
{
int docId = hits1[i].Doc;
float score = hits1[i].Score;

Lucene.Net.Documents.Document doc = searcher.Doc(docId);

string result = "FileName: " + doc.Get("FullFileName") + "\n"+
" Line No: " + doc.Get("LineNo") + "\n"+
" Contents: " + doc.Get("Contents");
}


Yet my search results return 0 hits, whereas if I simply comment out
that while loop and uncomment the commented line above I get results.


What could be the problem?


Re: Recent configuration change to our site causes frequent index corruption

2018-07-27 Thread Shawn Heisey

On 7/26/2018 1:32 PM, cyndefromva wrote:

At the point it starts failing I see a Java exception, "java.io.IOException:
Too many open files", in the Solr log file, and a SolrException (Error opening
new searcher) is returned to the user.


The operating system where Solr is running needs its open file limit 
increased.  Exactly how to do this will depend on what OS it is.  Most 
Linux systems need to have /etc/security/limits.conf edited.
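Typically that means adding lines along these lines for the account that
runs Solr (the user name and the limit are only examples):

solr hard nofile 65535
solr soft nofile 65535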



But it appears to be calling commit more frequently. In the solr log I see
the following commit written miliseconds from each other:

   UpdateHandler start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

I also see the following written right below it:

PERFORMANCE WARNING: Overlapping onDeckSearchers=2


You have autoCommit with openSearcher set to false.  The commit log 
entry you've pasted is openSearcher=true ... it is not coming from 
autoCommit.


The overlapping onDeckSearchers is also not being caused by autoCommit 
-- that will not open a new searcher.  The commit log also has 
softCommit=false, which means that it is an explicit hard commit, most 
likely coming from your indexing application.  If autoSoftCommit or 
commitWithin were happening, it would be a soft commit.


Thanks,
Shawn



Re: change the ranking function

2018-07-27 Thread Shawn Heisey

On 7/25/2018 10:46 PM, Reem wrote:

The way I found to change the ranking function is by setting the similarity 
property of text fields in schema.xml as follows:
``

However, this means we can only set the similarity/ranking function at
indexing time. As Solr is built over Lucene, which allows changing the ranking
function at search time, I find it not logical that Solr doesn't support it, so
it seems I'm missing something here!


That setting will change the similarity used for both index and query.  
I am not aware of any way in Solr to specify a different similarity for 
index than what is used for query.  It wouldn't surprise me to learn 
that this is possible when writing Lucene code directly.
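For what it's worth, the per-field-type setting looks roughly like this
in the schema (the similarity class is only an example; per-field
similarity also needs the global similarity to be SchemaSimilarityFactory,
which recent versions use by default):

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  ...
  <similarity class="solr.LMDirichletSimilarityFactory"/>
</fieldType>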


Thanks,
Shawn



Re: create collection from existing managed-schema

2018-07-27 Thread Alexandre Rafalovitch
For non-cloud, the schema is on the filesystem.

At least from the command line, you can specify the path to it with the -d
flag when creating a new core. It will then be treated as a template to copy.
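For example (the paths and names are placeholders):

bin/solr create -c mynewcore -d /path/to/existing/conf

where the conf directory holds the managed-schema and solrconfig.xml you
want to start from.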

That is more of a trick than a production approach though.

Regards,
Alex



On Wed, Jul 25, 2018, 1:04 PM Chuming Chen,  wrote:

> Hi All,
>
> From Solr Admin interface, I have created a collection and added field
> definitions. I can get its managed-schema from the Admin interface.
>
> Can I use this managed-schema to create a new collection? If yes, how?
>
> Thanks,
>
> Chuming
>
>


Re: Edismax and ShingleFilterFactory exponential term grow

2018-07-27 Thread Jokin Cuadrado
Ok, I thought that it was somehow expected, but what bothers me is that if
I use min and max = 2 or min and max = 3, it grows linearly, but when I
change to min = 2 and max = 3, the number of tokens explodes.

What I expected it to do was to emit the 2-shingle clauses first
and then the 3-shingle ones, making something like:

text_shingles:word1_word2 text_shingles:word2_word3
text_shingles:word3_word4   text_shingles:word1_word2_word3
text_shingles:word2_word3_word4, i

Actually, if I analyze the field the output is ok, but when that
information is used to create the query it creates a lot of groups.


But when the query gets built it explodes with many clauses.

For example, the term "text_shingles:word4 word5" appears 4 times, and as
the query grows the same term repeats even more, when I thought that each term
should appear only once in the query.

5 words:
"parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
+text_shingles:word2 +text_shingles:word3 +text_shingles:word4
+text_shingles:word5) (+text_shingles:word1 +text_shingles:word2
+text_shingles:word3 +text_shingles:word4 word5) (+text_shingles:word1
+text_shingles:word2 +text_shingles:word3 word4 +text_shingles:word5)
(+text_shingles:word1 +text_shingles:word2 +text_shingles:word3 word4
word5) (+text_shingles:word1 +text_shingles:word2 word3
+text_shingles:word4 +text_shingles:word5) (+text_shingles:word1
+text_shingles:word2 word3 +text_shingles:word4 word5)
(+text_shingles:word1 +text_shingles:word2 word3 word4
+text_shingles:word5) (+text_shingles:word1 word2 +text_shingles:word3
+text_shingles:word4 +text_shingles:word5) (+text_shingles:word1 word2
+text_shingles:word3 +text_shingles:word4 word5) (+text_shingles:word1
word2 +text_shingles:word3 word4 +text_shingles:word5)
(+text_shingles:word1 word2 +text_shingles:word3 word4 word5)
(+text_shingles:word1 word2 word3 +text_shingles:word4
+text_shingles:word5) (+text_shingles:word1 word2 word3 +text_shingles:word4
word5",



On Fri, Jul 27, 2018 at 1:38 AM, Erick Erickson 
wrote:

> This is doing exactly what it should. It'd be a little clearer if you
> used a tokenSeparator other than the default space. Then this line:
>
> text_shingles:word1 word2 word3+text_shingles:word4 word5
>
> would look more like this:
> text_shingles:word1_word2_word3+text_shingles:word4_word5
>
> It's building a query from all of the 1, 2 and 3 grams. You're getting
> the single tokens because outputUnigrams defaults to "true".
>
> So of course as the number of terms in the query grows the number of
> clauses in the parsed query grows non-linearly.
>
> Best,
> Erick
>
> On Thu, Jul 26, 2018 at 12:44 PM, Jokin C 
> wrote:
> > Hi, I have a problem and I don't know if it's something that I am doing
> > wrong or if it's maybe a bug. I want to query a field with shingles, the
> > field and type definition are this:
> >
> >  > stored="false"/>
> >
> >  > positionIncrementGap="100">
> > 
> >   
> >   
> >> maxShingleSize="3" />
> > 
> >   
> >
> >
> > I'm using Solr  7.2.1.
> >
> > I just wanted to have different min and max shingle sizes to test how it
> > works, but if the query is long Solr is giving timeouts, high CPU and
> OOM.
> >
> > the query I'm using is this:
> >
> > http://localhost:8983/solr/ntnx/select?debugQuery=on&q={!
> edismax%20%20qf=%22text_shingles%22%20}%22%20word1%
> 20word2%20word3%20word4%20word5%20word6%20word7
> >
> > and the parsed query grows like this with just 4 words, when I use a
> query
> > with a lot of words it fails.
> >
> > 2 words:
> > "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> > +text_shingles:word2) text_shingles:word1 word2)))",
> >
> > 3words:
> > "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> > +text_shingles:word2 +text_shingles:word3) (+text_shingles:word1
> > +text_shingles:word2 word3) (+text_shingles:word1 word2
> > +text_shingles:word3) text_shingles:word1 word2 word3)))",
> >
> > 4 words:
> > "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> > +text_shingles:word2 +text_shingles:word3 +text_shingles:word4)
> > (+text_shingles:word1 +text_shingles:word2 +text_shingles:word3 word4)
> > (+text_shingles:word1 +text_shingles:word2 word3 +text_shingles:word4)
> > (+text_shingles:word1 +text_shingles:word2 word3 word4)
> > (+text_shingles:word1 word2 +text_shingles:word3 +text_shingles:word4)
> > (+text_shingles:word1 word2 +text_shingles:word3 word4)
> > (+text_shingles:word1 word2 word3 +text_shingles:word4",
> >
> > 5 words:
> > "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> > +text_shingles:word2 +text_shingles:word3 +text_shingles:word4
> > +text_shingles:word5) (+text_shingles:word1 +text_shingles:word2
> > +text_shingles:word3 +text_shingles:word4 word5) (+text_shingles:word1
> > +text_shingles:word2 +text_shingles:word3 word4 +text_shingles:word5)
> > (+text_shingles:word1 +text_shingles:word2 +text_shingles:word3 word4
> > word5) (+text_shingles:word1 +text_shingles:wo

Re: Problem in QueryElevationComponent with solr 7.4.0

2018-07-27 Thread nc-tech-user
Hi there.
Are there any ideas?

From: nc-tech-user 
Sent: 19 July 2018 11:09
To: solr-user@lucene.apache.org
Subject: Problem in QueryElevationComponent with solr 7.4.0

Hello.


We are using solr 6.6.2 and want to upgrade it to version 7.4.0.

But we have a problem with QueryElevationComponent when adding the parameters
"elevateIds=..." and "fl=[elevated]"


Example query:
/solr/products/select?omitHeader=true&elevateIds=1,2,3,4,5&q=*:*&start=0&rows=20&fl=id,[elevated]&enableElevation=true&fq=category_1_id_is:123​&forceElevation=true​

and in response we get an HTTP 500 error with the following stack trace:


java.lang.AssertionError: Expected an IndexableField but got: class 
java.lang.String
at 
org.apache.solr.response.transform.BaseEditorialTransformer.getKey(BaseEditorialTransformer.java:72)
at 
org.apache.solr.response.transform.BaseEditorialTransformer.transform(BaseEditorialTransformer.java:52)
at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:123)
at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:59)
at 
org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:276)
at 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:162)
at 
org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:209)
at 
org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:325)
at 
org.apache.solr.response.JSONWriter.writeResponse(JSONResponseWriter.java:120)
at org.apache.solr.response.JSONResponseWriter.write(JSONResponseWriter.java:71)
at 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:787)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:524)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:531)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:760)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:678)
at java.lang.Thread.run(Thread.java:748)


The configuration of the select request handler in solrconfig.xml is:




explicit
10


Re: change the ranking function

2018-07-27 Thread Joël Trigalo
Hi,

It is not possible in general because similarities compute norms at
index time. (
https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/similarities/Similarity.java#L46
)
My understanding is that you should duplicate the field and set a different
similarity on the new field in order to be able to change the similarity
per query. If someone has a better idea, I am also interested.
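A rough schema sketch of that idea (all names are made up): define two
field types that differ only in their similarity, copy the same text into
both fields, and pick the field at query time:

<fieldType name="text_bm25" class="solr.TextField"> ... </fieldType>
<fieldType name="text_lmd" class="solr.TextField">
  ...
  <similarity class="solr.LMDirichletSimilarityFactory"/>
</fieldType>

<field name="body" type="text_bm25" indexed="true" stored="true"/>
<field name="body_lmd" type="text_lmd" indexed="true" stored="false"/>
<copyField source="body" dest="body_lmd"/>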


On Thu, Jul 26, 2018 at 8:51 AM Reem  wrote:

> Hello,
>
> Is it possible to change the ranking function (e.g., BM25Similarity,
> ClassicSimilarity, LMDirichletSimilarity, etc) in search time?
>
> The way I found to change the ranking function is by setting the
> similarity property of text fields in schema.xml as follows:
> ``
>
> However, this means we can only set the similarity/ranking function at
> indexing time. As Solr is built over Lucene, which allows changing the
> ranking function at search time, I find it not logical that Solr doesn't
> support it, so it seems I'm missing something here!
>
> Any idea on how to achieve this?
>
> Reem
>