RE: Using the date field for searching

2015-08-11 Thread Bade, Vidya (Sagar)
You can use filter query and form the date as follows when a user enters just 
the year or year and month:

If just the year (1885) was entered - date:[1885-01-01T00:00:00Z TO 
1886-01-01T00:00:00Z]
If just the year and month (1885-06) were entered - date:[1885-06-01T00:00:00Z 
TO 1885-07-01T00:00:00Z]
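
A sketch of the full request (the collection and field names are placeholders,
not from this thread); note that an exclusive upper bound, written with },
avoids also matching the first instant of the following period:

  # placeholder names; POSTing lets curl handle the URL-encoding
  curl http://localhost:8983/solr/mycollection/select \
    -d "q=*:*" \
    --data-urlencode "fq=date:[1885-01-01T00:00:00Z TO 1886-01-01T00:00:00Z}"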

Alternatively, use DateRangeField as described at the bottom of the following 
page: 

https://cwiki.apache.org/confluence/display/solr/Working+with+Dates
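
As a sketch, assuming Solr 5.0 or later where DateRangeField is available
(names here are illustrative): once the field is typed as DateRangeField,
partial dates such as date:1885 or date:[1885-06 TO 1886] become valid
queries directly.

  <!-- illustrative schema.xml snippet; names are placeholders -->
  <fieldType name="daterange" class="solr.DateRangeField"/>
  <field name="date" type="daterange" indexed="true" stored="true"/>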

:Sagar

-Original Message-
From: Scott Derrick [mailto:sc...@tnstaafl.net] 
Sent: Tuesday, August 11, 2015 3:02 PM
To: solr-user@lucene.apache.org
Subject: Using the date field for searching

If I query date:1885

I get an error

org.apache.solr.common.SolrException: Invalid Date String:'1885'

If I query date:1885*

I get no results.

and yet there are numerous docs with a year of 1885 in the date string, like so

<arr name="date"><date>1885-02-08T00:00:00Z</date></arr>

if I query date:1885-02-08T00:00:00Z

I get 9 results??

Do the users really have to specify a full XML-compliant date string to use the 
date: field for searching?

thanks,

Scott


Re: Solr MLT with stream.body returns different results on each shard

2015-08-11 Thread Chris Hostetter

: I have a fresh install of Solr 5.2.1 with about 3 million docs freshly
: indexed (I can also reproduce this issue on 4.10.0). When I use the Solr
: MoreLikeThisHandler with content stream I'm getting different results per
: shard.

I haven't looked at the code recently, but I'm 99% certain that the MLT 
handler in general doesn't work with distributed (i.e. sharded) queries 
(unlike the MLT component and the recently added MLT qparser).

I suspect that in the specific case of stream.body, what you are seeing is 
that the interesting terms are being computed relative to the local tf/idf 
stats for that shard, and then only local results from that shard are 
being returned.

: I also looked at using a standard MLT query, but I need to be able to
: stream in a fairly large block of text for comparison that is not in the
: index (different type of document). A standard MLT query

Until/unless the MLT parser supports arbitrary text (there's some mention 
of this in SOLR-7639, but I'm not sure what the status of that is), you 
might find that just POSTing all of your text as a regular query (q) using 
dismax or edismax is suitable for your needs -- that's essentially the 
equivalent of what the MLT handler does with a stream.body, except that it 
tries to focus only on interesting terms based on tf/idf; if your fields 
are all configured with stopword files anyway, then the results and 
performance may be similar.
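
A sketch of that approach (the collection, field, and text are placeholders):

  # POST the large text block as an ordinary edismax query
  curl http://localhost:8983/solr/mycollection/select \
    --data-urlencode "q=...large block of text..." \
    -d "defType=edismax" -d "qf=text" -d "fl=id,score" -d "rows=10"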


-Hoss
http://www.lucidworks.com/


Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-11 Thread deniz
okay, to make everything clear, here are the steps:

- Creating configs etc and then running:

./zkcli.sh -cmd upconfig -n CoreA -d /path/to/core/configs/CoreA/conf/ -z
zk1:2181,zk2:2182,zk3:2183

- Then going to http://someserver:8983/solr/#/~cores

- Clicking Add Core:
http://lucene.472066.n3.nabble.com/file/n4222345/Screen_Shot_2015-08-11_at_14.png
 

Repeating the last step on the other node as well.

So is this invalid (including what https://wiki.apache.org/solr/CoreAdmin describes)? 



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Core-mismatch-in-org-apache-solr-update-StreamingSolrClients-Errors-for-ConcurrentUpdateSolrClient-tp4222335p4222345.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Solr old log files are not archived or removed automatically.

2015-08-11 Thread Adrian Liew
Hi Erick,

1. How did you install/run your Solr? As a service or regular? See
the reference guide, "Permanent Logging Settings", for some info on the 
difference there.

What is the difference between regular and service?

2. What does your log4j.properties file look like?

Here are the contents in the log4j.properties file:

#  Logging level
solr.log=logs
log4j.rootLogger=INFO, file, CONSOLE

log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender

log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%-4r [%t] %-5p %c %x 
[%X{collection} %X{shard} %X{replica} %X{core}] \u2013 %m%n

#- size rotation with log cleanup.
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9

#- File to log to and log format
#log4j.appender.file.File=${solr.log}/solr.log
log4j.appender.file.File=C:/solr_logs/solr.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd 
HH:mm:ss.SSS}; [%X{collection} %X{shard} %X{replica} %X{core}] %C; %m\n

log4j.logger.org.apache.zookeeper=WARN
log4j.logger.org.apache.hadoop=WARN

# set to INFO to enable infostream log messages
log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF

I am not sure how best I can limit the size of the solr_logs directory. Does 
log4j come with a feature to remove old log files with a given retention period?
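
Log4j 1.2's RollingFileAppender only caps total size (MaxFileSize times the
backup count); it has no retention-period cleanup. One workaround is a
scheduled task that prunes old files; a sketch, assuming the C:/solr_logs
location above:

  REM delete rotated solr logs older than 14 days (sketch; run as a scheduled task)
  forfiles /p C:\solr_logs /m solr.log.* /d -14 /c "cmd /c del @path"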

Best regards,

Adrian Liew 

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, August 10, 2015 11:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr old log files are not archived or removed automatically.

1. How did you install/run your Solr? As a service or regular? See
the reference guide, "Permanent Logging Settings", for some info on the 
difference there.

2. What does your log4j.properties file look like?

Best,
Erick

On Mon, Aug 10, 2015 at 12:13 AM, Adrian Liew adrian.l...@avanade.com wrote:
 Hi there,

 I am using Solr v.5.2.1 on my local machine. I realized that old log files 
 are not removed in a timely manner by log4j. The logs which I am referring to 
 are the log files that reside within solr_directory\server\logs. So far I 
 have previous two months' worth of log files accumulated in the log 
 directory. Consequently, this causes my log directory to grow very large. I 
 have to manually remove the old log files, which is undesirable.

 Is this a bug with Solr or a missing configuration that needs to be set?

 As far as I know, all Solr Logging configuration is done in the 
 solr_directory\server\resources\log4j.properties

 Appreciate the soonest reply.

 Thanks.


Re: Make search faster in Solr

2015-08-11 Thread Nitin Solanki
Okay davidphilip.

On Mon, Aug 10, 2015 at 8:24 PM davidphilip cherian 
davidphilipcher...@gmail.com wrote:

 Hi Nitin,

 32 shards for 16 million documents is too many; 2 shards should suffice,
 considering your document sizes are moderate. Caches need to be monitored
 and tuned accordingly. You can read up on caches here:

 https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig
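
For reference, a sketch of the cache entries that page covers, as they
appear in solrconfig.xml (the sizes here are illustrative, not
recommendations):

  <!-- illustrative sizes; tune against hit ratios in the admin UI -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>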



 On Mon, Aug 10, 2015 at 4:34 PM, Nitin Solanki nitinml...@gmail.com
 wrote:

  Hi,
  I have 32 shards with a single replica of each shard, across 4 nodes
  on Solr cloud.
  I have indexed 16 million documents. Without cache, total time taken to
  search a document is 0.2 second. And with cache is 0.04 second.
  I don't do anything of cache. Caches are set by default in
 solrconfig.xml.
 
  How can I make search faster without the cache? Or how can I make searching
  even faster with the cache? Which cache is used for searching?
 



Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-11 Thread deniz
thanks for the details Anshum :)

I got one more question: could this kind of error logging also be
triggered by the amount of incoming requests? I can see these errors only on
the prod env, but the testing env is totally fine, although the creation
process is exactly the same.



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Core-mismatch-in-org-apache-solr-update-StreamingSolrClients-Errors-for-ConcurrentUpdateSolrClient-tp4222335p4222348.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Concurrent Indexing and Searching in Solr.

2015-08-11 Thread Nitin Solanki
Hi Erick,
 Thanks a lot for your help. I will go through MongoDB.

On Mon, Aug 10, 2015 at 9:14 PM Erick Erickson erickerick...@gmail.com
wrote:

 bq:  I changed
 <maxWarmingSearchers>*2*</maxWarmingSearchers>
 to <maxWarmingSearchers>*100*</maxWarmingSearchers>. And apply simultaneous
 searching using 100 workers.

 Do not do this. This has nothing to do with the number of searcher
 threads. And with
 your update rate, especially if you continue to insist on adding
 commit=true to every
 update request, this will explode your memory requirements. To no good
 purpose
 whatsoever.

 bq: But MongoDB can handle concurrent searching and indexing faster.

 Because MongoDB is optimized for different kinds of operations. Solr
 is a ranking,
 free-text search engine. It's an apples-and-oranges comparison. If MongoDB
 meets your search needs, you should use it.

 Best,
 Erick

 On Sun, Aug 9, 2015 at 11:04 PM, Nitin Solanki nitinml...@gmail.com
 wrote:
  Hi,
   I used solr 5.2.1 version. It is fast, I think. But again, I am
 stuck
  on concurrent searching and threading. I changed
  <maxWarmingSearchers>*2*</maxWarmingSearchers>
  to <maxWarmingSearchers>*100*</maxWarmingSearchers>. And apply
 simultaneous
  searching using 100 workers. It works fast but not upto the mark.
 
  It brings searching down from 1.5 to 0.5 seconds. But if I run only a single
  worker then the search time is 0.03 seconds; that is very fast, but not
  possible with 100 workers running simultaneously.
 
  As Shawn said - Making 100 concurrent indexing requests at the same time
  as 100
  concurrent queries will overwhelm *any* single Solr server. I got your
  point.
 
  But MongoDB can handle concurrent searching and indexing faster. Then why
  not solr? Sorry for this..
 
 
 
  On Mon, Aug 10, 2015 at 2:39 AM Shawn Heisey apa...@elyograg.org
 wrote:
 
  On 8/7/2015 1:15 PM, Nitin Solanki wrote:
   I wrote a python script for indexing and using
   urllib and urllib2 for indexing data via http..
 
  There are a number of Solr python clients.  Using a client makes your
  code much easier to write and understand.
 
  https://wiki.apache.org/solr/SolPython
 
  I have no experience with any of these clients, but I can say that the
  one encountered most often when Python developers come into the #solr
  IRC channel is pysolr.  Our wiki page says the last update for pysolr
  happened in December of 2013, but I can see that the last version on
  their web page is dated 2015-05-26.
 
  Making 100 concurrent indexing requests at the same time as 100
  concurrent queries will overwhelm *any* single Solr server.  In a
  previous message you said that you have 4 CPU cores.  The load you're
  trying to put on Solr will require at *LEAST* 200 threads.  It may be
  more than that.  Any single system is going to have trouble with that.
  A system with 4 cores will be *very* overloaded.
 
  Thanks,
  Shawn
 
 



Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-11 Thread Anshum Gupta
How did you create your collections? Also, is that verbatim from the logs
or is it just because you obfuscated that part while posting it here?

On Mon, Aug 10, 2015 at 11:02 PM, deniz denizdurmu...@gmail.com wrote:

 Hello Anshum,

 thanks for the quick reply

 I know it is being forwarded from one node to the leader node, but for
 collection
 names, it shows different collections while master node address is correct.

 Dunno if I am missing some points but my concern is the bold parts below:

 ERROR - 2015-08-11 05:04:34.592; [*CoreA* shard1 core_node2 *CoreA*]
 org.apache.solr.update.StreamingSolrClients$1; error
 org.apache.solr.common.SolrException: Bad Request
 request:

 http://server:8983/solr/*CoreB*/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fserver2%3A8983%2Fsolr%2F*CoreB*%2F&wt=javabin&version=2

 So this is also normal?


 Anshum Gupta wrote
  Hi Deniz,
 
  Seems like the update that's being forwarded from a non-leader (the original
  node that received the request) is failing. This could be due to multiple
  reasons, including a mismatch between your schema and the document you sent.
 
  To elaborate more, here's how a typical batched request in SolrCloud
  works.
 
  1. Batch sent from client.
  2. Received by node X.
  3. All documents that have their shard leader on node X, are processed
 and
  distributed to the replicas by node X. All other documents which belong
 to
  a shard whose leader isn't on node X get forwarded using the
  ConcurrentUpdateSolrClient to their respective leaders.
 
  There's nothing *strange* about this log, other than the fact that the
  update failed (and would have failed even if you would have directly sent
  the document to this node). Hope this made things clear.
 
  --
  Anshum Gupta





 -
 Zeki ama calismiyor... Calissa yapar...
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Core-mismatch-in-org-apache-solr-update-StreamingSolrClients-Errors-for-ConcurrentUpdateSolrClient-tp4222335p4222338.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Anshum Gupta


RE: SolrNet and deep pagination

2015-08-11 Thread Adrian Liew
Thanks, Chris. We opted to use v0.5, which is an alpha version. And yes, I 
should be referring to the SolrNet Google Group.

Thanks for your help.

Regards,
Adrian

-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: Tuesday, August 11, 2015 5:17 AM
To: solr-user@lucene.apache.org
Cc: Chong Kah Heng chong.kah.h...@avanade.com
Subject: Re: SolrNet and deep pagination


: Has anyone worked with deep pagination using SolrNet? The SolrNet
: version that I am using is v0.4.0.2002. I followed up with this article,
: https://github.com/mausch/SolrNet/blob/master/Documentation/CursorMark.md
: , however the version of SolrNet.dll does not expose a StartOrCursor
: property in the QueryOptions class.


I don't know anything about SolrNet, but I do know that the URL you list above 
is for the documentation on the master branch.  If I try to look at the 
same document on the 0.4.x branch, that document doesn't exist -- suggesting 
the feature isn't supported in the version of SolrNet you are using...

https://github.com/mausch/SolrNet/blob/0.4.x/Documentation/CursorMark.md
https://github.com/mausch/SolrNet/tree/0.4.x/Documentation

In fact, if I search the repo for StartOrCursor, I see a file named 
StartOrCursor.cs exists on the master branch, but not on the 0.4.x branch...

https://github.com/mausch/SolrNet/blob/master/SolrNet/StartOrCursor.cs
https://github.com/mausch/SolrNet/blob/0.4.x/SolrNet/StartOrCursor.cs

...so it seems unlikely that this (class?) is supported in the release you are 
using.
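
For reference, the Solr-side cursor API that document wraps is just request
parameters, so it can be exercised without the client library (the collection
name is a placeholder):

  # first page; pass the returned nextCursorMark back as cursorMark to continue
  curl "http://localhost:8983/solr/mycollection/select?q=*:*&sort=id+asc&rows=100&cursorMark=*"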

Note: according to the docs, there is a SolrNet Google group where this 
question is probably most appropriate: 

https://github.com/mausch/SolrNet/blob/master/Documentation/README.md
https://groups.google.com/forum/#!forum/solrnet




-Hoss
http://www.lucidworks.com/


Re: Cluster down for long time after zookeeper disconnection

2015-08-11 Thread danny teichthal
1. Erick, thanks. I agree that it is really serious, but I think that the 3
minutes in this case were not necessary.
In my case it was a deadlock, which smells like some kind of bug.
One replica is waiting for other to come up, before it takes leadership,
while the other is waiting for the election results.
If I will be able to reproduce it on 5.2.1, is it legitimate to file a JIRA
issue for that?

2. Regarding session timeouts, there's something about configuration that I
don't understand.
 If zkClientTimeout is set to 30 seconds, how come I see in the log that the
session expired after ~50 seconds?
Maybe I have a mismatch between zookeeper and solr configuration?

3. Resuming the question of leaderVoteWait parameter, I have seen in a few
threads that it may be reduced to a minimum.
I'm not clear about the full meaning, but I understand that it is meant to
prevent loss of updates on cluster startup.
Can anyone confirm/clarify that?




Links for leaderVoteWait:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3ccajt9wnhivirpn79kttcn8ekafevhhmqwkfl-+i16kbz0ogl...@mail.gmail.com%3E

http://qnalist.com/questions/4812859/waitforleadertoseedownstate-when-leader-is-down

Relevant part from My zookeeper conf:
tickTime=2000
initLimit=10
syncLimit=5



On Tue, Aug 11, 2015 at 1:06 AM, Erick Erickson erickerick...@gmail.com
wrote:

 Not that I know of. With ZK as the one source of truth, dropping below
 quorum
 is Really Serious, so having to wait 3 minutes or so for action to be
 taken is the
 fallback.

 Best,
 Erick

 On Mon, Aug 10, 2015 at 1:34 PM, danny teichthal dannyt...@gmail.com
 wrote:
  Erick, I assume you are referring to zkClientTimeout, it is set to 30
  seconds. I also see these messages on Solr side:
   Client session timed out, have not heard from server in 48865ms for
  sessionid 0x44efbb91b5f0001, closing socket connection and attempting
  reconnect.
  So, I'm not sure what was the actual disconnection duration time, but it
  could have been up to a minute.
  We are working on finding the network issues root cause, but assuming
  disconnections will always occur, are there any other options to overcome
  this issues?
 
 
 
  On Mon, Aug 10, 2015 at 11:18 PM, Erick Erickson 
 erickerick...@gmail.com
  wrote:
 
  I didn't see the zk timeout you set (just skimmed). But if your
 Zookeeper
  was
  down _very_ temporarily, it may suffice to up the ZK timeout. The
 default
  in the 4.10 time-frame (if I remember correctly) was 15 seconds, which
 has
  proven to be too short in many circumstances.
 
  Of course, if your ZK was down for minutes this wouldn't help.
 
  Best,
  Erick
 
  On Mon, Aug 10, 2015 at 1:06 PM, danny teichthal dannyt...@gmail.com
  wrote:
   Hi Alexander ,
   Thanks for your reply, I looked at the release notes.
   There is one bug fix - SOLR-7503
   https://issues.apache.org/jira/browse/SOLR-7503 – register cores
   asynchronously.
   It may reduce the registration time since it is done on parallel, but
   still, 3 minutes (leaderVoteWait) is a long time to recover from a few
   seconds of disconnection.
  
   Except from that one I don't see any bug fix that addresses the same
   problem.
   I am able to reproduce it on 4.10.4 pretty easily, I will also try it
  with
   5.2.1 and see if it reproduces.
  
   Anyway, since migrating to 5.2.1 is not an option for me in the short
  term,
   I'm left with the question if reducing leaderVoteWait may help here,
 and
   what may be the consequences.
   If i understand correctly, there might be a chance of losing updates
 that
   were made on leader.
   From my side it is a lot worse to lose availability for 3 minutes.
  
   I would really appreciate a feedback on this.
  
  
  
  
   On Mon, Aug 10, 2015 at 6:55 PM, Alexandre Rafalovitch 
  arafa...@gmail.com
   wrote:
  
   Did you look at release notes for Solr versions after your own?
  
   I am pretty sure some similar things were identified and/or resolved
   for 5.x. It may not help if you cannot migrate, but would at least
   give a confirmation and maybe workaround on what you are facing.
  
   Regards,
  Alex.
   
   Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
   http://www.solr-start.com/
  
  
   On 10 August 2015 at 11:37, danny teichthal dannyt...@gmail.com
  wrote:
Hi,
We are using Solr cloud with solr 4.10.4.
On the passed week we encountered a problem where all of our
 servers
disconnected from zookeeper cluster.
This might be ok, the problem is that after reconnecting to
 zookeeper
  it
looks like for every collection both replicas do not have a leader
 and
   are
stuck in some kind of a deadlock for a few minutes.
   
From what we understand:
 One of the replicas assumes it will be the leader and at some point
   starting
to wait on leaderVoteWait, which is by default 3 minutes.
The other replica is stuck on this part of code for a few minutes:
 at
  
 

Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-11 Thread deniz
Hello Anshum,

thanks for the quick reply

I know it is being forwarded from one node to the leader node, but for collection
names, it shows different collections while master node address is correct.

Dunno if I am missing some points but my concern is the bold parts below:

ERROR - 2015-08-11 05:04:34.592; [*CoreA* shard1 core_node2 *CoreA*]
org.apache.solr.update.StreamingSolrClients$1; error
org.apache.solr.common.SolrException: Bad Request
request:
http://server:8983/solr/*CoreB*/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Fserver2%3A8983%2Fsolr%2F*CoreB*%2F&wt=javabin&version=2

So this is also normal?


Anshum Gupta wrote
 Hi Deniz,
 
 Seems like the update that's being forwarded from a non-leader (the original
 node that received the request) is failing. This could be due to multiple
 reasons, including a mismatch between your schema and the document you sent.
 
 To elaborate more, here's how a typical batched request in SolrCloud
 works.
 
 1. Batch sent from client.
 2. Received by node X.
 3. All documents that have their shard leader on node X, are processed and
 distributed to the replicas by node X. All other documents which belong to
 a shard whose leader isn't on node X get forwarded using the
 ConcurrentUpdateSolrClient to their respective leaders.
 
 There's nothing *strange* about this log, other than the fact that the
 update failed (and would have failed even if you would have directly sent
 the document to this node). Hope this made things clear.
 
 -- 
 Anshum Gupta





-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Core-mismatch-in-org-apache-solr-update-StreamingSolrClients-Errors-for-ConcurrentUpdateSolrClient-tp4222335p4222338.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-11 Thread Anshum Gupta
bq. adding it on admin interface of solr

Did you not use the Collections Admin API? If you try to create your own cores
using the core admin APIs instead of the Collections Admin API, you could
really end up shooting yourself in the foot. Also, the only supported
mechanism to create a collection in Solr is via the Collections API.
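
For reference, a sketch of the equivalent Collections API call for the setup
described in this thread (numShards and replicationFactor are illustrative):

  # create a collection from the config uploaded with "upconfig -n CoreA"
  curl "http://someserver:8983/solr/admin/collections?action=CREATE&name=CoreA&numShards=1&replicationFactor=2&collection.configName=CoreA"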

On Mon, Aug 10, 2015 at 11:13 PM, deniz denizdurmu...@gmail.com wrote:

 I created it by simply creating the configs, then using upconfig to upload them
 to ZooKeeper, then adding the core in the Solr admin interface.

 I have only changed the IPs of server and server1 and changed the
 core/collection names to CoreA and CoreB, in the logs CoreA and CoreB are
 different collections with different names.



 -
 Zeki ama calismiyor... Calissa yapar...
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Core-mismatch-in-org-apache-solr-update-StreamingSolrClients-Errors-for-ConcurrentUpdateSolrClient-tp4222335p4222341.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Anshum Gupta


Performance warning overlapping onDeckSearchers

2015-08-11 Thread Adrian Liew
Hi there,

Has anyone come across this issue, [some_index] PERFORMANCE WARNING: 
Overlapping onDeckSearchers=2?

I am currently using Solr v5.2.1.

What does this mean? Does this raise red flags?

I am currently encountering an issue whereby my Sitecore system is unable to 
update the index appropriately. I am not sure if this is linked to the warnings 
above.

Regards,
Adrian



Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-11 Thread deniz
I created it by simply creating the configs, then using upconfig to upload them
to ZooKeeper, then adding the core in the Solr admin interface.

I have only changed the IPs of server and server1 and changed the
core/collection names to CoreA and CoreB, in the logs CoreA and CoreB are
different collections with different names. 



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Core-mismatch-in-org-apache-solr-update-StreamingSolrClients-Errors-for-ConcurrentUpdateSolrClient-tp4222335p4222341.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Make search faster in Solr

2015-08-11 Thread Nitin Solanki
Hi davidphilip,
Without caching, can we do fast searching?

On Tue, Aug 11, 2015 at 11:43 AM Nitin Solanki nitinml...@gmail.com wrote:

 Okay davidphilip.

 On Mon, Aug 10, 2015 at 8:24 PM davidphilip cherian 
 davidphilipcher...@gmail.com wrote:

 Hi Nitin,

  32 shards for 16 million documents is too many; 2 shards should suffice,
  considering your document sizes are moderate. Caches need to be monitored
  and tuned accordingly. You can read up on caches here:

 https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig



 On Mon, Aug 10, 2015 at 4:34 PM, Nitin Solanki nitinml...@gmail.com
 wrote:

  Hi,
  I have 32 shards with a single replica of each shard, across 4 nodes
  on Solr cloud.
  I have indexed 16 million documents. Without cache, total time taken to
  search a document is 0.2 second. And with cache is 0.04 second.
  I don't do anything of cache. Caches are set by default in
 solrconfig.xml.
 
  How can I make search faster without the cache? Or how can I make searching
  even faster with the cache? Which cache is used for searching?
 




Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-11 Thread Anshum Gupta
It's not entirely invalid, but the only supported mechanism to create
collections is via the Collections Admin API:

https://cwiki.apache.org/confluence/display/solr/Collections+API



On Mon, Aug 10, 2015 at 11:53 PM, deniz denizdurmu...@gmail.com wrote:

 okay, to make everything clear, here are the steps:

 - Creating configs etc and then running:

 ./zkcli.sh -cmd upconfig -n CoreA -d /path/to/core/configs/CoreA/conf/ -z
 zk1:2181,zk2:2182,zk3:2183

 - Then going to http://someserver:8983/solr/#/~cores

 - Clicking Add Core:
 
 http://lucene.472066.n3.nabble.com/file/n4222345/Screen_Shot_2015-08-11_at_14.png
 

 Repeating the last step on the other node as well.

 So is this invalid (including what https://wiki.apache.org/solr/CoreAdmin describes)?



 -
 Zeki ama calismiyor... Calissa yapar...
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Core-mismatch-in-org-apache-solr-update-StreamingSolrClients-Errors-for-ConcurrentUpdateSolrClient-tp4222335p4222345.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Anshum Gupta


Deadlock-like behavior when new IndexWriter created

2015-08-11 Thread Andrii Berezhynskyi
Hi,

I have solr5.2.1 set up in master-slave configuration. Very often it
happens that solr slave starts replicating (I can see it in admin panel)
but it is getting stuck at 0% and never proceeds further. Usually restart
of slave helps.

Relevant logs from slave:

INFO  - 2015-08-11 07:56:00.184; org.apache.solr.handler.IndexFetcher;
Master's generation: 26
INFO  - 2015-08-11 07:56:00.188; org.apache.solr.handler.IndexFetcher;
Slave's generation: 25
INFO  - 2015-08-11 07:56:00.189; org.apache.solr.handler.IndexFetcher;
Starting replication process
INFO  - 2015-08-11 07:56:00.205; org.apache.solr.handler.IndexFetcher;
Number of files in latest index in master: 10
INFO  - 2015-08-11 07:56:00.209;
org.apache.solr.core.CachingDirectoryFactory; return new directory for
/var/solr/data/catalog_article_1_de_DE/data/index.20150811075600209
*INFO  - 2015-08-11 07:56:00.212;
org.apache.solr.update.DefaultSolrCoreState; Creating new IndexWriter...*
*INFO  - 2015-08-11 07:56:00.221;
org.apache.solr.update.DefaultSolrCoreState; Waiting until IndexWriter is
unused... core=catalog_article_1_de_DE*
INFO  - 2015-08-11 07:56:00.522; org.apache.solr.core.SolrCore;
[catalog_article_1_de_DE] webapp=/solr path=/select params={} hits=0
status=0 QTime=1
INFO  - 2015-08-11 07:56:03.654; org.apache.solr.core.SolrCore;
[catalog_article_1_de_DE] webapp=/solr path=/select params={} hits=0
status=0 QTime=1


here are the relevant solrconfig.xml entries:

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.catalog_article_1_de_DE.data.dir:}</str>
  </updateLog>

  <autoCommit>
    <maxDocs>1</maxDocs>
    <maxTime>30</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>15000</maxTime>
  </autoSoftCommit>
</updateHandler>

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">optimize</str>
    <str name="confFiles">schema.xml,solrconfig.xml</str>
  </lst>
  <lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">${master.url:127.0.0.1:8983}/${solr.core.name}</str>
    <str name="pollInterval">00:01:00</str>
  </lst>
</requestHandler>

Has anybody faced the same problem?
Is it master's or slave's issue?
How can I debug/fix the problem?

Thanks
Andrii


Re: Count of distinct values in faceting.

2015-08-11 Thread Modassar Ather
Please read docVlaues as docValues in my mail above.

Regards,
Modassar

On Tue, Aug 11, 2015 at 4:01 PM, Modassar Ather modather1...@gmail.com
wrote:

 Hi,

 Count of distinct values can be retrieved by following ways. Please note
 that the Solr version is 5.2.1.
 1. Using cardinality=true.
 2. Using hll() facet function.

 Kindly help me understand:
  1. How accurate are they, comparatively, and which performs better with
 millions of documents?
  2. Per my understanding the {!cardinality=1.0} returns the most accurate
 result. Is my understanding correct and if yes is it 100% accurate?
  3. How accurate result is returned by hll() function?
  4. I am getting the following exception for the query:
 q=field:query&stats=true&stats.field={!cardinality=1.0}field. The exception
 is not seen once the cardinality is set to 0.9 or less.
 The field is docValues enabled and indexed=false. The same exception I
 tried to reproduce on a non-docValues field but could not. Please help me
 resolve the issue.
  ERROR - 2015-08-11 12:24:00.222; [core]
 org.apache.solr.common.SolrException;
 null:java.lang.ArrayIndexOutOfBoundsException: 3
 at
 net.agkn.hll.serialization.BigEndianAscendingWordSerializer.writeWord(BigEndianAscendingWordSerializer.java:152)
 at
 net.agkn.hll.util.BitVector.getRegisterContents(BitVector.java:247)
 at net.agkn.hll.HLL.toBytes(HLL.java:917)
 at net.agkn.hll.HLL.toBytes(HLL.java:869)
 at
 org.apache.solr.handler.component.AbstractStatsValues.getStatsValues(StatsValuesFactory.java:348)
 at
 org.apache.solr.handler.component.StatsComponent.convertToResponse(StatsComponent.java:151)
 at
 org.apache.solr.handler.component.StatsComponent.process(StatsComponent.java:62)
 at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:255)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
 at
 org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:450)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
 at
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
 at
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
 at
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
 at
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
 at
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
 at
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
 at
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
 at
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
 at
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
 at
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
 at org.eclipse.jetty.server.Server.handle(Server.java:497)
 at
 org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
 at
 org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
 at
 org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
 at
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
 at
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
 at java.lang.Thread.run(Thread.java:745)

 Thanks,
 Modassar



Count of distinct values in faceting.

2015-08-11 Thread Modassar Ather
Hi,

Count of distinct values can be retrieved by following ways. Please note
that the Solr version is 5.2.1.
1. Using cardinality=true.
2. Using hll() facet function.

Kindly help me understand:
 1. How accurate are they, comparatively, and which performs better with
millions of documents?
 2. Per my understanding the {!cardinality=1.0} returns the most accurate
result. Is my understanding correct and if yes is it 100% accurate?
 3. How accurate result is returned by hll() function?
 4. I am getting the following exception for the query:
q=field:query&stats=true&stats.field={!cardinality=1.0}field. The exception
is not seen once the cardinality is set to 0.9 or less.
The field is docValues enabled and indexed=false. The same exception I
tried to reproduce on a non-docValues field but could not. Please help me
resolve the issue.
 ERROR - 2015-08-11 12:24:00.222; [core]
org.apache.solr.common.SolrException;
null:java.lang.ArrayIndexOutOfBoundsException: 3
at
net.agkn.hll.serialization.BigEndianAscendingWordSerializer.writeWord(BigEndianAscendingWordSerializer.java:152)
at
net.agkn.hll.util.BitVector.getRegisterContents(BitVector.java:247)
at net.agkn.hll.HLL.toBytes(HLL.java:917)
at net.agkn.hll.HLL.toBytes(HLL.java:869)
at
org.apache.solr.handler.component.AbstractStatsValues.getStatsValues(StatsValuesFactory.java:348)
at
org.apache.solr.handler.component.StatsComponent.convertToResponse(StatsComponent.java:151)
at
org.apache.solr.handler.component.StatsComponent.process(StatsComponent.java:62)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:255)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:450)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)

Thanks,
Modassar


Re: (possible)SimplePostTool problem --(Windows, Bitnami distribution)

2015-08-11 Thread pai911
Hi there!

I encountered the same problem as you did.

Have you found the answer yet?

Would be really thankful if you could share your experience!

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/possible-SimplePostTool-problem-Windows-Bitnami-distribution-tp4199980p4222382.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR Physical Memory leading to OOM

2015-08-11 Thread Shawn Heisey
On 8/10/2015 7:07 PM, rohit wrote:
 Thanks Shawn. I was looking at SOLR Admin UI and also using top command on
 server. 

The amount of free memory shown by tools like that is not a very good
way to determine what's happening with your memory.  As I said before,
it's completely normal for the OS to utilize almost all of your physical
memory, even if your programs only require a fraction of it.  It is not
a meaningful metric for success.

 I'm running an endurance test for 4 hours at 50 TPS and I see the physical
 memory keep increasing during that time, and we have a scheduled delta
 import during that time frame which can import up to 4 million docs. After
 the import I again see the memory increase, and there comes a point when
 there is no more memory left, which in turn leads to OOM. 

If you're hitting OOM, then you need to increase the heap size.  Solr is
requiring more memory than you have assigned to the heap.  This will
reduce the amount of memory available for the OS disk cache, which may
reduce performance.

There may be ways you can reduce heap usage by adjusting your
configuration or the way you use Solr.

http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

There is other good information on that page.  I would encourage you to
go to the top of the page and read all of it.
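
For reference, on Solr 5.x the heap can be raised at startup; a sketch (the
value is illustrative, not a sizing recommendation):

  # -m sets both -Xms and -Xmx
  bin/solr start -m 4g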

50 queries per second is a lot of load for a single server.  I would
only expect to see success with that many queries per second if the
index fits entirely into the OS disk cache.  It may also be necessary
for the index to be relatively small.

 I have seen one more thing if their is no activity on server no import , no
 search going.  I have not seen the memory coming down from the state which
 was created after test. 

In general, once Java grabs memory, it doesn't let it go.  As I
previously mentioned, it cannot grab more than you ask, plus some
overhead.  The overhead may be a few hundred mb, which is not very much
when you're talking about multiple gigabytes.

 Couple of things to notice: 
 
 1. We are storing data as well as indexing it (not sure if that is causing
 the problem). 
 2. Is 8 GB enough to index 10 million or more documents?
 3. We have custom handlers which extend Solr handlers to return data when
 the client calls them. 

I couldn't tell you whether 8GB is enough for 10 million documents.
That depends on what's in those documents, what your schema.xml says,
how you query Solr, and a few other factors.  Even if you tell me the
answers to these questions, I *still* may not be able to say whether
it's enough.  I *might* be able to tell you that it's NOT enough,
though.  The only way to be absolutely sure is to prototype -- actually
try it out.

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

8GB of RAM is a *very* small system in the world of Solr.  My systems
have 64GB of RAM, and I frequently wish that was 256GB.  My indexes are
somewhat larger than yours, though.

Thanks,
Shawn



Re: Solr old log files are not archived or removed automatically.

2015-08-11 Thread Shawn Heisey
On 8/11/2015 3:10 AM, Adrian Liew wrote:
 Hi Erick,
 
 1. How did you install/run your Solr? As a service or regular? See
 the reference guide, "Permanent Logging Settings", for some info on the 
 difference there.
 
 What is the difference between regular and service?

On certain operating systems, you can use a shell script that comes with
Solr 5.x to install Solr as a service, with an init script to start it
on boot.

Regular would mean manual start using the bin/solr script.

 2. What does your log4j.properties file look like?
 
 Here are the contents in the log4j.properties file:
 
 #  Logging level
 solr.log=logs
 log4j.rootLogger=INFO, file, CONSOLE
 
 log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
 
 log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
 log4j.appender.CONSOLE.layout.ConversionPattern=%-4r [%t] %-5p %c %x 
 [%X{collection} %X{shard} %X{replica} %X{core}] \u2013 %m%n
 
 #- size rotation with log cleanup.
 log4j.appender.file=org.apache.log4j.RollingFileAppender
 log4j.appender.file.MaxFileSize=4MB
 log4j.appender.file.MaxBackupIndex=9
 
 #- File to log to and log format
 #log4j.appender.file.File=${solr.log}/solr.log
 log4j.appender.file.File=C:/solr_logs/solr.log
 log4j.appender.file.layout=org.apache.log4j.PatternLayout
 log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd 
 HH:mm:ss.SSS}; [%X{collection} %X{shard} %X{replica} %X{core}] %C; %m\n
 
 log4j.logger.org.apache.zookeeper=WARN
 log4j.logger.org.apache.hadoop=WARN
 
 # set to INFO to enable infostream log messages
 log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF
 
 I am not sure how best I can limit the size of the solr_logs directory. Does 
 log4j come with a feature to remove old log files with a given retention 
 period?

The section of the log4j.properties file entitled "size rotation with
log cleanup" describes the built-in rotation for the solr log.  It will
keep nine backup logfiles, and each one will be limited in size to 4MB.
That means that the maximum size of the logs for *solr* is about 40MB.

If you aren't seeing this behavior, then there are a few possible
problems.  Your properties file may have a bug in it.  It looks correct
to me, but I haven't tried to actually validate it.  It might not be
Solr (log4j) that's making the problem logfiles.  It could be Jetty, or
something else entirely.  Your Java VM might not be using the properties
file that you included here.

Thanks,
Shawn



Re: Performance warning overlapping onDeckSearchers

2015-08-11 Thread Shawn Heisey
On 8/11/2015 3:02 AM, Adrian Liew wrote:
 Has anyone come across this issue, [some_index] PERFORMANCE WARNING: 
 Overlapping onDeckSearchers=2?
 
 I am currently using Solr v5.2.1.
 
 What does this mean? Does this raise red flags?
 
 I am currently encountering an issue whereby my Sitecore system is unable to 
 update the index appropriately. I am not sure if this is linked to the 
 warnings above.

https://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F

What the wiki page doesn't explicitly state is that increasing
maxWarmingSearchers is usually the wrong way to solve this, because that
can actually make the problem *worse*.  It is implied by the things the
page DOES say, but it is not stated.
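
The usual fix is to open new searchers less often, so they are not created
faster than they can warm; a sketch of the relevant solrconfig.xml knobs
(the intervals are illustrative):

  <!-- illustrative intervals; openSearcher=false keeps hard commits from opening searchers -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>30000</maxTime>
  </autoSoftCommit>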

Thanks,
Shawn



Re: Deadlock-like behavior when new IndexWriter created

2015-08-11 Thread Erick Erickson
Hmm, it would be _really_ helpful if next time it happens you could
get a stack trace (see jstack, should have come with your Java). As it
happens we're chasing another deadlock and it'd be interesting to see
if they're related.

Thanks!
Erick

On Tue, Aug 11, 2015 at 1:12 AM, Andrii Berezhynskyi
andrii.berezhyns...@home24.de wrote:
 Hi,

 I have solr5.2.1 set up in master-slave configuration. Very often it
 happens that solr slave starts replicating (I can see it in admin panel)
 but it is getting stuck at 0% and never proceeds further. Usually restart
 of slave helps.

 Relevant logs from slave:

 INFO  - 2015-08-11 07:56:00.184; org.apache.solr.handler.IndexFetcher;
 Master's generation: 26
 INFO  - 2015-08-11 07:56:00.188; org.apache.solr.handler.IndexFetcher;
 Slave's generation: 25
 INFO  - 2015-08-11 07:56:00.189; org.apache.solr.handler.IndexFetcher;
 Starting replication process
 INFO  - 2015-08-11 07:56:00.205; org.apache.solr.handler.IndexFetcher;
 Number of files in latest index in master: 10
 INFO  - 2015-08-11 07:56:00.209;
 org.apache.solr.core.CachingDirectoryFactory; return new directory for
 /var/solr/data/catalog_article_1_de_DE/data/index.20150811075600209
 *INFO  - 2015-08-11 07:56:00.212;
 org.apache.solr.update.DefaultSolrCoreState; Creating new IndexWriter...*
 *INFO  - 2015-08-11 07:56:00.221;
 org.apache.solr.update.DefaultSolrCoreState; Waiting until IndexWriter is
 unused... core=catalog_article_1_de_DE*
 INFO  - 2015-08-11 07:56:00.522; org.apache.solr.core.SolrCore;
 [catalog_article_1_de_DE] webapp=/solr path=/select params={} hits=0
 status=0 QTime=1
 INFO  - 2015-08-11 07:56:03.654; org.apache.solr.core.SolrCore;
 [catalog_article_1_de_DE] webapp=/solr path=/select params={} hits=0
 status=0 QTime=1
 

 here are the relevant solrconfig.xml entries:

 <updateHandler class="solr.DirectUpdateHandler2">
   <updateLog>
     <str name="dir">${solr.catalog_article_1_de_DE.data.dir:}</str>
   </updateLog>

   <autoCommit>
     <maxDocs>1</maxDocs>
     <maxTime>30</maxTime>
     <openSearcher>false</openSearcher>
   </autoCommit>
   <autoSoftCommit>
     <maxTime>15000</maxTime>
   </autoSoftCommit>
 </updateHandler>

 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">optimize</str>
     <str name="confFiles">schema.xml,solrconfig.xml</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
     <str name="masterUrl">${master.url:127.0.0.1:8983}/${solr.core.name}</str>
     <str name="pollInterval">00:01:00</str>
   </lst>
 </requestHandler>

 Has anybody faced the same problem?
 Is it master's or slave's issue?
 How can I debug/fix the problem?

 Thanks
 Andrii


Solr MLT with stream.body returns different results on each shard

2015-08-11 Thread Aaron Gibbons
I have a fresh install of Solr 5.2.1 with about 3 million docs freshly
indexed (I can also reproduce this issue on 4.10.0). When I use the Solr
MoreLikeThisHandler with content stream I'm getting different results per
shard.

I also looked at using a standard MLT query, but I need to be able to
stream in a fairly large block of text for comparison that is not in the
index (different type of document). A standard MLT query
http://testsolr2:8983/solr/mega/select?q=electronics&mlt.fl=text&mlt.mintf=0&fl=id,score
appears to return consistent results between shards.

Any reason why the content stream query would be different between shards?
Thank you for your help!
Aaron


*Content Stream Example:*
http://testsolr1:8983/solr/mega/mlt?stream.body=electronics&mlt.fl=text&mlt.mintf=0&fl=id,score
*Returns: *
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3</int>
</lst>
<result name="response" numFound="1590" start="0">

http://testsolr2:8983/solr/mega/mlt?stream.body=electronics&mlt.fl=text&mlt.mintf=0&fl=id,score

*Returns: *
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<result name="response" numFound="1619" start="0">


Using the date field for searching

2015-08-11 Thread Scott Derrick

If I query date:1885

I get an error

org.apache.solr.common.SolrException: Invalid Date String:'1885'

If I query date:1885*

I get no results.

and yet there are numerous docs with a year of 1885 in the date string, 
like so


<arr name="date"><date>1885-02-08T00:00:00Z</date></arr>

if I query date:1885-02-08T00:00:00Z

I get 9 results??

Do the users really have to specify a full XML-compliant date string to 
use the date: field for searching?


thanks,

Scott


Re: SOLR Physical Memory leading to OOM

2015-08-11 Thread rohit
Thanks a lot, Shawn!

Just wanted to clarify: we have SolrCloud, so in my testing I'm not hitting a
single server. I have multiple servers: at a time we have 4
leaders and 4 replicas, which communicate using ZooKeeper. 

So, in total we have 8 servers, and ZooKeeper is installed on 5 of them. 

As per your other article, we are planning to move from Java 1.7 to either
1.7.0_72-b14 or Java 8




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Physical-Memory-leading-to-OOM-tp499p4222434.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: (possible)SimplePostTool problem --(Windows, Bitnami distribution)

2015-08-11 Thread Erik Hatcher
What was the actual command-line used for the failing attempts?

Try using -Dauto=yes (java -Dauto=yes -Dc=tika -jar post.jar ….)

Check out “post.jar -h” for more details on command-line options.

—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com http://www.lucidworks.com/




 On Apr 15, 2015, at 3:23 PM, kenadian adr...@r-2.ca wrote:
 
 Hello all, 
 my Bitnami/*Solr-5.0.0* installation is not able to index any type of
 file (found in the provided example folders or anywhere else) except HTML. 
 
 Tested on the files in exampledocs folder
 (books.csv,books.json,...,utf8-example.xml, vidcard.xml) I get:
 for *.csv* files I get the response Unexpected character 'i' (depending on
 what is the 1st character in file),
 for *.xml* files I get the response ERROR: unknown field 'id' 
 for *.pdf* files I get the response Invalid UTF-8 middle byte 0xe5
 and so forth.
 Even *.TXT* files are not handled:
 I get the response Unexpected character 'T' (depending on what is the 1st
 character in file--This is a test of TXT extraction in Solr, it is only a
 test. Do not panic.)
 
 
 The only type that works is *HTML* :
 
 C:\Bitnami\solr-5.0.0-0\apache-solr\solr\exampledocsjava -Dc=tika -jar
 post.jar  *.html
 
 SimplePostTool version 5.0.0
 Posting files to [base] url http://localhost:8983/solr/tika/update using
 content-type application/xml...
 POSTing file sample.html to [base]
 1 files indexed.
 COMMITting Solr index changes to http://localhost:8983/solr/tika/update...
 Time spent: 0:00:00.313
 
 I use Windows 8.1, java version 1.8.0_40.
 
 Any ideas of how to fix this? Many thanks.
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/possible-SimplePostTool-problem-Windows-Bitnami-distribution-tp4199980.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Choosing the order of the fields to be displayed at output

2015-08-11 Thread Shawn Heisey
On 8/11/2015 9:36 PM, Zheng Lin Edwin Yeo wrote:
 I'm using Solr 5.2.1. I understand that for JSON format, Solr writes out
 the fields of each document in the order they are found in the index as it
 is the fastest and most efficient for Solr to return the data.
 
 However, this causes confusion as each of the records has fields arranged
 in different order, as user are allowed to update the field after the
 document is index. Whenever a field is updated, that field will be
 displayed at the bottom of the record.
 
 Is there a way to choose the order of the fields to be displayed at the
 output, so that the order will be consistent for all the records?

Solr simply returns the fields in the order that Java naturally stores
the information, which from a user perspective, is not very predictable,
and may change from one version of code to the next, or when Java is
upgraded.

I think that deciding information display order is a job for
client code.  The application that makes the request to Solr can pick
the pieces that need to be displayed to the user and decide what order
they should be in.

Thanks,
Shawn



Re: Streaming API running a simple query

2015-08-11 Thread Selvam
Hi All,

I have written a blog to cover this nested merge expressions, see
http://knackforge.com/blog/selvam/solr-streaming-expressions for more
details.

Thanks.
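
For the use case from earlier in this thread (users from several countries,
each subquery with its own row limit), a hypothetical sketch of such a
nested merge; the collection, field names, and parameters are invented, and
exact parameter support varies by version:

  merge(
    search(users, q="country:India AND age:{30 TO *]", fl="id", sort="id asc", rows="30"),
    merge(
      search(users, q="country:England AND gender:male", fl="id", sort="id asc", rows="50"),
      search(users, q="country:France", fl="id", sort="id asc", rows="20"),
      on="id asc"),
    on="id asc")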

On Mon, Aug 10, 2015 at 3:51 PM, Selvam s.selvams...@gmail.com wrote:

 Hi,

 Thanks, that seems to be working!

 On Sat, Aug 8, 2015 at 9:28 PM, Joel Bernstein joels...@gmail.com wrote:

 This sounds doable using nested merge functions like this:

 merge(search(...),
merge(search(...), search(),...), ...)

 Joel Bernstein
 http://joelsolr.blogspot.com/

 On Sat, Aug 8, 2015 at 8:08 AM, Selvam s.selvams...@gmail.com wrote:

  Hi,
 
   I needed to run multiple subqueries, each with its own limit of rows.
 
  For eg: to get 30 users from country India with age greater than 30 and
 50
  users from England who are all male.
 
  Thanks again.
  On 08-Aug-2015 5:30 pm, Joel Bernstein joels...@gmail.com wrote:
 
   Can you describe your use case?
  
   Joel Bernstein
   http://joelsolr.blogspot.com/
  
   On Sat, Aug 8, 2015 at 7:36 AM, Selvam s.selvams...@gmail.com
 wrote:
  
Hi,
   
 Thanks, good to know. In fact my requirement needs to merge multiple
 expressions, while the current merge expression supports only two
 expressions. Do you think we can expect that in future versions?
On 07-Aug-2015 6:46 pm, Joel Bernstein joels...@gmail.com
 wrote:
   
 Hi,

 There is a new error handling framework in trunk (SOLR-7441) for
 the
 Streaming API, Streaming Expressions.

 So if you're purely in testing mode, it will be much easier to
 work
  in
 trunk then Solr 5.2.

 If you run into errors in trunk that are still confusing please
   continue
to
 report them so we can get all the error messages covered.

 Thanks,

 Joel


 Joel Bernstein
 http://joelsolr.blogspot.com/

 On Fri, Aug 7, 2015 at 6:19 AM, Selvam s.selvams...@gmail.com
  wrote:

  Hi,
 
  Sorry, it is working now.
 
  curl --data-urlencode
  'stream=search(gettingstarted,q=*:*,fl=id,sort=id asc)'
  http://localhost:8983/solr/gettingstarted/stream
 
  I missed *'asc'* in sort :)
 
  Thanks for the help Shawn Heisey.
 
  On Fri, Aug 7, 2015 at 3:46 PM, Selvam s.selvams...@gmail.com
   wrote:
 
   Hi,
  
    Thanks for your update, yes, I was missing the cloud mode; I am new
    to the world of Solr cloud. Now I have enabled a single node (with
    two shards & replicas) that runs on port 8983, along with zookeeper
    running on port 9983.
   When I run,
  
curl --data-urlencode
   'stream=search(gettingstarted,q=*:*,fl=id,sort=id)'
   http://localhost:8983/solr/gettingstarted/stream
  
   Again, I get
  
   Unable to construct instance of
   org.apache.solr.client.solrj.io.stream.CloudSolrStream
   .
   .
  
   Caused by: java.lang.reflect.InvocationTargetException
   .
   .
   Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
  
   I tried different port, 9983 as well, which returns Empty
 reply
   from
   server. I think I miss some obvious configuration.
  
  
  
  
   On Fri, Aug 7, 2015 at 2:04 PM, Shawn Heisey 
  apa...@elyograg.org
  wrote:
  
   On 8/7/2015 1:37 AM, Selvam wrote:
   
 
  
 https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions
   
I tried this from my linux terminal,
1)   curl --data-urlencode
'stream=search(gettingstarted,q=*:*,fl=id,sort=id)'
http://localhost:8983/solr/gettingstarted/stream
   
Threw zkHost error. Then tried with,
   
2)   curl --data-urlencode
   
  
 

   
  
 
 'stream=search(gettingstarted,zkHost=localhost:8983,q=*:*,fl=id,sort=id)'
http://localhost:8983/solr/gettingstarted/stream
   
It throws me java.lang.ArrayIndexOutOfBoundsException:
  1\n\tat
   
  
 

   
  
 
 org.apache.solr.client.solrj.io.stream.CloudSolrStream.parseComp(CloudSolrStream.java:260)
  
   The documentation page you linked seems to indicate that this
  is a
   feature that only works in SolrCloud.  Your inclusion of
   localhost:8983 as the zkHost suggests that either you are
 NOT
 running
   in cloud mode, or that you do not understand what zkHost
 means.
  
   Zookeeper runs on a different port than Solr.  8983 is Solr's
   port.
 If
   you are running a 5.x cloud with the embedded zookeeper, it
 is
   most
   likely running on port 9983.  If you are running in cloud
 mode
   with
a
   properly configured external zookeeper, then your zkHost
  parameter
 will
   probably have three hosts in it with port 2181.
  
   Thanks,
   Shawn
  
  
  
  
   --
   Regards,
   Selvam
   KnackForge 

Re: Filter Out Facet Results

2015-08-11 Thread Erik Hatcher
One solution is to filter these out at indexing time.  The StopFilter with a 
custom stop list file could do the trick - you’ll probably need to adjust your 
field type definition to be a TextField instead of a StrField, use a 
KeywordTokenizer and then a StopFilter.
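
A minimal sketch of such a field type (the type name and stopwords file name
are placeholders; each line of the file would hold one full entity string,
e.g. "w 5th street"):

<fieldType name="entity_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="entity-stopwords.txt"/>
  </analyzer>
</fieldType>

Because KeywordTokenizer emits the whole value as a single token, a stop entry
only removes a facet value when it matches the entire string.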

—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com/




 On Aug 10, 2015, at 6:28 PM, Paden rumsey...@gmail.com wrote:
 
 Hello,
 
 I'm trying to figure out how to filter particular facets out of my
 results. I'm doing some Named Entity Extraction and putting the entities up
 as faceting information. However, not all the results I get are exact. For
 example, the string "w 5th street" will appear in the Person facet list.
 These entities are the same every time; I know what they will be, so I can
 predictably say which ones will be wrong. I was wondering if there was a way
 to write into the solrconfig to filter out these bad entities. I know that
 filter query can be a great way to INCLUDE a facet in the search or narrow a
 search based on the facet. But I'm not quite sure how to filter results out.
 
 Thanks in advance for any help you can provide. 
 
 
 



Re: Using the date field for searching

2015-08-11 Thread Scott Derrick

Sagar,

thanks,

Scott

 Original Message 
Subject: Re: Using the date field for searching
From: Bade, Vidya (Sagar) vb...@webmd.net
To: solr-user@lucene.apache.org solr-user@lucene.apache.org
Date: 08/11/2015 03:05 PM


You can use filter query and form the date as follows when a user enters just 
the year or year and month:

If just the year (1885) was entered - date:[1885-01-01T00:00:00Z TO 
1886-01-01T00:00:00Z]
If just the year and month (1885-06) were entered - date:[1885-06-01T00:00:00Z 
TO 1885-07-01T00:00:00Z]

Alternatively use DateRangeField as described at the bottom in the following 
webpage:

https://cwiki.apache.org/confluence/display/solr/Working+with+Dates

:Sagar
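
A sketch of what the DateRangeField route might look like in the schema (the
type name is a placeholder):

<fieldType name="daterange" class="solr.DateRangeField"/>
<field name="date" type="daterange" indexed="true" stored="true"/>

After reindexing into such a field, truncated queries like date:1885 or
date:[1885-06 TO 1885-07] should match any document whose date falls inside
that range.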

-Original Message-
From: Scott Derrick [mailto:sc...@tnstaafl.net]
Sent: Tuesday, August 11, 2015 3:02 PM
To: solr-user@lucene.apache.org
Subject: Using the date field for searching

If I query date:1885

I get an error

org.apache.solr.common.SolrException: Invalid Date String:'1885'

If I query date:1885*

I get no results.

and yet there are numerous docs with a year of 1885 in the date string, like so

<arr name="date"><date>1885-02-08T00:00:00Z</date></arr>

if I query date:1885-02-08T00:00:00Z

I get 9 results??

Do the users really have to specify a full XML-compliant date string to use
the date: field for searching?

thanks,

Scott



--
Sin makes its own hell, and goodness its own heaven.
Mary Baker Eddy


Re: Highlighting

2015-08-11 Thread Erik Hatcher
Scott - it doesn't look like you've specified hl.fl, which names the field(s)
to highlight.
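
Something along these lines, assuming the indexed text lives in a stored field
named text (adjust to your schema; a field must be stored to be highlighted):

http://localhost:8983/solr/collection1/select?q=text:heaven&hl=true&hl.fl=text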

p.s. Erick Erickson surely likes your e-mail domain :)


—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com/




 On Aug 11, 2015, at 9:02 PM, Scott Derrick sc...@tnstaafl.net wrote:
 
 I guess I really don't get Highlighting in Solr.
 
 We are transitioning from Google Custom Search, which generally sucks but 
 does return nicely formatted highlighted fragments.
 
 I turn highlighting on (hl=true) in the query and I get a highlighting section 
 returned at the bottom of the page, each document identified by its file name 
 followed by an empty {}.  It doesn't matter what I search for, plain text or a 
 field; I get a list of documents, each followed by an empty brace?
 
 highlighting: {
 /home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html: 
 {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./A10089/A10089.html: {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./L3/L3.html: {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./A10646/A10646.html: {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./V03482/V03482.html: {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html: {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./645A.66.043/645A.66.043.html:
  {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./352.48.001/352.48.001.html:
  {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./144.23.001/144.23.001.html:
  {},
 /home/scott/workspace/mbel-work/tei2html/build/web/./L18512/L18512.html: {}
  }
 
 I haven't made any changes to the default settings
 
 [default solrconfig.xml highlighting section quoted in full; snipped here.
 See the original "Highlighting" message in this digest.]

Choosing the order of the fields to be displayed at output

2015-08-11 Thread Zheng Lin Edwin Yeo
Hi,

I'm using Solr 5.2.1. I understand that for the JSON format, Solr writes out
the fields of each document in the order they are found in the index, as that
is the fastest and most efficient way for Solr to return the data.

However, this causes confusion, as each of the records has its fields arranged
in a different order, because users are allowed to update fields after the
document is indexed. Whenever a field is updated, that field will be
displayed at the bottom of the record.

Is there a way to choose the order of the fields to be displayed at the
output, so that the order will be consistent for all the records?

Regards,
Edwin
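
One client-side workaround, if the order only matters for display, is to copy
each document's fields into a map with a fixed iteration order. A sketch using
SolrJ (the field list is an assumption; adjust to your schema):

import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.apache.solr.common.SolrDocument;

public class FieldOrder {
    // The display order we want; these field names are placeholders.
    private static final List<String> ORDER =
        Arrays.asList("id", "title", "author", "last_modified");

    public static Map<String, Object> reorder(SolrDocument doc) {
        Map<String, Object> ordered = new LinkedHashMap<>();
        // Known fields first, in the fixed order.
        for (String name : ORDER) {
            if (doc.containsKey(name)) {
                ordered.put(name, doc.getFieldValue(name));
            }
        }
        // Any remaining fields keep their original relative order.
        for (String name : doc.getFieldNames()) {
            ordered.putIfAbsent(name, doc.getFieldValue(name));
        }
        return ordered;
    }
}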


Cross core join

2015-08-11 Thread Nagasharath
I have a scenario (we are badly affected) where I have to join two cores on
two different nodes.

I know that there is a JIRA issue
(https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-7090) open in
support of this; is there an alternate solution I can use as a workaround until
this gets released?

Thanks,
Sharath
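
One possible workaround, when the set of join keys is small enough, is a
client-side join in two steps: fetch the keys from the first core, then filter
the second core with the terms query parser (available since Solr 4.10). A
sketch with placeholder hosts, cores, and field names:

# step 1: pull the join keys from coreA on node1
curl 'http://node1:8983/solr/coreA/select?q=type:parent&fl=join_key&rows=1000&wt=json'

# step 2: filter coreB on node2 by those keys
curl 'http://node2:8983/solr/coreB/select?q=*:*&fq={!terms f=join_key}key1,key2,key3'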

Highlighting

2015-08-11 Thread Scott Derrick

I guess I really don't get Highlighting in Solr.

We are transitioning from Google Custom Search, which generally sucks but 
does return nicely formatted highlighted fragments.

I turn highlighting on (hl=true) in the query and I get a highlighting 
section returned at the bottom of the page, each document identified by its 
file name followed by an empty {}.  It doesn't matter what I search for, 
plain text or a field; I get a list of documents, each followed by an empty 
brace?


highlighting: {
  "/home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./A10089/A10089.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./L3/L3.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./A10646/A10646.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./V03482/V03482.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./645A.66.043/645A.66.043.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./352.48.001/352.48.001.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./144.23.001/144.23.001.html": {},
  "/home/scott/workspace/mbel-work/tei2html/build/web/./L18512/L18512.html": {}
}

I haven't made any changes to the default settings

  <highlighting>
    <!-- Configure the standard fragmenter -->
    <!-- This could most likely be commented out in the "default" case -->
    <fragmenter name="gap"
                default="true"
                class="solr.highlight.GapFragmenter">
      <lst name="defaults">
        <int name="hl.fragsize">100</int>
      </lst>
    </fragmenter>

    <!-- A regular-expression-based fragmenter
         (for sentence extraction) -->
    <fragmenter name="regex"
                class="solr.highlight.RegexFragmenter">
      <lst name="defaults">
        <!-- slightly smaller fragsizes work better because of slop -->
        <int name="hl.fragsize">70</int>
        <!-- allow 50% slop on fragment sizes -->
        <float name="hl.regex.slop">0.5</float>
        <!-- a basic sentence pattern -->
        <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
      </lst>
    </fragmenter>

    <!-- Configure the standard formatter -->
    <formatter name="html"
               default="true"
               class="solr.highlight.HtmlFormatter">
      <lst name="defaults">
        <str name="hl.simple.pre"><![CDATA[<em>]]></str>
        <str name="hl.simple.post"><![CDATA[</em>]]></str>
      </lst>
    </formatter>

    <!-- Configure the standard encoder -->
    <encoder name="html"
             class="solr.highlight.HtmlEncoder" />

    <!-- Configure the standard fragListBuilder -->
    <fragListBuilder name="simple"
                     class="solr.highlight.SimpleFragListBuilder"/>

    <!-- Configure the single fragListBuilder -->
    <fragListBuilder name="single"
                     class="solr.highlight.SingleFragListBuilder"/>

    <!-- Configure the weighted fragListBuilder -->
    <fragListBuilder name="weighted"
                     default="true"
                     class="solr.highlight.WeightedFragListBuilder"/>

    <!-- default tag FragmentsBuilder -->
    <fragmentsBuilder name="default"
                      default="true"
                      class="solr.highlight.ScoreOrderFragmentsBuilder">
      <!--
      <lst name="defaults">
        <str name="hl.multiValuedSeparatorChar">/</str>
      </lst>
      -->
    </fragmentsBuilder>

    <!-- multi-colored tag FragmentsBuilder -->
    <fragmentsBuilder name="colored"
                      class="solr.highlight.ScoreOrderFragmentsBuilder">
      <lst name="defaults">
        <str name="hl.tag.pre"><![CDATA[
             <b style="background:yellow">,<b style="background:lawgreen">,
             <b style="background:aquamarine">,<b style="background:magenta">,
             <b style="background:palegreen">,<b style="background:coral">,
             <b style="background:wheat">,<b style="background:khaki">,
             <b style="background:lime">,<b style="background:deepskyblue">]]></str>
        <str name="hl.tag.post"><![CDATA[</b>]]></str>
      </lst>
    </fragmentsBuilder>

    <boundaryScanner name="default"
                     default="true"
                     class="solr.highlight.SimpleBoundaryScanner">
      <lst name="defaults">
        <str name="hl.bs.maxScan">10</str>
        <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
      </lst>
    </boundaryScanner>

    <boundaryScanner name="breakIterator"
                     class="solr.highlight.BreakIteratorBoundaryScanner">
      <lst name="defaults">
        <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
        <str name="hl.bs.type">WORD</str>
        <!-- language and country are used when constructing Locale object. -->
        <!-- And the Locale object will be used when getting instance of BreakIterator -->
        <str name="hl.bs.language">en</str>
        <str name="hl.bs.country">US</str>

Re: Highlighting

2015-08-11 Thread Erick Erickson
bq: Erick Erickson surely likes your e-mail domain :)

Yep, I envy that one!

On Tue, Aug 11, 2015 at 6:27 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
 Scott - it doesn't look like you've specified hl.fl, which names the field(s)
 to highlight.

 p.s. Erick Erickson surely likes your e-mail domain :)


 —
 Erik Hatcher, Senior Solutions Architect
 http://www.lucidworks.com/




 On Aug 11, 2015, at 9:02 PM, Scott Derrick sc...@tnstaafl.net wrote:

 [Scott's original message quoted in full; snipped here. See the
 "Highlighting" message earlier in this digest.]