Re: Is anybody using Config API/configoverlay.json, useParams/params.json, and/or initParams?

2016-02-26 Thread Alexandre Rafalovitch
Thanks Erik,

I know the examples use all of that. It was quite a surprise to
discover the films and files examples. I even felt a need to write the
blog post explaining where ALL of the examples and Solr homes hide in
the distribution:
http://blog.outerthoughts.com/2015/11/oh-solr-home-where-art-thou/

And I know the films example showcases the useParams/params.json
override, which doesn't require a reload.

However, I am more interested in the production usage. Hence the
original question.

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 27 February 2016 at 08:42, Erik Hatcher  wrote:
> data_driven /browse does. And example/files builds upon that a lot more.  I 
> did it that way to personally explore the configset feature.
>
>Erik
>
>> On Feb 26, 2016, at 16:12, Alexandre Rafalovitch  wrote:
>>
>> Hi,
>>
>> I am creating an explanation of solrconfig.xml for the beginners and
>> want to know whether anybody is actually using overrides and
>> initParams in the wild. Sometimes, features exist for edge cases, but
>> may not be worth spending much attention on in the beginner docs.
>>
>> Any feedback (on the list or direct) would be appreciated.
>>
>> Regards,
>>   Alex.
>> P.s. Of course my next project is Solr troubleshooting guide and then
>> I MUST explain these and the order they will apply. Oh joy.
>>
>> 
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/


Re: Newest docs added

2016-02-26 Thread Shawn Heisey
On 2/26/2016 3:53 PM, Toke Eskildsen wrote:
> MarkG  wrote:
>> Is there a way anyone can recommend to identify newly added docs to a Solr
>> index. Ie: I have some new docs. I update the index with the new doc and
>> this happens on a regular basis, say every 4 weeks. I want to be able to
>> distinguish the docs that are new given in a certain number of days, say 10
>> days from the time the index had the new doc added.
> https://lucene.apache.org/solr/4_7_0/solr-core/org/apache/solr/update/processor/TimestampUpdateProcessorFactory.html
> and a 'timestamp:[NOW/DAY-10DAYS TO *]' query?

I didn't know about the timestamp update processor.  That's pretty cool,
and now that I really think about it, it would be required for SolrCloud
to operate sanely.

If you're not running SolrCloud, then it can be handled with a config
that's even easier.  You can add a field like the following to your
schema, and make sure that your indexing never includes this field:

   <field name="timestamp" type="tdate" indexed="true" stored="true" default="NOW" />

That's copied straight out of one of the schemas I'm using.

Thanks,
Shawn



Re: Newest docs added

2016-02-26 Thread Toke Eskildsen
MarkG  wrote:
> Is there a way anyone can recommend to identify newly added docs to a Solr
> index. Ie: I have some new docs. I update the index with the new doc and
> this happens on a regular basis, say every 4 weeks. I want to be able to
> distinguish the docs that are new given in a certain number of days, say 10
> days from the time the index had the new doc added.

https://lucene.apache.org/solr/4_7_0/solr-core/org/apache/solr/update/processor/TimestampUpdateProcessorFactory.html
and a 'timestamp:[NOW/DAY-10DAYS TO *]' query?
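For reference, wiring that processor into solrconfig.xml might look roughly like this (a sketch; the chain name and the field name `timestamp` are illustrative):

```xml
<updateRequestProcessorChain name="add-timestamp" default="true">
  <!-- Sets the "timestamp" field to NOW on documents that don't supply one -->
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">timestamp</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The `timestamp:[NOW/DAY-10DAYS TO *]` range query above then matches everything indexed in the last 10 days.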


- Toke Eskildsen


Newest docs added

2016-02-26 Thread MarkG
Is there a way anyone can recommend to identify newly added docs in a Solr
index? I.e., I have some new docs. I update the index with the new docs, and
this happens on a regular basis, say every 4 weeks. I want to be able to
distinguish the docs that are new within a certain number of days, say 10
days from the time the index had the new docs added.


Is there a way to do this with Solr ?


Thanks.


Re: Disable phrase search in edismax?

2016-02-26 Thread Ahmet Arslan


Hi,

If you don't set (phrase fields) pf* parameters, phrase creation is 
automatically disabled, no?
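If you still want to be explicit about it, joining the MLT terms with OR on the client side also guarantees no implicit phrases (a sketch; `mlt_terms_to_query` is an illustrative helper, not a Solr API):

```python
# Illustrative helper: build an edismax q from MLT terms with explicit
# OR between clauses, so adjacent words never form an implicit phrase.
def mlt_terms_to_query(terms):
    def clause(t):
        # Quote any term containing whitespace or query-syntax characters
        if any(c in t for c in ' +-&|!(){}[]^"~*?:\\/'):
            return '"' + t.replace('\\', '\\\\').replace('"', '\\"') + '"'
        return t
    return " OR ".join(clause(t) for t in terms)

print(mlt_terms_to_query(["solr", "edismax", "query parser"]))
# -> solr OR edismax OR "query parser"
```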

Ahmet 

On Friday, February 26, 2016 11:51 PM, Walter Underwood  
wrote:



I’m creating a query from MLT terms, then sending it to edismax. The 
neighboring words in the query are not meaningful phrases.

Is there a way to turn off phrase creation and search for one query? Or should 
I separate them all with “OR”?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


Re: Thread Usage

2016-02-26 Thread Toke Eskildsen
Azazel K  wrote:

[Toke: 1 shard instead of multiple?]

> The nodes were unstable when we had single shard setup.
> It used to run OOM frequently.

Fair enough.

[Toke: Use a queue instead of 1000+ concurrent requests?]

> There are 16CPU on each node.  Requests are live  with
> upstream client impact so they cannot be put in a queue(!!).

Using more threads only helps if there are unused resources. With 4*16 CPU 
cores, I would expect CPU & IO-system to be saturated well before 1000 
concurrent requests.

Having no real limit on the number of concurrent requests makes it very hard to 
handle sudden spikes: Unless each call has trivial memory overhead, there must 
be a lot of free room on the heap for those special occasions. 

Related: https://issues.apache.org/jira/browse/SOLR-7344

> Also in one of the other threads, Shawn Heisey mentions to increase thread
> size to 10 and tweaking process limit settings on OS.

Since you are running distributed searches, the number of ingoing connections 
(the maxThreads parameter) should be "high enough" to avoid deadlocks. 100K is 
fine - it is practically the same as infinite as things will break down 
elsewhere before that number is reached. Limiting the number of concurrent 
requests to SolrCloud must currently be done from the outside.
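One way to impose that outside limit is a semaphore in the client, so excess callers queue up there instead of spiking threads inside Solr (a sketch; `ThrottledClient` and the limit of 64 are illustrative placeholders, not a Solr or Jetty feature):

```python
import threading

MAX_CONCURRENT = 64  # arbitrary placeholder; tune to your cluster

class ThrottledClient:
    # Wraps a request function so at most `limit` calls run at once;
    # excess callers block here instead of spiking threads inside Solr.
    def __init__(self, request_fn, limit=MAX_CONCURRENT):
        self._fn = request_fn
        self._gate = threading.Semaphore(limit)

    def search(self, query):
        with self._gate:
            return self._fn(query)

# Stub request function for demonstration only
client = ThrottledClient(lambda q: q.upper(), limit=2)
```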

- Toke Eskildsen


Disable phrase search in edismax?

2016-02-26 Thread Walter Underwood
I’m creating a query from MLT terms, then sending it to edismax. The 
neighboring words in the query are not meaningful phrases.

Is there a way to turn off phrase creation and search for one query? Or should 
I separate them all with “OR”?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)




Re: Is anybody using Config API/configoverlay.json, useParams/params.json, and/or initParams?

2016-02-26 Thread Erik Hatcher
data_driven /browse does. And example/files builds upon that a lot more.  I did 
it that way to personally explore the configset feature. 

   Erik 

> On Feb 26, 2016, at 16:12, Alexandre Rafalovitch  wrote:
> 
> Hi,
> 
> I am creating an explanation of solrconfig.xml for the beginners and
> want to know whether anybody is actually using overrides and
> initParams in the wild. Sometimes, features exist for edge cases, but
> may not be worth spending much attention on in the beginner docs.
> 
> Any feedback (on the list or direct) would be appreciated.
> 
> Regards,
>   Alex.
> P.s. Of course my next project is Solr troubleshooting guide and then
> I MUST explain these and the order they will apply. Oh joy.
> 
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/


Re: Thread Usage

2016-02-26 Thread Azazel K
> There is a non-trivial overhead for sharding: Using a single shard increases 
> throughput. Have you tried with 1 shard to see if the latency is acceptable 
> for that?

The nodes were unstable when we had single shard setup.  It used to run OOM 
frequently.  Ops team setup a cronjob to clear out memory and increased swap 
space.  But it wasn't stable and still caused random outages.

> First guess: You are updating too frequently and hitting multiple overlapping 
> searchers, deteriorating performance which leads to more overlapping 
> searchers and so on. Try looking in the log:
> https://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F

You're right.  It's a heavy index and heavy read system with 30 secs of soft 
commit and 10 mins autoCommit.  I will take a look at the link you gave.

> Anyway, 1000 threads sounds high. How many CPUs are on your machines? 32 on 
> each? That is a total of 128 CPUs for your 4 machines, meaning that each CPU is 
> working on about 10 concurrent requests. They might be competing for resources: 
> Have you tried limiting the amount of concurrent request and using a queue? 
> That might give you better performance (and lower heap requirements a bit).

There are 16 CPUs on each node.  Requests are live with upstream client impact 
so they cannot be put in a queue(!!). We are planning to re-build based on data 
type to reduce the load, but it's still a few months away.

Also in one of the other threads, Shawn Heisey mentions increasing the thread 
size to 10000 and tweaking process limit settings on the OS.  I haven't dug 
deep into it yet to see if it applies to this case, but according to his 
recommendation our setup seems to be running on minimum settings.

The maxThreads parameter in the Jetty config defaults to 200, and it is
quite easy to exceed this.  In the Jetty that comes packaged with Solr,
this setting has been changed to 10000, which effectively removes the
limit for a typical Solr install.  Because you are running 4.4 and your
message indicates you are using "service jetty" commands, chances are
that you are NOT using the jetty that came with Solr.  The first thing I
would try is increasing the maxThreads parameter to 10000.
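For reference, the setting Shawn describes lives in Jetty's thread pool configuration; in the Solr 4.x era that is roughly this fragment of etc/jetty.xml (a sketch; the exact file layout depends on your Jetty packaging):

```xml
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Set name="ThreadPool">
    <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
      <!-- Effectively unlimited for a typical Solr install -->
      <Set name="maxThreads">10000</Set>
    </New>
  </Set>
</Configure>
```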

The process limit is increased in /etc/security/limits.conf.  Here are
the additions that I make to this file on my Solr servers, to increase
the limits on the number of processes/threads and open files, both of
which default to 1024:

solr    hard    nproc   6144
solr    soft    nproc   4096

solr    hard    nofile  65535
solr    soft    nofile  49151

Let me know what you think.

Thanks,
A

From: Toke Eskildsen 
Sent: Friday, February 26, 2016 11:30 AM
To: solr_user lucene_apache
Subject: Re: Thread Usage

Azazel K  wrote:
> We have solr cluster with 2 shards running 2 nodes on each shard.
> They are beefy physical boxes with index size of 162 GB , RAM of
> about 96 GB and around 153M documents.

There is a non-trivial overhead for sharding: Using a single shard increases 
throughput. Have you tried with 1 shard to see if the latency is acceptable for 
that?

> Two times this week we have seen the thread usage spike from the
> usual 1000 to 4000 on all nodes at the same time and bring down
> the cluster.

First guess: You are updating too frequently and hitting multiple overlapping 
searchers, deteriorating performance which leads to more overlapping searchers 
and so on. Try looking in the log:
https://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F


Anyway, 1000 threads sounds high. How many CPUs are on your machines? 32 on 
each? That is a total of 128 CPUs for your 4 machines, meaning that each CPU is 
working on about 10 concurrent requests. They might be competing for resources: 
Have you tried limiting the number of concurrent requests and using a queue? 
That might give you better performance (and lower heap requirements a bit).

- Toke Eskildsen


Is anybody using Config API/configoverlay.json, useParams/params.json, and/or initParams?

2016-02-26 Thread Alexandre Rafalovitch
Hi,

I am creating an explanation of solrconfig.xml for the beginners and
want to know whether anybody is actually using overrides and
initParams in the wild. Sometimes, features exist for edge cases, but
may not be worth spending much attention on in the beginner docs.

Any feedback (on the list or direct) would be appreciated.

Regards,
   Alex.
P.s. Of course my next project is Solr troubleshooting guide and then
I MUST explain these and the order they will apply. Oh joy.


Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


Re: Shard State vs Replica State

2016-02-26 Thread Jeff Wartes

I believe the shard state is a reflection of whether that shard is still in use 
by the collection, and has nothing to do with the state of the replicas. I 
think doing a split-shard operation would create two new shards, and mark the 
old one as inactive, for example.
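Dennis's expected "best state across replicas" view isn't what Solr stores, but it's easy to derive client-side once you've parsed clusterstate.json/state.json into the dict shape shown in his post (a sketch; the ranking and helper name are illustrative):

```python
# Rank states best-to-worst; a shard's effective availability is the
# best state reported by any of its replicas.
STATE_RANK = {"active": 0, "recovering": 1, "down": 2, "recovery_failed": 3}

def effective_shard_state(shard):
    states = [r["state"] for r in shard["replicas"].values()]
    return min(states, key=lambda s: STATE_RANK.get(s, 99))

shard1 = {
    "state": "active",  # Solr's stored state: "still part of the collection"
    "replicas": {
        "core_node7": {"state": "down"},
        "core_node9": {"state": "down"},
        "core_node2": {"state": "down"},
    },
}
print(effective_shard_state(shard1))  # -> down
```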




On 2/26/16, 8:50 AM, "Dennis Gove"  wrote:

>In clusterstate.json (or just state.json in new versions) I'm seeing the
>following
>
>"shard1":{
>"range":"8000-d554",
>"state":"active",
>"replicas":{
>  "core_node7":{
>"core":"people_shard1_replica3",
>"base_url":"http://192.168.2.32:8983/solr;,
>"node_name":"192.168.2.32:8983_solr",
>"state":"down"},
>  "core_node9":{
>"core":"people_shard1_replica2",
>"base_url":"http://192.168.2.32:8983/solr;,
>"node_name":"192.168.2.32:8983_solr",
>"state":"down"},
>  "core_node2":{
>"core":"people_shard1_replica1",
>"base_url":"http://192.168.2.32:8983/solr;,
>"node_name":"192.168.2.32:8983_solr",
>"state":"down"}
>}
>},
>
>All replicas are down (I hosed the index for one of the replicas on purpose
>to simulate this) and each replica is showing its state accurately as
>"down". But the shard state is still showing "active". I would expect the
>shard state to reflect the availability of that shard (ie, the best state
>across all the replicas). For example, if one replica is active then the
>shard state is active, if two replicas are recovering and one is down then
>the shard state shows recovering, etc...
>
>What I'm seeing, however, doesn't match my expectation so I'm wondering
>what is shard state showing?
>
>Thanks,
>Dennis


Escaping characters in a nested query

2016-02-26 Thread Jamie Johnson
When using nested queries of the form q=_query_:"my_awesome:query", what
needs to be escaped in the query portion?  Just using the admin UI the
following works

_query_:"+field\\:with\\:special"
_query_:"+field\\:with\\~special"
_query_:"+field\\:with\\"

but the same doesn't work for quotes, i.e.

_query_:"+field\\:with\\"special"

throws a org.apache.solr.search.SyntaxError.  If I do

_query_:"+field\\:with\\\"special" it executes, though I am not sure why
quotes require different escaping.

I am currently running solr 4.10.4, any thoughts?
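The pattern Jamie found can be captured mechanically (a sketch; `nested_query` is an illustrative helper, not a Solr API). The quoted string around the nested query consumes one level of escaping, so backslashes double while quotes gain exactly one backslash:

```python
def nested_query(inner: str) -> str:
    # The outer parser consumes one escaping level inside _query_:"...":
    # every backslash in the inner query is doubled, and every double
    # quote gets one backslash. An inner-level escaped colon (\:) thus
    # arrives with one backslash and leaves with two (\\:), while a quote
    # has no inner escape and gains only the single outer one - which is
    # why quotes appear to need "different" escaping.
    escaped = inner.replace('\\', '\\\\').replace('"', '\\"')
    return f'_query_:"{escaped}"'

print(nested_query(r'+field\:with\"special'))
# -> _query_:"+field\\:with\\\"special"
```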


Re: Thread Usage

2016-02-26 Thread Toke Eskildsen
Azazel K  wrote:
> We have solr cluster with 2 shards running 2 nodes on each shard.
> They are beefy physical boxes with index size of 162 GB , RAM of
> about 96 GB and around 153M documents.

There is a non-trivial overhead for sharding: Using a single shard increases 
throughput. Have you tried with 1 shard to see if the latency is acceptable for 
that?

> Two times this week we have seen the thread usage spike from the
> usual 1000 to 4000 on all nodes at the same time and bring down
> the cluster.

First guess: You are updating too frequently and hitting multiple overlapping 
searchers, deteriorating performance which leads to more overlapping searchers 
and so on. Try looking in the log:
https://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F


Anyway, 1000 threads sounds high. How many CPUs are on your machines? 32 on 
each? That is a total of 128 CPUs for your 4 machines, meaning that each CPU is 
working on about 10 concurrent requests. They might be competing for resources: 
Have you tried limiting the amount of concurrent request and using a queue? 
That might give you better performance (and lower heap requirements a bit).

- Toke Eskildsen


[ANNOUNCE] YCSB 0.7.0 Release

2016-02-26 Thread Kevin Risden
On behalf of the development community, I am pleased to announce the
release of YCSB 0.7.0.

Highlights:

* GemFire binding replaced with Apache Geode (incubating) binding
* Apache Solr binding was added
* OrientDB binding improvements
* HBase Kerberos support and use single connection
* Accumulo improvements
* JDBC improvements
* Couchbase scan implementation
* MongoDB improvements
* Elasticsearch version increase to 2.1.1

Full release notes, including links to source and convenience binaries:
https://github.com/brianfrankcooper/YCSB/releases/tag/0.7.0

This release covers changes from the last 1 month.


Re: Query time de-boost

2016-02-26 Thread shamik
Thanks Walter,   I've tried this earlier and it works. But the problem in my
case is that I have boosting on a few Source parameters as well. My ideal "bq"
should look like this:

 bq=Source:simplecontent^10 Source:Help^20 (*:*
-ContentGroup-local:("Developer"))^99

But this is not going to work.

I'm working on the functional query side to see if this can be done.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-time-de-boost-tp4259309p4260077.html
Sent from the Solr - User mailing list archive at Nabble.com.


Thread Usage

2016-02-26 Thread Azazel K
Hi,


We have a Solr cluster with 2 shards, running 2 nodes on each shard.  They are 
beefy physical boxes with an index size of 162 GB, about 96 GB of RAM, and 
around 153M documents.


Two times this week we have seen the thread usage spike from the usual 1000 to 
4000 on all nodes at the same time and bring down the cluster.  We had to 
divert the traffic(search and update), perform a rolling restart each time, and 
put them back in.  Has anyone faced this issue before?  We don't have any other 
process running on the box that could cause such a huge spike in thread usage 
on all nodes at the same time.


Any pointers appreciated.


Thanks

A


Re: Solr does not receive any documents by nutch

2016-02-26 Thread Shawn Heisey
On 2/26/2016 9:22 AM, Merlin Morgenstern wrote:
> during the nutch run there are no activites inside the logfile as it seems.

Are you looking at the actual *logfile*, or the "Logging" tab in the
admin UI?  The Logging tab will only show you entries that are at least
WARN severity.  Most of what Solr logs is at INFO and will not show up
in the admin UI.

If the logfile doesn't have any activity during the run, then you will
need to ask the nutch mailing list for help, to figure out why it is not
properly sending requests to Solr.  I thought about a firewall, but
you're using localhost, which I would not expect a firewall to block.

Thanks,
Shawn



Shard State vs Replica State

2016-02-26 Thread Dennis Gove
In clusterstate.json (or just state.json in new versions) I'm seeing the
following

"shard1":{
"range":"8000-d554",
"state":"active",
"replicas":{
  "core_node7":{
"core":"people_shard1_replica3",
"base_url":"http://192.168.2.32:8983/solr;,
"node_name":"192.168.2.32:8983_solr",
"state":"down"},
  "core_node9":{
"core":"people_shard1_replica2",
"base_url":"http://192.168.2.32:8983/solr;,
"node_name":"192.168.2.32:8983_solr",
"state":"down"},
  "core_node2":{
"core":"people_shard1_replica1",
"base_url":"http://192.168.2.32:8983/solr;,
"node_name":"192.168.2.32:8983_solr",
"state":"down"}
}
},

All replicas are down (I hosed the index for one of the replicas on purpose
to simulate this) and each replica is showing its state accurately as
"down". But the shard state is still showing "active". I would expect the
shard state to reflect the availability of that shard (ie, the best state
across all the replicas). For example, if one replica is active then the
shard state is active, if two replicas are recovering and one is down then
the shard state shows recovering, etc...

What I'm seeing, however, doesn't match my expectation so I'm wondering
what is shard state showing?

Thanks,
Dennis


Re: Solr does not receive any documents by nutch

2016-02-26 Thread Shawn Heisey
On 2/26/2016 9:22 AM, Merlin Morgenstern wrote:
> during the nutch run there are no activites inside the logfile as it seems.
> However the logfile from the admin interface shows the following:
>
> 2/26/2016, 5:20:04 PM WARN null SolrConfig Couldn't add files from
> /usr/local/Cellar/solr/5.4.1/contrib/extraction/lib filtered by .*\.jar to
> classpath: /usr/local/Cellar/solr/5.4.1/contrib/extraction/lib
> 2/26/2016, 5:20:04 PM WARN null SolrConfig Couldn't add files from
> /usr/local/Cellar/solr/5.4.1/dist filtered by solr-cell-\d.*\.jar to
> classpath: /usr/local/Cellar/solr/5.4.1/dist

These are warnings from <lib> configurations in your solrconfig.xml file
that point to locations that are not valid.  If this is the only thing
you have in the logfile that looks suspicious, you can ignore it, or
remove the <lib> config.

Thanks,
Shawn



Re: Solr does not receive any documents by nutch

2016-02-26 Thread Merlin Morgenstern
during the nutch run there are no activites inside the logfile as it seems.
However the logfile from the admin interface shows the following:

2/26/2016, 5:20:04 PM WARN null SolrConfig Couldn't add files from
/usr/local/Cellar/solr/5.4.1/contrib/extraction/lib filtered by .*\.jar to
classpath: /usr/local/Cellar/solr/5.4.1/contrib/extraction/lib
2/26/2016, 5:20:04 PM WARN null SolrConfig Couldn't add files from
/usr/local/Cellar/solr/5.4.1/dist filtered by solr-cell-\d.*\.jar to
classpath: /usr/local/Cellar/solr/5.4.1/dist

2016-02-26 17:14 GMT+01:00 Shawn Heisey :

> On 2/26/2016 8:34 AM, Merlin Morgenstern wrote:
> > Unfortunately no documents get added to Solr and no error log entries show
> > up. It seems as if it is working, but the documents are not there.
>
> Is there anything happening in the Solr logfile at all during the nutch
> run?  I'm talking about any activity at all, not just errors.
>
> For the 4.x install I have no way to know where your logfile is, but if
> you used the service installer script for the newer version, the log
> will usually be in /var/solr/logs.
>
> Thanks,
> Shawn
>
>


Re: Solr does not receive any documents by nutch

2016-02-26 Thread Shawn Heisey
On 2/26/2016 8:34 AM, Merlin Morgenstern wrote:
> Unfortunately no documents get added to Solr and no error log entries show
> up. It seems as if it is working, but the documents are not there.

Is there anything happening in the Solr logfile at all during the nutch
run?  I'm talking about any activity at all, not just errors.

For the 4.x install I have no way to know where your logfile is, but if
you used the service installer script for the newer version, the log
will usually be in /var/solr/logs.

Thanks,
Shawn



Solr does not receive any documents by nutch

2016-02-26 Thread Merlin Morgenstern
I have nutch 1.11 installed together with solr 4.10.4 AND solr 5.4.1 on OS
X 10.11.

Nutch and Solr seem to work as nutch starts to index and solr shows the
admin interface together with the configured core.

Unfortunately no documents get added to Solr and no error log entries show
up. It seems as if it is working, but the documents are not there.

The command I am using is:

./bin/crawl -i -D solr.server.url=http://localhost:8983/solr/collection1
urls/ crawl/ 1

AND

./bin/crawl -i -D solr.server.url=http://localhost:8984/solr/crawler urls/
crawl/ 1

Nutch starts to crawl one configured domain and shows "fetching ...". I
then stop the process (Ctrl-C) after about 3 minutes as it would otherwise
crawl the entire domain.

The Solr schema is configured, and all other configuration found in
tutorials has been done as well.

How could I approach this problem and get started? Thank you in advance for
any help!


Re: Deleting by query

2016-02-26 Thread Marc Burt

Thanks Jan,

That worked.

Kind Regards,

Marc

On 02/26/2016 01:43 PM, Jan Høydahl wrote:

Hi

Try this instead

/solr/de/update?stream.body=<delete><query>last_seen:[* TO 
2016-02-24T00:00:00Z]</query></delete>&commit=true
…that is if you have streaming enabled in solrconfig. Else do a POST instead
Note that I put a commit=true at the end, so you will see the changes 
immediately.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com


26. feb. 2016 kl. 13.57 skrev Marc Burt :

Hi,

I'm trying to delete by query using the following:

/solr/de/update?last_seen:[* TO 
2016-02-24T00:00:00.00Z]

/solr/de/select?q=last_seen:[* TO 2016-02-24T00:00:00.00Z] returns the correct 
documents to be deleted.

Last time I attempted this using the above I somehow managed to delete all 
documents in the node rather than only the documents returned by the query.

Can anyone confirm that this is the correct method to delete documents by query?

--

Kind Regards,

Marc





Re: Solr | index | Lock Type

2016-02-26 Thread Shawn Heisey
On 2/26/2016 7:48 AM, Prateek Jain J wrote:
> WARN  - 2016-02-26 05:49:29.191; org.apache.solr.core.SolrCore; [cm_history] 
> WARNING: Solr index directory '/foo/solr/cm_history/data/index/' is locked.  
> Unlocking...
> WARN  - 2016-02-26 05:49:29.680; org.apache.solr.rest.ManagedResource; No 
> stored data found for /rest/managed
> Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain 
> timed out: 
> SimpleFSLock@/foo/solr/data/index/write.lock

Looks like the index cannot be locked.  Perhaps the write.lock file
needs to be deleted before starting Solr.  Deleting that file might fix
it.  You will want to be absolutely sure that there are no running Solr
instances that are trying to use that index directory.

Why are you using the Simple lock?  The Native lock is the default, and
unless your index data is on an NFS share, is almost always the best
choice.  FYI: NFS shares are not recommended.  Regular filesystems are
supported a lot better.
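For reference, the lock type is set per core in solrconfig.xml's <indexConfig> section; switching back to the default native lock looks roughly like this (a sketch):

```xml
<indexConfig>
  <!-- "native" (OS-level locking) is the default and the usual best
       choice, unless the index lives on an NFS share -->
  <lockType>${solr.lock.type:native}</lockType>
</indexConfig>
```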

Thanks,
Shawn



Re: Query time de-boost

2016-02-26 Thread Jack Krupansky
Could you share your actual numbers and test case? IOW, the document score
without ^0.01 and with ^0.01.

Again, to repeat, the specific boost factor may be positive, but the effect
of a fractional boost is to reduce, not add, to the score, so that a score
of 0.5 boosted by 0.1 would become 0.05. IOW, it de-boosts occurrences of
the term.

The point remains that you do not need a "negative boost" to de-boost a
term.
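Jack's point in numbers (illustrative arithmetic only; real Lucene scores also include tf-idf, norms, and so on):

```python
base_score = 0.5

# Intra-query boosts multiply into the score, so a fractional boost
# shrinks a term's contribution - no negative boost needed to de-boost.
boosted_up = base_score * 2.0    # term^2   -> 1.0
boosted_down = base_score * 0.1  # term^0.1 -> 0.05

assert boosted_down < base_score < boosted_up
```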


-- Jack Krupansky

On Fri, Feb 26, 2016 at 4:01 AM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi Jack,
> I just checked on 5.5 and 0.1 is positive boost.
>
> Regards,
> Emir
>
>
> On 26.02.2016 01:11, Jack Krupansky wrote:
>
>> 0.1 is a fractional boost - all intra-query boosts are multiplicative, not
>> additive, so term^0.1 reduces the term by 90%.
>>
>> -- Jack Krupansky
>>
>> On Wed, Feb 24, 2016 at 11:29 AM, shamik  wrote:
>>
>> Binoy, 0.1 is still a positive boost. With title getting the highest
>>> weight,
>>> this won't make any difference. I've tried this as well.
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>>
>>> http://lucene.472066.n3.nabble.com/Query-time-de-boost-tp4259309p4259552.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>


Solr | index | Lock Type

2016-02-26 Thread Prateek Jain J

Hi All,

We are seeing an issue with solr where solr is failing to initialize the cores 
with the following errors. We have gone through the solr documentation on these 
errors and its mentioned that this could happen when solr is running in 
clustered mode. But in our case solr is deployed in 2N-active deployment model. 
Also all the other configurations are done as recommended by solr.

WARN  - 2016-02-26 05:49:29.050; org.apache.solr.schema.IndexSchema; no 
uniqueKey specified in schema.
WARN  - 2016-02-26 05:49:29.075; org.apache.solr.schema.IndexSchema; no 
uniqueKey specified in schema.
WARN  - 2016-02-26 05:49:29.191; org.apache.solr.core.SolrCore; [cm_history] 
WARNING: Solr index directory '/foo/solr/cm_history/data/index/' is locked.  
Unlocking...
WARN  - 2016-02-26 05:49:29.680; org.apache.solr.rest.ManagedResource; No 
stored data found for /rest/managed
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed 
out: 
SimpleFSLock@/foo/solr/data/index/write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:89)
at org.apache.lucene.index.IndexWriter.(IndexWriter.java:710)
at 
org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:77)
at 
org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)


ERROR - 2016-02-26 07:15:01.412; org.apache.solr.update.SolrIndexWriter; 
SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE 
RESOURCE LEAK!!!
ERROR - 2016-02-26 07:15:01.413; org.apache.solr.update.SolrIndexWriter; Error 
closing IndexWriter, trying rollback
java.lang.NullPointerException
   at 
org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:985)
   at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:935)
   at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:897)
   at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:132)

Here is the scenario:


1.   We are using Solr 4.8.1.

2.   This installation runs 3 different cores. All cores use 
${solr.lock.type:simple}.

3.   We are seeing the above issue while Solr is coming up for the first 
time.

Any pointers on what could be corrected here?


Regards,
Prateek Jain



Update command not working

2016-02-26 Thread Mike Thomsen
I posted this to http://localhost:8983/solr/default-collection/update and
it treated it like I was adding a whole document, not a partial update:

{
    "id": "0be0daa1-a6ee-46d0-ba05-717a9c6ae283",
    "tags": {
        "add": [ "news article" ]
    }
}

In the logs, I found this:

2016-02-26 14:07:50.831 ERROR (qtp2096057945-17) [c:default-collection
s:shard1_1 r:core_node21 x:default-collection] o.a.s.h.RequestHandlerBase
org.apache.solr.common.SolrException:
[doc=0be0daa1-a6ee-46d0-ba05-717a9c6ae283] missing required field: data_type
at
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:198)
at
org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:83)
at
org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:273)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:207)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:169)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)

Does this make any sense?  I sent updates just fine a day or two ago like
that; now it is acting like the update request is a whole new document.

Thanks,

Mike
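For comparison, the payload Mike posted matches the documented atomic-update shape (wrapped in a JSON array for the /update handler):

```json
[
  {
    "id": "0be0daa1-a6ee-46d0-ba05-717a9c6ae283",
    "tags": { "add": ["news article"] }
  }
]
```

When Solr treats such a payload as a brand-new document instead of an atomic update (and then trips the required-field check, as in the error above), possible causes worth checking are that the atomic-update machinery isn't engaged: `<updateLog/>` needs to be enabled in solrconfig.xml, the update chain must include the distributed update processor, and the request should be sent with Content-Type: application/json. These are hedged guesses from the symptom, not a confirmed diagnosis.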


Re: Deleting by query

2016-02-26 Thread Jan Høydahl
Hi

Try this instead

/solr/de/update?stream.body=<delete><query>last_seen:[* TO 
2016-02-24T00:00:00Z]</query></delete>&commit=true
…that is if you have streaming enabled in solrconfig. Else do a POST instead
Note that I put a commit=true at the end, so you will see the changes 
immediately.
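A small sketch of building that request with the XML command safely URL-encoded (the helper name and base URL are illustrative, not a Solr API):

```python
from urllib.parse import urlencode

# Illustrative helper: builds the delete-by-query URL Jan describes,
# URL-encoding the <delete><query>...</query></delete> command.
def delete_by_query_url(base, query):
    params = {
        "stream.body": f"<delete><query>{query}</query></delete>",
        "commit": "true",  # make the deletion visible immediately
    }
    return f"{base}/solr/de/update?{urlencode(params)}"

url = delete_by_query_url("http://localhost:8983",
                          "last_seen:[* TO 2016-02-24T00:00:00Z]")
```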

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 26. feb. 2016 kl. 13.57 skrev Marc Burt :
> 
> Hi,
> 
> I'm trying to delete by query using the following:
> 
> /solr/de/update?last_seen:[* TO 
> 2016-02-24T00:00:00.00Z]
> 
> /solr/de/select?q=last_seen:[* TO 2016-02-24T00:00:00.00Z] returns the 
> correct documents to be deleted.
> 
> Last time I attempted this using the above I somehow managed to delete all 
> documents in the node rather than only the documents returned by the query.
> 
> Can anyone confirm that this is the correct method to delete documents by 
> query?
> 
> -- 
> 
> Kind Regards,
> 
> Marc
> 



Re: Is it different? q=(field1:value1 OR field2:value2) and q=field1:value1 OR field2:value2

2016-02-26 Thread John Blythe
Not that I'm aware of. I think you could also simply use q=field1:value1
field2:value2, in which the OR is implied.

-- 
*John Blythe*
Product Manager & Lead Developer

251.605.3071 | j...@curvolabs.com
www.curvolabs.com

58 Adams Ave
Evansville, IN 47713

On Fri, Feb 26, 2016 at 8:31 AM, Aurélien MAZOYER <
aurelien.mazo...@francelabs.com> wrote:

> Hi,
>
> I think both queries are rewritten to the same query. You can use
> the debugQuery=on parameter to see how the query is rewritten and then
> compare if you get the same result for each query.
>
> Regards,
>
> Aurélien
>
> Le 26/02/2016 14:27, vitaly bulgakov a écrit :
>
>> Is there a difference when we put query in brackets?
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Is-it-different-q-field1-value1-OR-field2-value2-and-q-field1-value1-OR-field2-value2-tp4259976.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>


Re: Is it different? q=(field1:value1 OR field2:value2) and q=field1:value1 OR field2:value2

2016-02-26 Thread Aurélien MAZOYER

Hi,

I think both queries are rewritten to the same query. You can use 
the debugQuery=on parameter to see how the query is rewritten and then 
compare whether you get the same result for each query.


Regards,

Aurélien

Le 26/02/2016 14:27, vitaly bulgakov a écrit :

Is there a difference when we put query in brackets?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-different-q-field1-value1-OR-field2-value2-and-q-field1-value1-OR-field2-value2-tp4259976.html
Sent from the Solr - User mailing list archive at Nabble.com.




Is it different? q=(field1:value1 OR field2:value2) and q=field1:value1 OR field2:value2

2016-02-26 Thread vitaly bulgakov
Is there a difference when we put query in brackets? 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-different-q-field1-value1-OR-field2-value2-and-q-field1-value1-OR-field2-value2-tp4259976.html
Sent from the Solr - User mailing list archive at Nabble.com.


Deleting by query

2016-02-26 Thread Marc Burt

Hi,

I'm trying to delete by query using the following:

/solr/de/update?last_seen:[* TO 
2016-02-24T00:00:00.00Z]


/solr/de/select?q=last_seen:[* TO 2016-02-24T00:00:00.00Z] returns the 
correct documents to be deleted.


Last time I attempted this using the above I somehow managed to delete 
all documents in the node rather than only the documents returned by the 
query.


Can anyone confirm that this is the correct method to delete documents 
by query?


--

Kind Regards,

Marc



Re: Query time de-boost

2016-02-26 Thread Emir Arnautovic

Hi Jack,
I just checked on 5.5 and 0.1 is positive boost.

Regards,
Emir

On 26.02.2016 01:11, Jack Krupansky wrote:

0.1 is a fractional boost - all intra-query boosts are multiplicative, not
additive, so term^0.1 reduces the term by 90%.

-- Jack Krupansky

On Wed, Feb 24, 2016 at 11:29 AM, shamik  wrote:


Binoy, 0.1 is still a positive boost. With title getting the highest
weight,
this won't make any difference. I've tried this as well.



--
View this message in context:
http://lucene.472066.n3.nabble.com/Query-time-de-boost-tp4259309p4259552.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/