Re: master replication: what are 'master (searching)' and 'master (replicable)' fields

2017-07-31 Thread Erick Erickson
This isn't a problem. On the master node in master/slave, '(searching)'
is the index version as of the last time the _master_ opened a searcher,
whereas '(replicable)' is the version as of the last hard commit. These
are identical if you do hard commits with openSearcher=true.

They are different if you have openSearcher=false for your hard
commits and then do a soft commit.
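For reference, the two commit behaviors described above map to standard settings in solrconfig.xml. A minimal sketch (the interval values are illustrative, not a recommendation):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit: advances the (replicable) version; with
       openSearcher=false it does NOT open a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit: opens a new searcher, advancing (searching) -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>
```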

For SolrCloud these are irrelevant distinctions.

For a lot about the difference between commits, see:
https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Mon, Jul 31, 2017 at 10:33 AM, Nawab Zada Asad Iqbal
 wrote:
> Hi,
>
> From the Solr console, I see 'master (searching)' and 'master
> (replicable)' fields on the `host/solr/#/core_1/replication` page, and I am
> wondering how this impacts me given that I don't have any replicas.
>
> If I don't have any replicas, does it make any impact on performance by
> enabling or disabling replication?
>
> Also, when I am indexing new documents, I see the `master (searching)` size
> growing even when I disable replication from this UI. What does that
> imply?
>
>
> Regards
> Nawab


RE: Arabic words search in solr

2017-07-31 Thread Phil Scadden
Further to that: what results do you get when you put those indexed terms into
the Analysis tool in the Solr admin UI?

-Original Message-
From: Phil Scadden [mailto:p.scad...@gns.cri.nz]
Sent: Tuesday, 1 August 2017 9:06 a.m.
To: solr-user@lucene.apache.org
Subject: RE: Arabic words search in solr

Am I correct in assuming that you have the problem searching only when there is 
a hyphen in your indexed text? If so, that would suggest that you need to 
use a different tokenizer when indexing - it looks like the hyphen is removed 
and the words on each side are concatenated - hence both terms are needed to find the text.

-Original Message-
From: mohanmca01 [mailto:mohanmc...@gmail.com]
Sent: Tuesday, 1 August 2017 1:18 a.m.
To: solr-user@lucene.apache.org
Subject: Re: Arabic words search in solr

Please help me on this...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Arabic-words-search-in-solr-tp4317733p4348372.html
Sent from the Solr - User mailing list archive at Nabble.com.
Notice: This email and any attachments are confidential and may not be used, 
published or redistributed without the prior written consent of the Institute 
of Geological and Nuclear Sciences Limited (GNS Science). If received in error 
please destroy and immediately notify GNS Science. Do not copy or disclose the 
contents.


programmatically setting filter query not working for me

2017-07-31 Thread Steve Pruitt
My use case is programmatically setting a filter query before executing the 
query.  I have a search component in the /select first-components list.
This component determines the filter query value and sets it in the process 
method.  I pass in a custom param to trigger the filter creation.

I grab the params from the request:
SolrParams solrParams = rb.req.getParams();

I next create a collection called rawParams with:
Map<String, String[]> rawParams = 
SolrParams.toMultiMap(solrParams.toNamedList());

I create the structure to hold my filters:
String[] filters = getFilter...

and then assign the new params like this
rawParams.put(CommonParams.FQ, filters);
solrParams = new MultiMapSolrParams(rawParams);
rb.req.setParams(solrParams);

The filter returns only a subset of the documents matching the terms.  In my 
test case it's 2 out of 21 documents.
The problem is that the response contains all 21 documents.

I have a simple search component in the last-components list simply to set a 
breakpoint to see the query results.

I have tried two alternate tests using the Solr Admin console.

I first verified the filter query by directly pasting it in the fq field and it 
works ok.  I got back the expected 2 documents.

I next did the above and also included my custom params.  And this worked.  As 
expected, the SolrParams value of the req has a double entry for the filter.  
The values are identical.

So without using the fq field in the Console, I can't determine why setting it via 
the Solr params with the CommonParams.FQ key doesn't work.

As a separate experiment, I created a custom handler with a custom component 
subclassing QueryComponent.  I do the filter query setup as above, and calling 
super.process(rb) works.
But I don't want to use a custom handler.

Thanks in advance to anyone who can point out where my assumptions are wrong.

Thanks,
Steve
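One variant worth trying (an assumption on my part, not something verified in this thread): set the params in prepare() rather than process(), in case QueryComponent has already consumed fq by the time a later component's process() runs. A rough, hypothetical sketch against the Solr plugin API, using ModifiableSolrParams instead of the toNamedList/toMultiMap round trip:

```java
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

// Hypothetical first-component that injects a filter query.
public class FilterInjectingComponent extends SearchComponent {

    @Override
    public void prepare(ResponseBuilder rb) {
        // Copy the request params into a mutable view, add the filter,
        // and set the result back on the request.
        ModifiableSolrParams params = new ModifiableSolrParams(rb.req.getParams());
        params.add(CommonParams.FQ, "type:book");  // hypothetical filter value
        rb.req.setParams(params);
    }

    @Override
    public void process(ResponseBuilder rb) {
        // nothing to do at process time
    }

    @Override
    public String getDescription() {
        return "Injects a filter query before query execution";
    }
}
```

This is only a sketch under the stated assumption about component ordering; debug=true output should show whether the injected fq actually reaches the parsed filter list.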


RE: Arabic words search in solr

2017-07-31 Thread Phil Scadden
Am I correct in assuming that you have the problem searching only when there is 
a hyphen in your indexed text? If so, that would suggest that you need to 
use a different tokenizer when indexing - it looks like the hyphen is removed 
and the words on each side are concatenated - hence both terms are needed to find the text.

-Original Message-
From: mohanmca01 [mailto:mohanmc...@gmail.com]
Sent: Tuesday, 1 August 2017 1:18 a.m.
To: solr-user@lucene.apache.org
Subject: Re: Arabic words search in solr

Please help me on this...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Arabic-words-search-in-solr-tp4317733p4348372.html
Sent from the Solr - User mailing list archive at Nabble.com.
Notice: This email and any attachments are confidential and may not be used, 
published or redistributed without the prior written consent of the Institute 
of Geological and Nuclear Sciences Limited (GNS Science). If received in error 
please destroy and immediately notify GNS Science. Do not copy or disclose the 
contents.


Move index directory to another partition

2017-07-31 Thread Mahmoud Almokadem
Hello,

I have a SolrCloud of four instances on Amazon, and the EBS volumes that
contain the data on every node are going to be full; unfortunately Amazon
doesn't support expanding the EBS. So, I'll attach larger EBS volumes and
move the index to them.

I can stop updates on the index, but I'm afraid to use the "cp" command to
copy files that are part of an in-progress merge operation.

The copy operation may take several hours.

How can I move the data directory without stopping the instance?

Thanks,
Mahmoud


Re: Disadvantages of having many cores

2017-07-31 Thread Otis Gospodnetić
Hi,

Core per day is not too bad.  I assume you'll want to keep 7 days or maybe
30 or 60 or 180 days worth of logs.  That won't result in too many cores,
given adequate hardware.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Fri, Jul 28, 2017 at 9:04 AM, Chellasamy G 
wrote:

> Hi,
>
>
>
> I am working on a log management tool and considering to use solr to
> index/search the logs.
>
> I have few doubts about how to organize or create the cores.
>
>
>
> The tool should process 200 million events per day, with each event
> containing 40 to 50 fields. Currently I have planned to create a core per
> day, pushing all the data to that day's core. This may lead to the creation
> of many cores. Is this a good design? If not, please suggest a good
> design. (Also, if multiple cores are used, will it slow down the Solr
> process's startup?)
>
>
>
>
>
> Thanks,
>
> Satyan
>
>
>
>


Re: logging support in Lucene code

2017-07-31 Thread Nawab Zada Asad Iqbal
Thanks Shawn for the detailed context.
I saw a Logger (java.util.logging) in one class in the lucene folder, hence
I thought that logging is now properly supported. Since I am using Solr
(and indirectly Lucene), I will use whatever Solr is using.

Not depending on any concrete logger is good for Lucene, as it is included
in other projects too.


Regards
Nawab

On Fri, Jul 28, 2017 at 6:57 AM, Shawn Heisey  wrote:

> On 7/27/2017 10:57 AM, Nawab Zada Asad Iqbal wrote:
> > I see a lot of discussion on this topic from almost 10 years ago: e.g.,
> > https://issues.apache.org/jira/browse/LUCENE-1482
> >
> > For 4.5, I relied on 'System.out.println' for writing information for
> > debugging in production.
> >
> > In 6.6, I notice that some classes in Lucene are instantiating a Logger,
> > should I use Logger instead? I tried to log with it, but I don't see any
> > output in logs.
>
> You're asking about this on a Solr list, not a Lucene list.  I am not
> subscribed to the main Lucene user list, so I do not know if you have
> also sent this question to that list.
>
> Solr uses slf4j for logging.  Many of its dependencies have chosen other
> logging frameworks.
>
> https://www.slf4j.org/
>
> With slf4j, you can utilize just about any supported logging
> implementation to do the actual end logging.  The end implementation
> chosen by the Solr project for version 4.3 and later is log4j 1.x.
>
> It is my understanding that Lucene's core module has zero dependencies
> -- it's pure Java.  That would include any external logging
> implementation.  I do not know if the core module even uses
> java.util.logging ... a quick grep for "Logger" suggests that there are
> no loggers in use in the core module at all, but it's possible that I
> have not scanned for the correct text.  I did notice that
> TestIndexWriter uses a PrintStream for logging, and Shalin's reply has
> reminded me about the infoStream feature.
>
> Looking at the source code, it does appear that some of the other Lucene
> modules do use a logger. Some of them appear to use the logger built
> into java, others seem to use one of the third-party implementations
> like slf4j.  Some of the dependent jars pulled in for non-core Lucene
> modules depend on various logging implementations.
>
> Logging frameworks can be the center of a religious flamewar.  Opinions
> run strong.  IMHO, if you are writing your own code, the best option is
> slf4j, bound to whatever end logging implementation you are most
> comfortable using.  You can install slf4j jars to intercept logging sent
> to the other common logging implementations and direct those through
> slf4j so they end up in the same place as everything else.
>
> Note if you want to use log4j2 as your end logging destination with
> slf4j: log4j2 comes with jars implementing the slf4j classes, so you're
> probably going to want to use those.
>
> Thanks,
> Shawn
>
>
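To illustrate the "logger built into Java" (java.util.logging) mentioned above, here is a self-contained sketch; note this is separate from Solr's own slf4j/log4j setup, and the handler wiring shown is just one way to direct the output:

```java
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class JulDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        // Skip the default stderr handler so output goes only where we direct it.
        log.setUseParentHandlers(false);
        StreamHandler handler = new StreamHandler(System.out, new SimpleFormatter());
        log.addHandler(handler);
        log.info("searcher opened");
        handler.flush();  // StreamHandler buffers records until flushed
    }
}
```

In a plugin running inside Solr, the usual choice would instead be an slf4j Logger (from the slf4j-api jar) so the messages land in Solr's configured log files.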


Re: logging support in Lucene code

2017-07-31 Thread Nawab Zada Asad Iqbal
Thanks Shalin.
I actually had that config set to true, so it seems that I may not be exercising
the right scenario to hit that log line.

On Fri, Jul 28, 2017 at 1:47 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> Lucene does not use a logger framework. But if you are using Solr then you
> can route the infoStream logging to Solr's log files by setting an option
> in the solrconfig.xml. See
> http://lucene.apache.org/solr/guide/6_6/indexconfig-in-solrconfig.html#
> IndexConfiginSolrConfig-OtherIndexingSettings
>
> On Fri, Jul 28, 2017 at 11:13 AM, Nawab Zada Asad Iqbal 
> wrote:
>
> > Any doughnut for me?
> >
> >
> > Regards
> > Nawab
> >
> > On Thu, Jul 27, 2017 at 9:57 AM Nawab Zada Asad Iqbal 
> > wrote:
> >
> > > Hi,
> > >
> > > I see a lot of discussion on this topic from almost 10 years ago: e.g.,
> > > https://issues.apache.org/jira/browse/LUCENE-1482
> > >
> > > For 4.5, I relied on 'System.out.println' for writing information for
> > > debugging in production.
> > >
> > > In 6.6, I notice that some classes in Lucene are instantiating a
> Logger,
> > > should I use Logger instead? I tried to log with it, but I don't see
> any
> > > output in logs.
> > >
> > >
> > > Regards
> > > Nawab
> > >
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
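The solrconfig.xml option Shalin refers to looks roughly like this (a hedged fragment; infoStream is an indexConfig setting):

```xml
<indexConfig>
  <!-- route Lucene's low-level IndexWriter debugging into Solr's log -->
  <infoStream>true</infoStream>
</indexConfig>
```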


master replication: what are 'master (searching)' and 'master (replicable)' fields

2017-07-31 Thread Nawab Zada Asad Iqbal
Hi,

From the Solr console, I see 'master (searching)' and 'master
(replicable)' fields on the `host/solr/#/core_1/replication` page, and I am
wondering how this impacts me given that I don't have any replicas.

If I don't have any replicas, does it make any impact on performance by
enabling or disabling replication?

Also, when I am indexing new documents, I see the `master (searching)` size
growing even when I disable replication from this UI. What does that
imply?


Regards
Nawab


Re: HTTP ERROR 504 - Optimize

2017-07-31 Thread Erick Erickson
When? When you optimize? During queries? If the latter, I doubt you'll fix
it with optimization.

On Jul 31, 2017 1:19 AM, "marotosg"  wrote:

> Basically an issue with loadbalancer timeout.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/HTTP-ERROR-504-Optimize-tp4345815p4348330.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Arabic words search in solr

2017-07-31 Thread mohanmca01
Please help me on this...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Arabic-words-search-in-solr-tp4317733p4348372.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Input and Output format

2017-07-31 Thread Ranganath B N

Hi All,

 Can you point me to some of the implementations of Solr Input and Output 
formats? I want to understand the distributed implementation 
approach.


Thanks,
Ranganath B. N.



Re: Getting IO Exception while Indexing

2017-07-31 Thread mesenthil1
We printed in most of the places but could not find any significant
differences between successful and error documents.  We modified our logic
to use a direct HTTP client and posted the JSON messages directly to the
Solr cloud.  Most of the ids are fine now.

But we still see the same issue with a small number of documents. When we run
the same code from different Linux boxes, it is fine. When we enabled Apache
dumpio, the payload was not completely passed to Apache when executing from
this machine.  While collecting the Apache dump in error_log, we see the
following error message:

"(70008)Partial results are valid but processing is incomplete: proxy:
prefetch request body failed to"

As the request payload [incomplete or partial JSON] is not full, the
request is not forwarded to Solr at all; it fails at the Apache level and
is returned as 400.  On the client side we get a Connection reset exception.

Any help would be really helpful.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Getting-IO-Exception-while-Indexing-Documents-in-SolrCloud-tp4346801p4348367.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: edismax, pf2 and use of both AND and OR parameter

2017-07-31 Thread Niraj Aswani
Hi Aman,

Thank you very much for your reply.

Let me elaborate my question a bit more using your example in this case.

AFAIK, what the pf2 parameter is doing to the query is adding the following
phrase queries:

(_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail")

There are three phrases being checked here:
- system memory
- memory oem
- oem retail

However, what I actually expected it to look like is the following:
- system memory
- memory oem
- memory retail

My understanding of the edismax parser is that it interprets the AND / OR
operators correctly, so it should generate the bi-gram phrases respecting
the AND / OR operators as well, right?

Am I missing something here?

Regards,
Niraj

On Mon, Jul 31, 2017 at 4:24 AM, Aman Tandon 
wrote:

> Hi Niraj,
>
> Should I expect it to check the following bigram phrases?
>
> Yes it will check.
>
> ex- documents & query is given below
>
> http://localhost:8983/solr/myfile/select?wt=xml=name;
> indent=on=*System
> AND Memory AND (OEM OR Retail)*=50=json&*qf=_text_=_text_*
> =true=edismax
>
> <result>
> <doc>
> <str name="name">A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM</str>
> </doc>
> <doc>
> <str name="name">CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail</str>
> </doc>
> <doc>
> <str name="name">CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail</str>
> </doc>
> </result>
>
> *Below is the parsed query*
>
> 
> +(+(_text_:system) +(_text_:memory) +((_text_:oem) (_text_:retail)))
> ((_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail"))
> 
>
> In case you are in a scenario where you need to know what query will be
> formed, you can use debug=true to learn more about the query and the
> timings of the different components.
>
> *And when ps2 is not specified, the default ps will be applied to pf2.*
>
> I hope this helps.
>
> With Regards
> Aman Tandon
>
> On Mon, Jul 31, 2017 at 4:18 AM, Niraj Aswani 
> wrote:
>
> > Hi,
> >
> > I am using Solr 4.4 and am a bit confused about how the edismax parser
> > treats the pf2 parameter when both the AND and OR operators are used in the
> > query with ps2=0
> >
> > For example:
> >
> > pf2=title^100
> > q=HDMI AND Video AND (Wire OR Cable)
> >
> > Should I expect it to check the following bigram phrases?
> >
> > hdmi video
> > video wire
> > video cable
> >
> > Regards
> > Niraj
> >
>


Restore fails - File missing

2017-07-31 Thread marotosg
Hi,

I am trying to do a backup and restore of one collection on SolrCloud version
6.1.0.
I tried a few times with no issues, but after indexing the collection
from scratch and doing a backup, I got an issue on the restore.

After the restore, the collection fails, complaining about one missing file,
"_6i.si". My cluster is composed of 1 shard and 3 servers, with that one shard
replicated. I tried to do the backup calling each server separately, but I
am getting the same error on restore.

I went and manually checked the 3 servers, and it looks like one of them has
that file but not the other two.

Just to try to solve the issue, I did an optimization of the index, and then
backup and restore succeeded.

Anyone has any idea what could be happening here?

Thanks,
Sergio



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Restore-fails-File-missing-tp4348332.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: HTTP ERROR 504 - Optimize

2017-07-31 Thread marotosg
Basically an issue with loadbalancer timeout.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/HTTP-ERROR-504-Optimize-tp4345815p4348330.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boost by Integer value on top of query

2017-07-31 Thread marotosg
Thanks a lot for the answer.
I finally achieved this using boost and the scale function on top of my query:
https://wiki.apache.org/solr/FunctionQuery#scale

Thanks to scale, no matter how big or small the values in my People and
Assignment columns are, I can map them to a value between 1 and 2.

{!boost b=sum(scale(PeopleTotal,1,2),scale(AssignmentsTotal,1,2))}



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boost-by-Integer-value-on-top-of-query-tp4346948p4348329.html
Sent from the Solr - User mailing list archive at Nabble.com.