Re: OOM script executed

2016-05-03 Thread Bastien Latard - MDPI AG

Hi Tomás,

Thank you for your email.
You said "have big caches or request big pages (e.g. 100k docs)"...
Does a fq cache all the potential results, or only the ones the query 
returns?

e.g.: select?q=*:*&fq=bPublic:true&rows=10

=> with this query, if I have 60 million public documents, would it 
cache 10 IDs or 60 million?
...and is the filter cache (from fq) kept in the OS cache or in the 
Java heap?
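For rough sizing of the filter cache question above: Solr's filterCache lives in the Java heap, and (an assumption worth verifying for your version) a cached entry for a non-sparse filter is a bitset with one bit per document in the index, independent of rows. A quick sketch of the arithmetic:

```python
# Rough filterCache sizing sketch. Assumption: a non-sparse cached
# filter entry is a bitset with one bit per document in the whole
# index (maxDoc bits), stored in the Java heap.
def filter_entry_bytes(max_doc: int) -> int:
    # one bit per doc, rounded up to whole bytes
    return (max_doc + 7) // 8

# 60 million documents -> 7,500,000 bytes (~7 MB) per cached fq,
# regardless of how many rows the query itself returns.
print(filter_entry_bytes(60_000_000))  # 7500000
```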


kr,
Bastien

On 04/05/2016 02:31, Tomás Fernández Löbbe wrote:

You could use some memory analyzer tools (e.g. jmap), that could give you a
hint. But if you are migrating, I'd start to see if you changed something
from the previous version, including jvm settings, schema/solrconfig.
If nothing is different, I'd try to identify which feature is consuming
more memory. If you use faceting/stats/suggester, or you have big caches or
request big pages (e.g. 100k docs) or use Solr Cell for extracting content,
those are some usual suspects. Try to narrow it down, it could be many
things. Turn on/off features as you look at the memory (you could use
something like jconsole/jvisualvm/jstat) and see when it spikes, compare
with the previous version. That's what I would do, at least.

If you get to narrow it down to a specific feature, then you can come back
to the users list and ask with some more specifics, that way someone could
point you to the solution, or maybe file a JIRA if it turns out to be a bug.

Tomás

On Mon, May 2, 2016 at 11:34 PM, Bastien Latard - MDPI AG <
lat...@mdpi.com.invalid> wrote:


Hi Tomás,

Thanks for your answer.
How could I see what's using memory?
I tried to add "-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/solr/logs/OOM_Heap_dump/"
...but this doesn't seem to be really helpful...

Kind regards,
Bastien


On 02/05/2016 22:55, Tomás Fernández Löbbe wrote:


You could, but before that I'd try to see what's using your memory and see
if you can decrease that. Maybe identify why you are running OOM now and
not with your previous Solr version (assuming you weren't, and that you
are
running with the same JVM settings). A bigger heap usually means more work
to the GC and less memory available for the OS cache.

Tomás

On Sun, May 1, 2016 at 11:20 PM, Bastien Latard - MDPI AG <
lat...@mdpi.com.invalid> wrote:

Hi Guys,

I've had the OOM script executed several times since I upgraded to Solr 6.0:

$ cat solr_oom_killer-8983-2016-04-29_15_16_51.log
Running OOM killer script for process 26044 for Solr on port 8983

Does it mean that I need to increase my JAVA Heap?
Or should I do anything else?

Here are some further logs:
$ cat solr_gc_log_20160502_0730:
}
{Heap before GC invocations=1674 (full 91):
   par new generation   total 1747648K, used 1747135K [0x0005c000,
0x00064000, 0x00064000)
eden space 1398144K, 100% used [0x0005c000,
0x00061556,
0x00061556)
from space 349504K,  99% used [0x00061556, 0x00062aa2fc30,
0x00062aab)
to   space 349504K,   0% used [0x00062aab, 0x00062aab,
0x00064000)
   concurrent mark-sweep generation total 6291456K, used 6291455K
[0x00064000, 0x0007c000, 0x0007c000)
   Metaspace   used 39845K, capacity 40346K, committed 40704K,
reserved
1085440K
class spaceused 4142K, capacity 4273K, committed 4368K, reserved
1048576K
2016-04-29T21:15:41.970+0200: 20356.359: [Full GC (Allocation Failure)
2016-04-29T21:15:41.970+0200: 20356.359: [CMS:
6291455K->6291456K(6291456K), 12.5694653 secs]
8038591K->8038590K(8039104K), [Metaspace: 39845K->39845K(1085440K)],
12.5695497 secs] [Times: user=12.57 sys=0.00, real=12.57 secs]


Kind regards,
Bastien




Kind regards,
Bastien Latard
Web engineer
--
MDPI AG
Postfach, CH-4005 Basel, Switzerland
Office: Klybeckstrasse 64, CH-4057
Tel. +41 61 683 77 35
Fax: +41 61 302 89 18
E-mail:
lat...@mdpi.com
http://www.mdpi.com/







Re: Include and exclude feature with multi valued fileds

2016-05-03 Thread Anil
Hi Ahmet,

Thanks for the response. Following are sample documents.

Doc 1 :

id : 1
customers : ["facebook", "google"]
issueId:1231
description: Some description

Doc2 :

id : 2
customers : ["twitter", "google"]
issueId:1231
description: Some description

Doc3 :

id : 3
customers : ["facebook", "amazon"]
issueId:1233
description: Some description

Query pattern : Get documents which include facebook as customer but not
google
Expected documents : id = 1, 3
Used Query : customers:"facebook" and -customers:"google"
Actual documents : id = 2

Please let me know if I have to change the query to see the expected
documents. Thanks.

Regards,
Anil
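One thing worth noting when reproducing this locally: Lucene/Solr boolean operators are case-sensitive, so the lowercase "and" in the query above is parsed as a term rather than an operator. Also, under a strict reading of "facebook but not google", doc 1 would be excluded because it also contains google. A small sketch emulating that boolean logic over the sample documents (values taken from the email):

```python
# Emulate +customers:facebook -customers:google over the sample docs:
# include docs whose multivalued "customers" field contains "facebook"
# but not "google".
docs = [
    {"id": 1, "customers": ["facebook", "google"]},
    {"id": 2, "customers": ["twitter", "google"]},
    {"id": 3, "customers": ["facebook", "amazon"]},
]

hits = [d["id"] for d in docs
        if "facebook" in d["customers"] and "google" not in d["customers"]]
print(hits)  # [3] -- doc 1 is excluded because it also has google
```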

On 3 May 2016 at 19:44, Ahmet Arslan  wrote:

> Can you provide us example documents? Which you want to match which you
> don't?
>
>
>
> On Tuesday, May 3, 2016 3:15 PM, Anil  wrote:
> Any inputs please ?
>
>
> On 2 May 2016 at 18:18, Anil  wrote:
>
> > HI,
> >
> > i have created a document with multi valued fields.
> >
> > Eg :
> > An issue is impacting multiple customers, products, versions etc.
> >
> > In my issue document, i have created customers, products, versions as
> > multi valued fields.
> >
> > how to find all issues that are impacting google (customer) but not
> > facebook (customer) ?
> >
> > Google and facebook can be part of in single issue document.
> >
> > Please let me know if you have any questions. Thanks.
> >
> > Regards,
> > Anil
> >
> >
> >
> >
> >
> >
> >
> >
> >
>


Re: SOLR edismax and mm request parameter

2016-05-03 Thread Ahmet Arslan
Hi Mark,

You could do something like this:

_query_:{!dismax qf='field1' mm='100%' v=$qq} 
OR 
_query_:{!dismax qf='field2' mm='100%' v=$qq}
OR
_query_:{!dismax qf='field3' mm='100%' v=$qq}


https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries

Ahmet
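A sketch of sending such a request from a client, with the user's words passed once via the qq parameter (host, collection, and field names are placeholders; the nested queries are wrapped in double quotes here so each local-params expression parses as a single clause):

```python
from urllib.parse import urlencode

# Build the request params for the nested-dismax approach above.
# Host, collection, and field names are placeholders.
q = ('_query_:"{!dismax qf=field1 mm=100% v=$qq}" OR '
     '_query_:"{!dismax qf=field2 mm=100% v=$qq}" OR '
     '_query_:"{!dismax qf=field3 mm=100% v=$qq}"')
params = urlencode({"q": q, "qq": "blue stainless washer", "wt": "json"})
url = "http://localhost:8983/solr/collection1/select?" + params
print(url)
```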



On Wednesday, May 4, 2016 4:59 AM, Mark Robinson  
wrote:
Hi,
On further checking, I could identify that *blue* is indeed appearing in one
of the qf fields. My bad!

Could someone please help me with the 2nd question.

Thanks!
Mark.



On Tue, May 3, 2016 at 8:03 PM, Mark Robinson 
wrote:

> Hi,
>
> I made a typo err in the prev mail for my first question when I listed the
> query terms.
> Let me re-type both questions here once again pls.
> Sorry for any inconvenience.
>
> 1.
> My understanding of the mm parameter related to edismax is that,
> if mm=100%,  only if ALL my query terms appear across any of the qf fields
> will I get back
> documents ... ie all the terms *need not be present in one single field* ..
> they just need to be present across any of the fields in my qf list.
>
> But my query for  the terms:-
> *blue stainless washer*
> ... returns a document which has *Stainless Washer *in one of my qf
> fields, but *blue *is not there in any of the qf fields. Then how did it
> get returned even though I had given mm=100% (100%25 when I typed directly
> in browser). Any suggestions please.. In fact this is my first record!
>
> 2.
> Another question I have is:-
> With edismax can I enforce that all my query terms should appear in ANY of
> my qf fields to qualify as a result document? I know all terms appearing in
> a single field can give a boost if we use the "pf" query parameter
> accordingly. But how can I insist that to qualify as a result, the doc
> should have ALL of my query term in one or more of the qf fields?
>
>
> Cld some one pls help.
>
> Thanks!
>
> Mark
>
> On Tue, May 3, 2016 at 6:28 PM, Mark Robinson 
> wrote:
>
>> Hi,
>>
>> 1.
>> My understanding of the mm parameter related to edismax is that,
>> if mm=100%,  only if ALL my query terms appear across any of the qf
>> fields will I get back
>> documents ... ie all the terms *need not be present in one single field*
>> .. they just need to be present across any of the fields in my qf list.
>>
>> But my query for  the terms:-
>> *blue stainless washer*
>> ... returns a document which has *Stainless Washer *in one of my qf
>> fields, but *refrigerator *is not there in any of the qf fields. Then
>> how did it get returned even though I had given mm=100% (100%25 when I
>> typed directly in browser). Any suggestions please.
>>
>> 2.
>> Another question I have is:-
>> With edismax can I enforce that all my query terms should appear in ANY
>> of my qf fields to qualify as a result document? I know all terms appearing
>> in a single field can give a boost if we use the "pf" query parameter
>> accordingly. But how can I insist that to qualify as a result, the doc
>> should have ALL of my query term in one or more of the qf fields?
>>
>>
>> Cld some one pls help.
>>
>> Thanks!
>> Mark.
>>
>
>


Re: BlockJoinFacetComponent on solr 4.10

2016-05-03 Thread tkg_cangkul

hi shawn, thx for your reply.

I've tried to remove that searchComponent from solrconfig.xml, and yes, it
works when installing casebox, but I don't know what changes in casebox
when I don't include that class.

I will try to use a version of casebox from before March 15th, as you
suggested.

Thanks again for your help, Shawn.
:cheers

On 03/05/16 22:21, Shawn Heisey wrote:

On 5/2/2016 2:59 AM, tkg_cangkul wrote:

hi, I want to ask a question about using

 BlockJoinFacetComponent

in Solr 4. How can I use that class on Solr 4.10.4?
I want to install casebox with Solr 4.10.4, but I have this error:

[error screenshot/stack trace lost in archiving]

When I check in solr-core-4.10.4.jar there is no BlockJoinFacetComponent
class; I found it in solr-core-6.0.0.jar. Is there any way for me to
use it on Solr 4?
pls help

The BlockJoinFacetComponent class was introduced in Solr 5.5.

https://issues.apache.org/jira/browse/SOLR-5743

The error you are seeing is because of this text in the solrconfig.xml
provided with casebox:




Removing that XML and any other XML where that component is referenced
would fix the error, but I would imagine that casebox would not
work correctly after the change.

I do not know if you can copy the class from a newer version into the
4.10 source code.  From what I've seen, Solr code changes so fast from
release to release that you would probably run into big problems trying
to adapt a plugin that's intended for 5.5 and newer to a 4.x version.
Even if you copy the code for the missing class, that code probably
depends on other changes and additions that have happened in the many
releases between 4.10.4 and 5.5.

Your best bet is to talk to the casebox people.  They can tell you
exactly what is required for their software, and whether or not they are
willing to support such an old version of Solr.  If I had to guess, they
probably *can't* support older Solr in their newest versions.  Their
software probably relies on capability (like BlockJoinFacetComponent)
only available in newer Solr versions.

If you were to use a version of casebox before March 15th, it *might*
work.  This commit was made on that date, and among other changes, added
BlockJoinFacetComponent to solrconfig.xml:

https://github.com/KETSE/casebox/commit/3fdc48564efd35e8a0229cb86bdab0ad4f493511

Thanks,
Shawn





Re: SOLR edismax and mm request parameter

2016-05-03 Thread Mark Robinson
Hi,
On further checking, I could identify that *blue* is indeed appearing in one
of the qf fields. My bad!

Could someone please help me with the 2nd question.

Thanks!
Mark.


On Tue, May 3, 2016 at 8:03 PM, Mark Robinson 
wrote:

> Hi,
>
> I made a typo err in the prev mail for my first question when I listed the
> query terms.
> Let me re-type both questions here once again pls.
> Sorry for any inconvenience.
>
> 1.
> My understanding of the mm parameter related to edismax is that,
> if mm=100%,  only if ALL my query terms appear across any of the qf fields
> will I get back
> documents ... ie all the terms *need not be present in one single field* ..
> they just need to be present across any of the fields in my qf list.
>
> But my query for  the terms:-
> *blue stainless washer*
> ... returns a document which has *Stainless Washer *in one of my qf
> fields, but *blue *is not there in any of the qf fields. Then how did it
> get returned even though I had given mm=100% (100%25 when I typed directly
> in browser). Any suggestions please.. In fact this is my first record!
>
> 2.
> Another question I have is:-
> With edismax can I enforce that all my query terms should appear in ANY of
> my qf fields to qualify as a result document? I know all terms appearing in
> a single field can give a boost if we use the "pf" query parameter
> accordingly. But how can I insist that to qualify as a result, the doc
> should have ALL of my query term in one or more of the qf fields?
>
>
> Cld some one pls help.
>
> Thanks!
>
> Mark
>
> On Tue, May 3, 2016 at 6:28 PM, Mark Robinson 
> wrote:
>
>> Hi,
>>
>> 1.
>> My understanding of the mm parameter related to edismax is that,
>> if mm=100%,  only if ALL my query terms appear across any of the qf
>> fields will I get back
>> documents ... ie all the terms *need not be present in one single field*
>> .. they just need to be present across any of the fields in my qf list.
>>
>> But my query for  the terms:-
>> *blue stainless washer*
>> ... returns a document which has *Stainless Washer *in one of my qf
>> fields, but *refrigerator *is not there in any of the qf fields. Then
>> how did it get returned even though I had given mm=100% (100%25 when I
>> typed directly in browser). Any suggestions please.
>>
>> 2.
>> Another question I have is:-
>> With edismax can I enforce that all my query terms should appear in ANY
>> of my qf fields to qualify as a result document? I know all terms appearing
>> in a single field can give a boost if we use the "pf" query parameter
>> accordingly. But how can I insist that to qualify as a result, the doc
>> should have ALL of my query term in one or more of the qf fields?
>>
>>
>> Cld some one pls help.
>>
>> Thanks!
>> Mark.
>>
>
>


Re: Parallel SQL Interface returns "java.lang.NullPointerException" after reloading collection

2016-05-03 Thread Ryan Yacyshyn
Thanks Joel! :)

On Tue, 3 May 2016, 23:37 Joel Bernstein,  wrote:

> Ryan, there is a patch (for the master branch) up on SOLR-9059 that
> resolves the issue. This will be in 6.1 and 6.0.1 if there is one. Thanks
> for the bug report!
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, May 3, 2016 at 10:41 AM, Joel Bernstein 
> wrote:
>
> > I opened SOLR-9059.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Tue, May 3, 2016 at 10:31 AM, Joel Bernstein 
> > wrote:
> >
> >> What I believe is happening is that the core is closing on the reload,
> >> which is triggering the closeHook and shutting down all the connections
> in
> >> SolrClientCache.
> >>
> >> When the core reopens the connections are all still closed because the
> >> SolrClientCache is instantiated statically with the creation of the
> >> StreamHandler.
> >>
> >> So I think the correct fix is to create the SolrClientCache in inform(),
> >> that way it will get recreated with each reload. As long as the
> closeHook
> >> has closed the existing SolrClientCache this shouldn't cause any
> connection
> >> leaks with reloads.
> >>
> >>
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >> On Tue, May 3, 2016 at 10:01 AM, Joel Bernstein 
> >> wrote:
> >>
> >>> I'll look into this today.
> >>>
> >>> Joel Bernstein
> >>> http://joelsolr.blogspot.com/
> >>>
> >>> On Tue, May 3, 2016 at 9:22 AM, Kevin Risden <
> risd...@avalonconsult.com>
> >>> wrote:
> >>>
>  What I think is happening is that since the CloudSolrClient is from
> the
>  SolrCache and the collection was reloaded. zkStateReader is actually
>  null
>  since there was no cloudSolrClient.connect() call after the reload. I
>  think
>  that would cause the NPE on anything that uses the zkStateReader like
>  getClusterState().
> 
>  ZkStateReader zkStateReader = cloudSolrClient.getZkStateReader();
>  ClusterState clusterState = zkStateReader.getClusterState();
> 
> 
>  Kevin Risden
>  Apache Lucene/Solr Committer
>  Hadoop and Search Tech Lead | Avalon Consulting, LLC
>  
>  M: 732 213 8417
>  LinkedIn  |
>  Google+
>   | Twitter
>  
> 
> 
> 
> -
>  This message (including any attachments) contains confidential
>  information
>  intended for a specific individual and purpose, and is protected by
>  law. If
>  you are not the intended recipient, you should delete this message.
> Any
>  disclosure, copying, or distribution of this message, or the taking of
>  any
>  action based on it, is strictly prohibited.
> 
>  On Mon, May 2, 2016 at 9:58 PM, Joel Bernstein 
>  wrote:
> 
>  > Looks like the loop below is throwing a Null pointer. I suspect the
>  > collection has not yet come back online. In theory this should be
> self
>  > healing and when the collection comes back online it should start
>  working
>  > again. If not then that would be a bug.
>  >
>  > for(String col : clusterState.getCollections()) {
>  >
>  >
>  > Joel Bernstein
>  > http://joelsolr.blogspot.com/
>  >
>  > On Mon, May 2, 2016 at 10:06 PM, Ryan Yacyshyn <
>  ryan.yacys...@gmail.com>
>  > wrote:
>  >
>  > > Yes stack trace can be found here:
>  > >
>  > > http://pastie.org/10821638
>  > >
>  > >
>  > >
>  > > On Mon, 2 May 2016 at 01:05 Joel Bernstein 
>  wrote:
>  > >
>  > > > Can you post your stack trace? I suspect this has to do with how
>  the
>  > > > Streaming API is interacting with SolrCloud. We can probably
> also
>  > create
>  > > a
>  > > > jira ticket for this.
>  > > >
>  > > > Joel Bernstein
>  > > > http://joelsolr.blogspot.com/
>  > > >
>  > > > On Sun, May 1, 2016 at 4:02 AM, Ryan Yacyshyn <
>  ryan.yacys...@gmail.com
>  > >
>  > > > wrote:
>  > > >
>  > > > > Hi all,
>  > > > >
>  > > > > I'm exploring with parallel SQL queries and found something
>  strange
>  > > after
>  > > > > reloading the collection: the same query will return a
>  > > > > java.lang.NullPointerException error. Here are my steps on a
>  fresh
>  > > > install
>  > > > > of Solr 6.0.0.
>  > > > >
>  > > > > *Start Solr in cloud mode with example*
>  > > > > bin/solr -e cloud -noprompt
>  > > > >
>  > > > > *Index some data*
>  > > > > bin/post -c gettingstarted example/exampledocs/*.xml
>  > > > >
>  > > > > *Send query, which 

Re: OOM script executed

2016-05-03 Thread Tomás Fernández Löbbe
You could use some memory analyzer tools (e.g. jmap), that could give you a
hint. But if you are migrating, I'd start to see if you changed something
from the previous version, including jvm settings, schema/solrconfig.
If nothing is different, I'd try to identify which feature is consuming
more memory. If you use faceting/stats/suggester, or you have big caches or
request big pages (e.g. 100k docs) or use Solr Cell for extracting content,
those are some usual suspects. Try to narrow it down, it could be many
things. Turn on/off features as you look at the memory (you could use
something like jconsole/jvisualvm/jstat) and see when it spikes, compare
with the previous version. That's what I would do, at least.

If you get to narrow it down to a specific feature, then you can come back
to the users list and ask with some more specifics, that way someone could
point you to the solution, or maybe file a JIRA if it turns out to be a bug.

Tomás

On Mon, May 2, 2016 at 11:34 PM, Bastien Latard - MDPI AG <
lat...@mdpi.com.invalid> wrote:

> Hi Tomás,
>
> Thanks for your answer.
> How could I see what's using memory?
> I tried to add "-XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=/var/solr/logs/OOM_Heap_dump/"
> ...but this doesn't seem to be really helpful...
>
> Kind regards,
> Bastien
>
>
> On 02/05/2016 22:55, Tomás Fernández Löbbe wrote:
>
>> You could, but before that I'd try to see what's using your memory and see
>> if you can decrease that. Maybe identify why you are running OOM now and
>> not with your previous Solr version (assuming you weren't, and that you
>> are
>> running with the same JVM settings). A bigger heap usually means more work
>> to the GC and less memory available for the OS cache.
>>
>> Tomás
>>
>> On Sun, May 1, 2016 at 11:20 PM, Bastien Latard - MDPI AG <
>> lat...@mdpi.com.invalid> wrote:
>>
>> Hi Guys,
>>>
>>> I got several times the OOM script executed since I upgraded to Solr6.0:
>>>
>>> $ cat solr_oom_killer-8983-2016-04-29_15_16_51.log
>>> Running OOM killer script for process 26044 for Solr on port 8983
>>>
>>> Does it mean that I need to increase my JAVA Heap?
>>> Or should I do anything else?
>>>
>>> Here are some further logs:
>>> $ cat solr_gc_log_20160502_0730:
>>> }
>>> {Heap before GC invocations=1674 (full 91):
>>>   par new generation   total 1747648K, used 1747135K [0x0005c000,
>>> 0x00064000, 0x00064000)
>>>eden space 1398144K, 100% used [0x0005c000,
>>> 0x00061556,
>>> 0x00061556)
>>>from space 349504K,  99% used [0x00061556, 0x00062aa2fc30,
>>> 0x00062aab)
>>>to   space 349504K,   0% used [0x00062aab, 0x00062aab,
>>> 0x00064000)
>>>   concurrent mark-sweep generation total 6291456K, used 6291455K
>>> [0x00064000, 0x0007c000, 0x0007c000)
>>>   Metaspace   used 39845K, capacity 40346K, committed 40704K,
>>> reserved
>>> 1085440K
>>>class spaceused 4142K, capacity 4273K, committed 4368K, reserved
>>> 1048576K
>>> 2016-04-29T21:15:41.970+0200: 20356.359: [Full GC (Allocation Failure)
>>> 2016-04-29T21:15:41.970+0200: 20356.359: [CMS:
>>> 6291455K->6291456K(6291456K), 12.5694653 secs]
>>> 8038591K->8038590K(8039104K), [Metaspace: 39845K->39845K(1085440K)],
>>> 12.5695497 secs] [Times: user=12.57 sys=0.00, real=12.57 secs]
>>>
>>>
>>> Kind regards,
>>> Bastien
>>>
>>>
>>>
> Kind regards,
> Bastien Latard
> Web engineer
> --
> MDPI AG
> Postfach, CH-4005 Basel, Switzerland
> Office: Klybeckstrasse 64, CH-4057
> Tel. +41 61 683 77 35
> Fax: +41 61 302 89 18
> E-mail:
> lat...@mdpi.com
> http://www.mdpi.com/
>
>


Re: SOLR edismax and mm request parameter

2016-05-03 Thread Mark Robinson
Hi,

I made a typo in the previous mail, in my first question, when I listed the
query terms.
Let me re-type both questions here once again.
Sorry for any inconvenience.

1.
My understanding of the mm parameter related to edismax is that,
if mm=100%,  only if ALL my query terms appear across any of the qf fields
will I get back
documents ... ie all the terms *need not be present in one single field* ..
they just need to be present across any of the fields in my qf list.

But my query for  the terms:-
*blue stainless washer*
... returns a document which has *Stainless Washer *in one of my qf fields,
but *blue *is not there in any of the qf fields. Then how did it get
returned even though I had given mm=100% (100%25 when I typed directly in
browser). Any suggestions please.. In fact this is my first record!

2.
Another question I have is:-
With edismax can I enforce that all my query terms should appear in ANY of
my qf fields to qualify as a result document? I know all terms appearing in
a single field can give a boost if we use the "pf" query parameter
accordingly. But how can I insist that, to qualify as a result, the doc
should have ALL of my query terms in one or more of the qf fields?


Could someone please help.

Thanks!

Mark
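For what it's worth, this matches how edismax builds the query: each term becomes a disjunction across the qf fields, and mm controls how many of those per-term clauses must match, so with mm=100% every term must appear in at least one qf field, though not necessarily the same one. A toy emulation (field names and values are illustrative only):

```python
# Toy emulation of edismax mm over qf fields: each query term must
# match at least one field; mm=100% means every term must match
# somewhere, but not necessarily in the same field.
def matches_mm100(query_terms, doc_fields):
    return all(
        any(term in field_text.lower().split()
            for field_text in doc_fields.values())
        for term in query_terms
    )

doc = {"title": "Stainless Washer", "color": "blue"}
print(matches_mm100(["blue", "stainless", "washer"], doc))  # True
print(matches_mm100(["red", "stainless", "washer"], doc))   # False
```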

On Tue, May 3, 2016 at 6:28 PM, Mark Robinson 
wrote:

> Hi,
>
> 1.
> My understanding of the mm parameter related to edismax is that,
> if mm=100%,  only if ALL my query terms appear across any of the qf fields
> will I get back
> documents ... ie all the terms *need not be present in one single field*
> .. they just need to be present across any of the fields in my qf list.
>
> But my query for  the terms:-
> *blue stainless washer*
> ... returns a document which has *Stainless Washer *in one of my qf
> fields, but *refrigerator *is not there in any of the qf fields. Then how
> did it get returned even though I had given mm=100% (100%25 when I typed
> directly in browser). Any suggestions please.
>
> 2.
> Another question I have is:-
> With edismax can I enforce that all my query terms should appear in ANY of
> my qf fields to qualify as a result document? I know all terms appearing in
> a single field can give a boost if we use the "pf" query parameter
> accordingly. But how can I insist that to qualify as a result, the doc
> should have ALL of my query term in one or more of the qf fields?
>
>
> Cld some one pls help.
>
> Thanks!
> Mark.
>


Re: Results of facet differs with change in facet.limit.

2016-05-03 Thread Erick Erickson
Hmm, I'd be interested what you get if you restrict your
queries to individual shards using distrib=false. This
will go to the individual shard you address and no others.

Does the facet count change in those circumstances?

Best,
Erick
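A sketch of that per-shard check, building one non-distributed request per shard core (host, core names, field name, and term are placeholders based on the 12-shard setup described below):

```python
from urllib.parse import urlencode

# Build one non-distributed facet query per shard core; core names
# and the field "f" are placeholders.
shards = [f"collection1_shard{i}_replica1" for i in range(1, 13)]
params = urlencode({
    "q": "text_field:term",
    "facet": "true",
    "facet.field": "f",
    "facet.limit": 200,
    "distrib": "false",   # restrict to the one core addressed
    "rows": 0,
})
urls = [f"http://localhost:8983/solr/{core}/select?{params}"
        for core in shards]
print(urls[0])
```

Summing a value's per-shard counts and comparing against the distributed facet response shows whether the refinement phase is where counts diverge.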

On Tue, May 3, 2016 at 4:48 AM, Modassar Ather  wrote:
> I tried to reproduce the same issue with a field of following type but
> could not.
>  stored="false" omitNorms="true"/>
>
> Please share your inputs.
>
> Best,
> Modassar
>
> On Tue, May 3, 2016 at 10:32 AM, Modassar Ather 
> wrote:
>
>> Hi,
>>
>> Kindly share your inputs on this issue.
>>
>> Thanks,
>> Modassar
>>
>> On Mon, May 2, 2016 at 3:53 PM, Modassar Ather 
>> wrote:
>>
>>> Hi,
>>>
>>> I have a field f which is defined as follows on solr 5.x. It is 12 shard
>>> cluster with no replica.
>>>
>>> >> stored="false" indexed="false" docValues="true"/>
>>>
>>> When I facet on this field with different facet.limit I get different
>>> facet count.
>>>
>>> E.g.
>>> Query : q=text_field:term&facet.field=f&facet.limit=100
>>> Result :
>>> 1225
>>> 1082
>>> 1076
>>>
>>> Query : q=text_field:term&facet.field=f&facet.limit=200
>>> 1366
>>> 1321
>>> 1315
>>>
>>> I am noticing lesser document in facets whereas the numFound during
>>> search is more. Please refer to following query for details.
>>>
>>> Query : q=text_field:term&facet.field=f
>>> Result :
>>> 1225
>>> 1082
>>> 1076
>>>
>>> Query : text_field:term AND f:val1
>>> Result: numFound=1366
>>>
>>> Kindly help me understand this behavior or let me know if it is an issue.
>>>
>>> Thanks,
>>> Modassar
>>>
>>
>>


Re: solr.StrField or solr.StringField?

2016-05-03 Thread Erick Erickson
Not a typo any more, Shawn fixed it today.




On Tue, May 3, 2016 at 1:34 PM, Jack Krupansky  wrote:
> Yeah, that's a typo. The same typo is in the official Solr Reference Guide:
> https://cwiki.apache.org/confluence/display/solr/Putting+the+Pieces+Together
>
> [ATTN: Solr Ref Guide Team!]
>
>
> -- Jack Krupansky
>
> On Tue, May 3, 2016 at 4:14 PM, John Bickerstaff 
> wrote:
>
>> I'm assuming it's another "class" or data type that someone built - but I'm
>> afraid I don't know any more than that.
>>
>> An alternative possibility (supported by at least one of the links on that
>> page you linked) is that it's just a typo -- people typing quickly and
>> forgetting the exact (truncated) spelling of the field.
>>
>> In that case, it's talking about using it for faceting and IIRC, you want a
>> non-analyzed field for that - preserve it exactly as it is for facet
>> queries -- that suggests to me that the author actually meant StrField
>>
>> 
>>
>> I might want to index the same data differently in three different fields
>> (perhaps using the Solr copyField
>>  directive):
>>
>>- For searching: Tokenized, case-folded, punctuation-stripped:
>>   - schildt / herbert / wolpert / lewis / davies / p
>>- For sorting: Untokenized, case-folded, punctuation-stripped:
>>   - schildt herbert wolpert lewis davies p
>>-
>>
>>For faceting: Primary author only, using a solr.StringField:
>>- Schildt, Herbert
>>
>> Then when the user drills down on the "Schildt, Herbert" string I would
>> reissue the query with an added fq=author:"Schild, Herbert" parameter.
>>
>> On Tue, May 3, 2016 at 2:01 PM, Steven White  wrote:
>>
>> > Thanks John.
>> >
>> > Yes, the out-of-the-box schema.xml does not have solr.StringField.
>> > However, a number of Solr pages on the web mention solr.StringField [1]
>> and
>> > thus I'm not sure if that's a typo, a real thing and such it is missing
>> > from the official Solr wiki's.
>> >
>> > Steve
>> >
>> > [1] https://wiki.apache.org/solr/SolrFacetingOverview,
>> >
>> >
>> http://grokbase.com/t/lucene/solr-commits/06cw5038rk/solr-wiki-update-of-solrfacetingoverview-by-jjlarrea
>> > ,
>> >
>> > On Tue, May 3, 2016 at 3:35 PM, John Bickerstaff <
>> j...@johnbickerstaff.com
>> > >
>> > wrote:
>> >
>> > > My default schema.xml does not have an entry for solr.StringField so I
>> > > can't tell you what that one does.
>> > >
>> > > If you look for solr.StrField in the schema.xml file, you'll get some
>> > idea
>> > > of how it's defined.  The default setting is for it not to be analyzed.
>> > >
>> > > On Tue, May 3, 2016 at 10:16 AM, Steven White 
>> > > wrote:
>> > >
>> > > > Hi Everyone,
>> > > >
>> > > > Is solr.StrField and solr.StringField the same thing?
>> > > >
>> > > > Thanks in advanced!
>> > > >
>> > > > Steve
>> > > >
>> > >
>> >
>>


Re: [Installation] Solr log directory

2016-05-03 Thread John Bickerstaff
Hoss - I'm guessing this is all in the install script that gets created
when you run that command (can't remember it) on the tar.gz file...

In other words, Yunee can edit that file, find those variables (like
SOLR_SERVICE) and change them from what they're set to by default to
whatever he wants...

On Tue, May 3, 2016 at 4:31 PM, Chris Hostetter 
wrote:

>
> : I have a question for installing solr server. Using '
> : install_solr_service.sh' with option -d , the solr home directory can be
> : set. But the default log directory is under $SOLR_HOME/logs.
> :
> : Is it possible to specify the logs directory separately from solr home
> directory during installation?
>
install_solr_service.sh doesn't do anything special as far as where logs
> should live  -- it just writes out a (default) "/etc/default/$
> SOLR_SERVICE.in.sh" (if
> it doesn't already exist) that specifies a (default) log directory for
> solr to use once the service starts
>
> you are absolutely expected to overwrite that "$SOLR_SERVICE.in.sh" file
with your own specific settings -- in fact you *must* in order to configure things
> like ZooKeeper or SSL -- after the installation script finishes, and you
> are welcome to change the SOLR_LOGS_DIR setting to anything you want.
>
>
>
> -Hoss
> http://www.lucidworks.com/
>


Re: Phrases and edismax

2016-05-03 Thread Mark Robinson
Sorry Erick.
I will check and raise it under SOLR project.

Thanks!
Mark.

On Mon, May 2, 2016 at 11:43 PM, Erick Erickson 
wrote:

> Mark:
>
> KYLIN-1644? This should be SOLR-. I suspect you entered the JIRA
> in the wrong Apache project.
>
>
> Erick
>
> On Mon, May 2, 2016 at 8:05 PM, Mark Robinson 
> wrote:
> > Hi Eric,
> >
> > I have raised a JIRA:-   *KYLIN-1644*   with the problem mentioned.
> >
> > Thanks!
> > Mark.
> >
> > On Sun, May 1, 2016 at 5:25 PM, Mark Robinson 
> > wrote:
> >
> >> Thanks much Eric for checking in detail.
> >> Yes I found the first term being left out in pf.
> >> Because of that I had some cases where a couple of unwanted records came
> >> in the results with higher priority than the normal ones. When I checked
> >> they matched from the 2nd term onwards.
> >>
> >> As suggested I wud raise a  JIRA.
> >>
> >> Thanks!
> >> Mark
> >>
> >> On Sat, Apr 30, 2016 at 1:20 PM, Erick Erickson <
> erickerick...@gmail.com>
> >> wrote:
> >>
> >>> Looks like a bug in edismax to me when you field-qualify
> >>> the terms.
> >>>
> >>> As an aside, there's no need to specify the field when you only
> >>> want it to go against the fields defined in "qf" and "pf" etc. And,
> >>> that's a work-around for this particular case. But still:
> >>>
> >>> So here's what I get on 5x:
> >>> q=(erick men truck)&defType=edismax&qf=name&pf=name
> >>> correctly returns:
> >>> "+((name:erick) (name:men) (name:truck)) (name:"erick men truck")",
> >>>
> >>> But,
> >>> q=name:(erick men truck)&defType=edismax&qf=name&pf=name
> >>> incorrectly returns:
> >>> "+(name:erick name:men name:truck) (name:"men truck")",
> >>>
> >>> And this:
> >>> q=name:(erick men truck)&defType=edismax&qf=name&pf=features
> >>> incorrectly gives this.
> >>>
> >>> "+(name:erick name:men name:truck) (features:"men truck")",
> >>>
> >>> Confusingly, the terms (with "erick" left out, strike 1)
> >>> goes against the pf field even though it's fully qualified against the
> >>> name field. Not entirely sure whether this is intended or not frankly.
> >>>
> >>> Please go ahead and raise a JIRA.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Fri, Apr 29, 2016 at 7:55 AM, Mark Robinson <
> mark123lea...@gmail.com>
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > q=productType:(two piece bathtub white)
> >>> > &defType=edismax&pf=productType^20.0&qf=productType^15.0
> >>> >
> >>> > In the debug section this is what I see:-
> >>> > 
> >>> > (+(productType:two productType:piec productType:bathtub
> >>> productType:white)
> >>> > DisjunctionMaxQuery((productType:"piec bathtub
> white"^20.0)))/no_coord
> >>> > 
> >>> >
> >>> > My question is related to the "pf" (phrases) section of edismax.
> >>> > As shown in the debug section why is the phrase taken as "piec
> bathtub
> >>> > white". Why is the first word "two" not considered in the phrase
> fields
> >>> > section.
> >>> > I am looking for queries with the words "two piece bathtub white"
> being
> >>> > together to be boosted and not "piece bathtub white" only to be
> boosted.
> >>> >
> >>> > Could some one help me understand what I am missing?
> >>> >
> >>> > Thanks!
> >>> > Mark
> >>>
> >>
> >>
>


Re: [Installation] Solr log directory

2016-05-03 Thread Chris Hostetter

: I have a question for installing solr server. Using ' 
: install_solr_service.sh' with option -d , the solr home directory can be 
: set. But the default log directory is under $SOLR_HOME/logs.
: 
: Is it possible to specify the logs directory separately from solr home 
directory during installation? 

install_solr_service.sh doesn't do anything special as far as where logs 
should live -- it just writes out a (default) 
"/etc/default/$SOLR_SERVICE.in.sh" (if 
it doesn't already exist) that specifies a (default) log directory for 
Solr to use once the service starts.

you are absolutely expected to overwrite that "$SOLR_SERVICE.in.sh" file 
with your own specific settings -- in fact you *must* in order to configure 
things like ZooKeeper or SSL -- after the installation script finishes, and 
you are welcome to change the SOLR_LOGS_DIR setting to anything you want.
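For example, a minimal override in that file might look like this (the path is illustrative, not a requirement):

```shell
# /etc/default/solr.in.sh -- settings here override the installer's defaults
SOLR_LOGS_DIR=/var/solr/logs
echo "Solr will write its logs under: $SOLR_LOGS_DIR"
```

The service picks the new directory up on its next start; nothing needs to be reinstalled.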



-Hoss
http://www.lucidworks.com/


SOLR edismax and mm request parameter

2016-05-03 Thread Mark Robinson
Hi,

1.
My understanding of the mm parameter related to edismax is that,
if mm=100%,  only if ALL my query terms appear across any of the qf fields
will I get back
documents ... ie all the terms *need not be present in one single field* ..
they just need to be present across any of the fields in my qf list.

But my query for the terms:-
*blue stainless washer*
... returns a document which has *Stainless Washer* in one of my qf fields,
but *blue* is not there in any of the qf fields. Then how did it
get returned even though I had given mm=100% (100%25 when I typed directly
in the browser)? Any suggestions please.
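As an aside on the 100%25 point: the literal percent sign in mm=100% has to be URL-encoded when the parameter is typed into a browser or passed to curl by hand. A quick sketch of the encoding:

```shell
# "%" is reserved in URLs, so mm=100% must be sent on the wire as mm=100%25
mm="100%"
encoded=$(printf '%s' "$mm" | sed 's/%/%25/g')
echo "mm=$encoded"
```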

2.
Another question I have is:-
With edismax can I enforce that all my query terms should appear in ANY of
my qf fields to qualify as a result document? I know all terms appearing in
a single field can give a boost if we use the "pf" query parameter
accordingly. But how can I insist that to qualify as a result, the doc
should have ALL of my query terms in one or more of the qf fields?


Could someone please help.

Thanks!
Mark.


RE: [Installation] Solr log directory

2016-05-03 Thread Yunee Lee
That's correct.
But I want to separate solr home directory and log directory. 
For example,  solr_home = /server/solr
solr_logs = /var/solr/logs

-Original Message-
From: John Bickerstaff [mailto:j...@johnbickerstaff.com] 
Sent: Tuesday, May 3, 2016 6:16 PM
To: solr-user@lucene.apache.org
Subject: Re: [Installation] Solr log directory

I think you should be able to change $SOLR_HOME to any valid path.

For example: /var/logs/solr_logs

On Tue, May 3, 2016 at 4:02 PM, Yunee Lee  wrote:

> Hi, solr experts.
>
> I have a question for installing solr server.
> Using ' install_solr_service.sh' with option -d , the solr home 
> directory can be set. But the default log directory is under $SOLR_HOME/logs.
>
> Is it possible to specify the logs directory separately from solr home 
> directory during installation?
>
> Thank you for your help.
>
> -Y
>
>
>


Re: [Installation] Solr log directory

2016-05-03 Thread John Bickerstaff
I think you should be able to change $SOLR_HOME to any valid path.

For example: /var/logs/solr_logs

On Tue, May 3, 2016 at 4:02 PM, Yunee Lee  wrote:

> Hi, solr experts.
>
> I have a question for installing solr server.
> Using ' install_solr_service.sh' with option -d , the solr home directory
> can be set. But the default log directory is under $SOLR_HOME/logs.
>
> Is it possible to specify the logs directory separately from solr home
> directory during installation?
>
> Thank you for your help.
>
> -Y
>
>
>


[Installation] Solr log directory

2016-05-03 Thread Yunee Lee
Hi, solr experts.

I have a question for installing solr server. 
Using ' install_solr_service.sh' with option -d , the solr home directory can 
be set. But the default log directory is under $SOLR_HOME/logs.

Is it possible to specify the logs directory separately from solr home 
directory during installation? 

Thank you for your help.

-Y




Re: solr.StrField or solr.StringField?

2016-05-03 Thread Jack Krupansky
Yeah, that's a typo. The same typo is in the official Solr Reference Guide:
https://cwiki.apache.org/confluence/display/solr/Putting+the+Pieces+Together

[ATTN: Solr Ref Guide Team!]


-- Jack Krupansky

On Tue, May 3, 2016 at 4:14 PM, John Bickerstaff 
wrote:

> I'm assuming it's another "class" or data type that someone built - but I'm
> afraid I don't know any more than that.
>
> An alternative possibility (supported by at least one of the links on that
> page you linked) is that it's just a typo -- people typing quickly and
> forgetting the exact (truncated) spelling of the field.
>
> In that case, it's talking about using it for faceting and IIRC, you want a
> non-analyzed field for that - preserve it exactly as it is for facet
> queries -- that suggests to me that the author actually meant StrField
>
> 
>
> I might want to index the same data differently in three different fields
> (perhaps using the Solr copyField
>  directive):
>
>- For searching: Tokenized, case-folded, punctuation-stripped:
>   - schildt / herbert / wolpert / lewis / davies / p
>- For sorting: Untokenized, case-folded, punctuation-stripped:
>   - schildt herbert wolpert lewis davies p
>-
>
>For faceting: Primary author only, using a solr.StringField:
>- Schildt, Herbert
>
> Then when the user drills down on the "Schildt, Herbert" string I would
> reissue the query with an added fq=author:"Schildt, Herbert" parameter.
>
> On Tue, May 3, 2016 at 2:01 PM, Steven White  wrote:
>
> > Thanks John.
> >
> > Yes, the out-of-the-box schema.xml does not have solr.StringField.
> > However, a number of Solr pages on the web mention solr.StringField [1]
> and
> > thus I'm not sure if that's a typo, a real thing and such it is missing
> > from the official Solr wiki's.
> >
> > Steve
> >
> > [1] https://wiki.apache.org/solr/SolrFacetingOverview,
> >
> >
> http://grokbase.com/t/lucene/solr-commits/06cw5038rk/solr-wiki-update-of-solrfacetingoverview-by-jjlarrea
> > ,
> >
> > On Tue, May 3, 2016 at 3:35 PM, John Bickerstaff <
> j...@johnbickerstaff.com
> > >
> > wrote:
> >
> > > My default schema.xml does not have an entry for solr.StringField so I
> > > can't tell you what that one does.
> > >
> > > If you look for solr.StrField in the schema.xml file, you'll get some
> > idea
> > > of how it's defined.  The default setting is for it not to be analyzed.
> > >
> > > On Tue, May 3, 2016 at 10:16 AM, Steven White 
> > > wrote:
> > >
> > > > Hi Everyone,
> > > >
> > > > Is solr.StrField and solr.StringField the same thing?
> > > >
> > > > Thanks in advanced!
> > > >
> > > > Steve
> > > >
> > >
> >
>


Re: solr.StrField or solr.StringField?

2016-05-03 Thread John Bickerstaff
You'll note that the "name" of the field in schema.xml is "string" and the
class is solr.StrField.

Easy to get confused when you're writing something up quickly... in a sense
the "string" field IS a solr.StrField

... but I could be wrong of course.



On Tue, May 3, 2016 at 2:14 PM, John Bickerstaff 
wrote:

> I'm assuming it's another "class" or data type that someone built - but
> I'm afraid I don't know any more than that.
>
> An alternative possibility (supported by at least one of the links on that
> page you linked) is that it's just a typo -- people typing quickly and
> forgetting the exact (truncated) spelling of the field.
>
> In that case, it's talking about using it for faceting and IIRC, you want
> a non-analyzed field for that - preserve it exactly as it is for facet
> queries -- that suggests to me that the author actually meant StrField
>
> 
>
> I might want to index the same data differently in three different fields
> (perhaps using the Solr copyField
>  directive):
>
>- For searching: Tokenized, case-folded, punctuation-stripped:
>   - schildt / herbert / wolpert / lewis / davies / p
>- For sorting: Untokenized, case-folded, punctuation-stripped:
>   - schildt herbert wolpert lewis davies p
>-
>
>For faceting: Primary author only, using a solr.StringField:
>- Schildt, Herbert
>
> Then when the user drills down on the "Schildt, Herbert" string I would
> reissue the query with an added fq=author:"Schildt, Herbert" parameter.
>
> On Tue, May 3, 2016 at 2:01 PM, Steven White  wrote:
>
>> Thanks John.
>>
>> Yes, the out-of-the-box schema.xml does not have solr.StringField.
>> However, a number of Solr pages on the web mention solr.StringField [1]
>> and
>> thus I'm not sure if that's a typo, a real thing and such it is missing
>> from the official Solr wiki's.
>>
>> Steve
>>
>> [1] https://wiki.apache.org/solr/SolrFacetingOverview,
>>
>> http://grokbase.com/t/lucene/solr-commits/06cw5038rk/solr-wiki-update-of-solrfacetingoverview-by-jjlarrea
>> ,
>>
>> On Tue, May 3, 2016 at 3:35 PM, John Bickerstaff <
>> j...@johnbickerstaff.com>
>> wrote:
>>
>> > My default schema.xml does not have an entry for solr.StringField so I
>> > can't tell you what that one does.
>> >
>> > If you look for solr.StrField in the schema.xml file, you'll get some
>> idea
>> > of how it's defined.  The default setting is for it not to be analyzed.
>> >
>> > On Tue, May 3, 2016 at 10:16 AM, Steven White 
>> > wrote:
>> >
>> > > Hi Everyone,
>> > >
>> > > Is solr.StrField and solr.StringField the same thing?
>> > >
>> > > Thanks in advanced!
>> > >
>> > > Steve
>> > >
>> >
>>
>
>


Re: solr.StrField or solr.StringField?

2016-05-03 Thread John Bickerstaff
I'm assuming it's another "class" or data type that someone built - but I'm
afraid I don't know any more than that.

An alternative possibility (supported by at least one of the links on that
page you linked) is that it's just a typo -- people typing quickly and
forgetting the exact (truncated) spelling of the field.

In that case, it's talking about using it for faceting and IIRC, you want a
non-analyzed field for that - preserve it exactly as it is for facet
queries -- that suggests to me that the author actually meant StrField



I might want to index the same data differently in three different fields
(perhaps using the Solr copyField
 directive):

   - For searching: Tokenized, case-folded, punctuation-stripped:
  - schildt / herbert / wolpert / lewis / davies / p
   - For sorting: Untokenized, case-folded, punctuation-stripped:
  - schildt herbert wolpert lewis davies p
   -

   For faceting: Primary author only, using a solr.StringField:
   - Schildt, Herbert

Then when the user drills down on the "Schildt, Herbert" string I would
reissue the query with an added fq=author:"Schildt, Herbert" parameter.

On Tue, May 3, 2016 at 2:01 PM, Steven White  wrote:

> Thanks John.
>
> Yes, the out-of-the-box schema.xml does not have solr.StringField.
> However, a number of Solr pages on the web mention solr.StringField [1] and
> thus I'm not sure if that's a typo, a real thing and such it is missing
> from the official Solr wiki's.
>
> Steve
>
> [1] https://wiki.apache.org/solr/SolrFacetingOverview,
>
> http://grokbase.com/t/lucene/solr-commits/06cw5038rk/solr-wiki-update-of-solrfacetingoverview-by-jjlarrea
> ,
>
> On Tue, May 3, 2016 at 3:35 PM, John Bickerstaff  >
> wrote:
>
> > My default schema.xml does not have an entry for solr.StringField so I
> > can't tell you what that one does.
> >
> > If you look for solr.StrField in the schema.xml file, you'll get some
> idea
> > of how it's defined.  The default setting is for it not to be analyzed.
> >
> > On Tue, May 3, 2016 at 10:16 AM, Steven White 
> > wrote:
> >
> > > Hi Everyone,
> > >
> > > Is solr.StrField and solr.StringField the same thing?
> > >
> > > Thanks in advanced!
> > >
> > > Steve
> > >
> >
>


Re: solr.StrField or solr.StringField?

2016-05-03 Thread Steven White
Thanks John.

Yes, the out-of-the-box schema.xml does not have solr.StringField.
However, a number of Solr pages on the web mention solr.StringField [1],
and thus I'm not sure if that's a typo, or a real thing that is simply
missing from the official Solr wikis.

Steve

[1] https://wiki.apache.org/solr/SolrFacetingOverview,
http://grokbase.com/t/lucene/solr-commits/06cw5038rk/solr-wiki-update-of-solrfacetingoverview-by-jjlarrea
,

On Tue, May 3, 2016 at 3:35 PM, John Bickerstaff 
wrote:

> My default schema.xml does not have an entry for solr.StringField so I
> can't tell you what that one does.
>
> If you look for solr.StrField in the schema.xml file, you'll get some idea
> of how it's defined.  The default setting is for it not to be analyzed.
>
> On Tue, May 3, 2016 at 10:16 AM, Steven White 
> wrote:
>
> > Hi Everyone,
> >
> > Is solr.StrField and solr.StringField the same thing?
> >
> > Thanks in advanced!
> >
> > Steve
> >
>


Re: solr.StrField or solr.StringField?

2016-05-03 Thread John Bickerstaff
My default schema.xml does not have an entry for solr.StringField so I
can't tell you what that one does.

If you look for solr.StrField in the schema.xml file, you'll get some idea
of how it's defined.  The default setting is for it not to be analyzed.
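A quick way to see the distinction: the field *type* is named "string", while its implementing *class* is the truncated solr.StrField. The line below mirrors the stock schema.xml entry (your schema may add attributes such as docValues):

```shell
# The stock "string" type is backed by class="solr.StrField";
# there is no solr.StringField class in the Solr distribution
SCHEMA_LINE='<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>'
MATCH=$(printf '%s\n' "$SCHEMA_LINE" | grep -o 'class="solr\.[A-Za-z]*"')
echo "$MATCH"
```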

On Tue, May 3, 2016 at 10:16 AM, Steven White  wrote:

> Hi Everyone,
>
> Is solr.StrField and solr.StringField the same thing?
>
> Thanks in advanced!
>
> Steve
>


Re: Collection API Change

2016-05-03 Thread Erick Erickson
I'm really confused by this

bq: I am proposing that the Collection API be consistent...

How can it? The collections API simply allows collection manipulation
(creation, deletion and the like) through an HTTP call. Where do you
think it _can_ get any intelligence about the cluster state? There's
no "there" there.

Or are we talking about submitting queries as opposed to the
Collections API? Again, the design here is to accept any incoming HTTP
request so you'd have to intercept the HTTP request somewhere and then
"do the right thing" with it. And this is what CloudSolrClient already
does.

If you're thinking of some kind of non-java layer to route things,
then you can get the status from ZooKeeper through HTTP calls to
Solr (or indeed, call Zookeeper directly) to make those decisions.
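For instance, the Collections API exposes live-node and per-replica leader information over plain HTTP via the CLUSTERSTATUS action; a sketch of building that request (host, port, and collection name are illustrative):

```shell
# CLUSTERSTATUS returns live nodes, shard ranges, and leader flags as JSON
SOLR_URL="http://localhost:8983/solr"
REQUEST="${SOLR_URL}/admin/collections?action=CLUSTERSTATUS&collection=gettingstarted"
echo "$REQUEST"
# Against a running node you would fetch it with: curl "$REQUEST"
```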


So I really don't understand the proposal at all.

Best,
Erick


On Tue, May 3, 2016 at 10:26 AM,   wrote:
> I am proposing that the Collection API be consistent with SolrJ 
> (CloudSolrClient).
> SolrJ's design philosophy uses the list of zookeeper addresses to gather 
> cloud information,
> and then intelligently sends the actual request to the single solr node that 
> is best able to serve it.
> This information includes what solr nodes are alive, and what 
> collections/cores reside,
> and where and whom the current leader is.
>
> Why can't the Collection API or something else expose this functionality.
> Otherwise, you have to use a load balancer (software or hardware) to present 
> a single point of access for your code,
> and let it forward the request to one of the solr nodes.  It should be able 
> to track which ones are alive and failover to the other(s).
> I really don't think this is necessary, because Solr can already do this, 
> internally.
>
> Maybe an intelligent way of submitting queries to the cloud, without checking 
> the cloud state
> is difficult, but the code already exists.
>
> Thank you
> Will McGinnis


Collection API Change

2016-05-03 Thread wmcginnis
I am proposing that the Collection API be consistent with SolrJ 
(CloudSolrClient).
SolrJ's design philosophy uses the list of zookeeper addresses to gather cloud 
information,
and then intelligently sends the actual request to the single solr node that is 
best able to serve it.
This information includes which Solr nodes are alive, which collections/cores 
reside where, and which node is the current leader.

Why can't the Collection API or something else expose this functionality?
Otherwise, you have to use a load balancer (software or hardware) to present a 
single point of access for your code, 
and let it forward the request to one of the solr nodes.  It should be able to 
track which ones are alive and failover to the other(s).
I really don't think this is necessary, because Solr can already do this, 
internally.

Maybe an intelligent way of submitting queries to the cloud, without checking 
the cloud state
is difficult, but the code already exists.

Thank you
Will McGinnis


solr.StrField or solr.StringField?

2016-05-03 Thread Steven White
Hi Everyone,

Is solr.StrField and solr.StringField the same thing?

Thanks in advance!

Steve


Re: Does EML files with inline images affect the indexing speed

2016-05-03 Thread Zheng Lin Edwin Yeo
Yes, it should be, as it is the Tika extract handler that does the
extraction of the content for indexing.

Thank you.

Regards,
Edwin


On 3 May 2016 at 19:12, Alexandre Rafalovitch  wrote:

> This is an extract handler, right?
>
> If so, this is a question better for the Apache Tika list. That's what's
> doing the parsing.
>
> Regards,
> Alex
> On 3 May 2016 7:53 pm, "Zheng Lin Edwin Yeo"  wrote:
>
> > Hi,
> >
> > I would like to find out, if the presence of inline images in EML files
> > will slow down the indexing speed significantly?
> >
> > Even though the content of the EML files are in Plain Text instead of
> HTML.
> > but I still found that the indexing performance is not up to expectation
> > yet. Average speed which I'm getting are around 0.3GB/hr.
> >
> > I'm using Solr 5.4.0 on SolrCloud.
> >
> > Regards,
> > Edwin
> >
>


Re: Parallel SQL Interface returns "java.lang.NullPointerException" after reloading collection

2016-05-03 Thread Joel Bernstein
Ryan, there is a patch (for the master branch) up on SOLR-9059 that
resolves the issue. This will be in 6.1 and 6.0.1 if there is one. Thanks
for the bug report!

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, May 3, 2016 at 10:41 AM, Joel Bernstein  wrote:

> I opened SOLR-9059.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, May 3, 2016 at 10:31 AM, Joel Bernstein 
> wrote:
>
>> What I believe is happening is that the core is closing on the reload,
>> which is triggering the closeHook and shutting down all the connections in
>> SolrClientCache.
>>
>> When the core reopens the connections are all still closed because the
>> SolrClientCache is instantiated statically with the creation of the
>> StreamHandler.
>>
>> So I think the correct fix is to create the SolrClientCache in inform(),
>> that way it will get recreated with each reload. As long as the closeHook
>> has closed the existing SolrClientCache this shouldn't cause any connection
>> leaks with reloads.
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Tue, May 3, 2016 at 10:01 AM, Joel Bernstein 
>> wrote:
>>
>>> I'll look into this today.
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>> On Tue, May 3, 2016 at 9:22 AM, Kevin Risden 
>>> wrote:
>>>
 What I think is happening is that since the CloudSolrClient is from the
 SolrCache and the collection was reloaded. zkStateReader is actually
 null
 since there was no cloudSolrClient.connect() call after the reload. I
 think
 that would cause the NPE on anything that uses the zkStateReader like
 getClusterState().

 ZkStateReader zkStateReader = cloudSolrClient.getZkStateReader();
 ClusterState clusterState = zkStateReader.getClusterState();


 Kevin Risden
 Apache Lucene/Solr Committer
 Hadoop and Search Tech Lead | Avalon Consulting, LLC
 
 M: 732 213 8417
 LinkedIn  |
 Google+
  | Twitter
 


 -
 This message (including any attachments) contains confidential
 information
 intended for a specific individual and purpose, and is protected by
 law. If
 you are not the intended recipient, you should delete this message. Any
 disclosure, copying, or distribution of this message, or the taking of
 any
 action based on it, is strictly prohibited.

 On Mon, May 2, 2016 at 9:58 PM, Joel Bernstein 
 wrote:

 > Looks like the loop below is throwing a Null pointer. I suspect the
 > collection has not yet come back online. In theory this should be self
 > healing and when the collection comes back online it should start
 working
 > again. If not then that would be a bug.
 >
 > for(String col : clusterState.getCollections()) {
 >
 >
 > Joel Bernstein
 > http://joelsolr.blogspot.com/
 >
 > On Mon, May 2, 2016 at 10:06 PM, Ryan Yacyshyn <
 ryan.yacys...@gmail.com>
 > wrote:
 >
 > > Yes stack trace can be found here:
 > >
 > > http://pastie.org/10821638
 > >
 > >
 > >
 > > On Mon, 2 May 2016 at 01:05 Joel Bernstein 
 wrote:
 > >
 > > > Can you post your stack trace? I suspect this has to do with how
 the
 > > > Streaming API is interacting with SolrCloud. We can probably also
 > create
 > > a
 > > > jira ticket for this.
 > > >
 > > > Joel Bernstein
 > > > http://joelsolr.blogspot.com/
 > > >
 > > > On Sun, May 1, 2016 at 4:02 AM, Ryan Yacyshyn <
 ryan.yacys...@gmail.com
 > >
 > > > wrote:
 > > >
 > > > > Hi all,
 > > > >
 > > > > I'm exploring with parallel SQL queries and found something
 strange
 > > after
 > > > > reloading the collection: the same query will return a
 > > > > java.lang.NullPointerException error. Here are my steps on a
 fresh
 > > > install
 > > > > of Solr 6.0.0.
 > > > >
 > > > > *Start Solr in cloud mode with example*
 > > > > bin/solr -e cloud -noprompt
 > > > >
 > > > > *Index some data*
 > > > > bin/post -c gettingstarted example/exampledocs/*.xml
 > > > >
 > > > > *Send query, which works*
 > > > > curl --data-urlencode 'stmt=select id,name from gettingstarted
 where
 > > > > inStock = true limit 2'
 > http://localhost:8983/solr/gettingstarted/sql
 > > > >
 > > > > *Reload the collection*
 > > > > curl '
 > > > >
 > > > >
 > > >
 > >
 >
 http://localhost:8983/solr/admin/collections?action=RELOAD&name=gettingstarted
 

Re: BlockJoinFacetComponent on solr 4.10

2016-05-03 Thread Shawn Heisey
On 5/2/2016 2:59 AM, tkg_cangkul wrote:
> Hi, I want to ask a question about using
>
>
> BlockJoinFacetComponent 
>
> in Solr 4. How can I use that library on Solr 4.10.4?
> I want to install casebox with Solr 4.10.4 but I have this error.
>

>
> When I check in solr-core-4.10.4.jar there is no class
> BlockJoinFacetComponent. I found it in solr-core-6.0.0.jar. Is there any
> way for me to use it on Solr 4?
> Please help

The BlockJoinFacetComponent class was introduced in Solr 5.5.

https://issues.apache.org/jira/browse/SOLR-5743

The error you are seeing is because of this text in the solrconfig.xml
provided with casebox:




Removing that XML and any other XML where that component is referenced
would fix the error, but I would imagine that casebox would not
work correctly after the change.

I do not know if you can copy the class from a newer version into the
4.10 source code.  From what I've seen, Solr code changes so fast from
release to release that you would probably run into big problems trying
to adapt a plugin that's intended for 5.5 and newer to a 4.x version. 
Even if you copy the code for the missing class, that code probably
depends on other changes and additions that have happened in the many
releases between 4.10.4 and 5.5.

Your best bet is to talk to the casebox people.  They can tell you
exactly what is required for their software, and whether or not they are
willing to support such an old version of Solr.  If I had to guess, they
probably *can't* support older Solr in their newest versions.  Their
software probably relies on capability (like BlockJoinFacetComponent)
only available in newer Solr versions.

If you were to use a version of casebox before March 15th, it *might*
work.  This commit was made on that date, and among other changes, added
BlockJoinFacetComponent to solrconfig.xml:

https://github.com/KETSE/casebox/commit/3fdc48564efd35e8a0229cb86bdab0ad4f493511

Thanks,
Shawn



Re: Bringing Old Collections Up Again

2016-05-03 Thread Shawn Heisey
On 5/2/2016 5:16 AM, Salman Ansari wrote:
> After several trials, it did start Solr on both machines but *none of the
> previous collections came back normally.* When I look at the admin page, it
> shows errors as follows
>
> *[Collection_name]_shard2_replica2:*
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Index locked for write for core '[Collection_name]_shard2_replica2'. Solr
> now longer supports forceful unlocking via 'unlockOnStartup'. Please verify
> locks manually!

I thought I had already sent this, but looks like it's still hanging
around.  Erick said much the same thing in his reply.

It sounds like Solr is being forcibly killed, not gracefully shut down,
when the machine is rebooted after installing updates.  When Solr is
forcibly killed, you probably need to find and delete a file named
"write.lock" in each of your index directories.  If that file exists
before Solr starts, Solr will not be able to lock the index and will not
load that index.  Restarting Solr after fixing the problem might be the
only solution.
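A sketch of hunting down leftover lock files with find, demonstrated on a throwaway directory tree that mimics a core layout (real paths depend on your install):

```shell
# Build a throwaway tree shaped like a Solr core's data directory
DEMO=$(mktemp -d)
mkdir -p "$DEMO/collection1_shard1_replica1/data/index"
touch "$DEMO/collection1_shard1_replica1/data/index/write.lock"

# List stale lock files; only delete them once Solr is fully stopped
LOCKS=$(find "$DEMO" -name write.lock)
echo "$LOCKS"

rm -rf "$DEMO"
```

On a real install you would run the same find against your Solr data directory and remove the listed files before restarting.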

You should not allow these systems to automatically reboot for updates,
or you should see if you can extend the amount of time that the reboot
will wait for processes to shut down normally before forcibly killing
them.  A graceful shutdown of a Solr process handling a lot of large
indexes can take quite a while.

A question for pondering:  Should the start script check for
index/write.lock files and fail to start (with a clear error message) if
any are found?  If we do that, we probably also need a "cleanlocks"
command on the solr script (and probably the init script) to delete
those files.

Thanks,
Shawn



Re: Requesting to be added to ContributorsGroup

2016-05-03 Thread Steve Rowe
Welcome Sheece,

I’ve added you to the ContributorsGroup.

--
Steve
www.lucidworks.com

> On May 3, 2016, at 10:03 AM, Syed Gardezi  wrote:
> 
> Hello,
> I am a Master student as part of Free and Open Source Software 
> Development COMP8440 - http://programsandcourses.anu.edu.au/course/COMP8440 
> at Australian National University. I have selected 
> http://wiki.apache.org/solr/ to contribute too. Kindly add me too 
> ContributorsGroup. Thank you.
> 
> wiki username: sheecegardezi
> 
> Regards,
> Sheece
> 



Requesting to be added to ContributorsGroup

2016-05-03 Thread Syed Gardezi
Hello,
 I am a Master student as part of Free and Open Source Software Development 
COMP8440 - http://programsandcourses.anu.edu.au/course/COMP8440 at Australian 
National University. I have selected http://wiki.apache.org/solr/ to contribute 
to. Kindly add me to the ContributorsGroup. Thank you.

wiki username: sheecegardezi

Regards,
Sheece



Re: Parallel SQL Interface returns "java.lang.NullPointerException" after reloading collection

2016-05-03 Thread Joel Bernstein
I opened SOLR-9059.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, May 3, 2016 at 10:31 AM, Joel Bernstein  wrote:

> What I believe is happening is that the core is closing on the reload,
> which is triggering the closeHook and shutting down all the connections in
> SolrClientCache.
>
> When the core reopens the connections are all still closed because the
> SolrClientCache is instantiated statically with the creation of the
> StreamHandler.
>
> So I think the correct fix is to create the SolrClientCache in inform(),
> that way it will get recreated with each reload. As long as the closeHook
> has closed the existing SolrClientCache this shouldn't cause any connection
> leaks with reloads.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, May 3, 2016 at 10:01 AM, Joel Bernstein 
> wrote:
>
>> I'll look into this today.
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Tue, May 3, 2016 at 9:22 AM, Kevin Risden 
>> wrote:
>>
>>> What I think is happening is that since the CloudSolrClient is from the
>>> SolrCache and the collection was reloaded. zkStateReader is actually null
>>> since there was no cloudSolrClient.connect() call after the reload. I
>>> think
>>> that would cause the NPE on anything that uses the zkStateReader like
>>> getClusterState().
>>>
>>> ZkStateReader zkStateReader = cloudSolrClient.getZkStateReader();
>>> ClusterState clusterState = zkStateReader.getClusterState();
>>>
>>>
>>> Kevin Risden
>>> Apache Lucene/Solr Committer
>>> Hadoop and Search Tech Lead | Avalon Consulting, LLC
>>> 
>>> M: 732 213 8417
>>> LinkedIn  |
>>> Google+
>>>  | Twitter
>>> 
>>>
>>>
>>>
>>> On Mon, May 2, 2016 at 9:58 PM, Joel Bernstein 
>>> wrote:
>>>
>>> > Looks like the loop below is throwing a Null pointer. I suspect the
>>> > collection has not yet come back online. In theory this should be self
>>> > healing and when the collection comes back online it should start
>>> working
>>> > again. If not then that would be a bug.
>>> >
>>> > for(String col : clusterState.getCollections()) {
>>> >
>>> >
>>> > Joel Bernstein
>>> > http://joelsolr.blogspot.com/
>>> >
>>> > On Mon, May 2, 2016 at 10:06 PM, Ryan Yacyshyn <
>>> ryan.yacys...@gmail.com>
>>> > wrote:
>>> >
>>> > > Yes stack trace can be found here:
>>> > >
>>> > > http://pastie.org/10821638
>>> > >
>>> > >
>>> > >
>>> > > On Mon, 2 May 2016 at 01:05 Joel Bernstein 
>>> wrote:
>>> > >
>>> > > > Can you post your stack trace? I suspect this has to do with how
>>> the
>>> > > > Streaming API is interacting with SolrCloud. We can probably also
>>> > create
>>> > > a
>>> > > > jira ticket for this.
>>> > > >
>>> > > > Joel Bernstein
>>> > > > http://joelsolr.blogspot.com/
>>> > > >
>>> > > > On Sun, May 1, 2016 at 4:02 AM, Ryan Yacyshyn <
>>> ryan.yacys...@gmail.com
>>> > >
>>> > > > wrote:
>>> > > >
>>> > > > > Hi all,
>>> > > > >
>>> > > > > I'm exploring with parallel SQL queries and found something
>>> strange
>>> > > after
>>> > > > > reloading the collection: the same query will return a
>>> > > > > java.lang.NullPointerException error. Here are my steps on a
>>> fresh
>>> > > > install
>>> > > > > of Solr 6.0.0.
>>> > > > >
>>> > > > > *Start Solr in cloud mode with example*
>>> > > > > bin/solr -e cloud -noprompt
>>> > > > >
>>> > > > > *Index some data*
>>> > > > > bin/post -c gettingstarted example/exampledocs/*.xml
>>> > > > >
>>> > > > > *Send query, which works*
>>> > > > > curl --data-urlencode 'stmt=select id,name from gettingstarted
>>> where
>>> > > > > inStock = true limit 2'
>>> > http://localhost:8983/solr/gettingstarted/sql
>>> > > > >
>>> > > > > *Reload the collection*
>>> > > > > curl '
>>> > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> http://localhost:8983/solr/admin/collections?action=RELOAD&name=gettingstarted
>>> > > > > '
>>> > > > >
>>> > > > > After reloading, running the exact query above will return the
>>> null
>>> > > > pointer
>>> > > > > exception error. Any idea why?
>>> > > > >
>>> > > > > If I stop all Solr servers and restart, then it's fine.
>>> > > > >
>>> > > > > *java -version*
>>> > > > > java version "1.8.0_25"
>>> > > > > Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
>>> > > > > Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

Re: Parallel SQL Interface returns "java.lang.NullPointerException" after reloading collection

2016-05-03 Thread Joel Bernstein
What I believe is happening is that the core is closing on the reload,
which is triggering the closeHook and shutting down all the connections in
SolrClientCache.

When the core reopens the connections are all still closed because the
SolrClientCache is instantiated statically with the creation of the
StreamHandler.

So I think the correct fix is to create the SolrClientCache in inform(),
that way it will get recreated with each reload. As long as the closeHook
has closed the existing SolrClientCache this shouldn't cause any connection
leaks with reloads.




Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, May 3, 2016 at 10:01 AM, Joel Bernstein  wrote:

> I'll look into this today.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, May 3, 2016 at 9:22 AM, Kevin Risden 
> wrote:
>
>> What I think is happening is that, since the CloudSolrClient is from the
>> SolrCache and the collection was reloaded, zkStateReader is actually null
>> since there was no cloudSolrClient.connect() call after the reload. I
>> think
>> that would cause the NPE on anything that uses the zkStateReader like
>> getClusterState().
>>
>> ZkStateReader zkStateReader = cloudSolrClient.getZkStateReader();
>> ClusterState clusterState = zkStateReader.getClusterState();
>>
>>
>> Kevin Risden
>> Apache Lucene/Solr Committer
>> Hadoop and Search Tech Lead | Avalon Consulting, LLC
>> 
>> M: 732 213 8417
>> LinkedIn | Google+ | Twitter
>> 
>>
>>
>> -
>> This message (including any attachments) contains confidential information
>> intended for a specific individual and purpose, and is protected by law.
>> If
>> you are not the intended recipient, you should delete this message. Any
>> disclosure, copying, or distribution of this message, or the taking of any
>> action based on it, is strictly prohibited.
>>
>> On Mon, May 2, 2016 at 9:58 PM, Joel Bernstein 
>> wrote:
>>
>> > Looks like the loop below is throwing a Null pointer. I suspect the
>> > collection has not yet come back online. In theory this should be self
>> > healing and when the collection comes back online it should start
>> working
>> > again. If not then that would be a bug.
>> >
>> > for(String col : clusterState.getCollections()) {
>> >
>> >
>> > Joel Bernstein
>> > http://joelsolr.blogspot.com/
>> >
>> > On Mon, May 2, 2016 at 10:06 PM, Ryan Yacyshyn > >
>> > wrote:
>> >
>> > > Yes stack trace can be found here:
>> > >
>> > > http://pastie.org/10821638
>> > >
>> > >
>> > >
>> > > On Mon, 2 May 2016 at 01:05 Joel Bernstein 
>> wrote:
>> > >
>> > > > Can you post your stack trace? I suspect this has to do with how the
>> > > > Streaming API is interacting with SolrCloud. We can probably also
>> > create
>> > > a
>> > > > jira ticket for this.
>> > > >
>> > > > Joel Bernstein
>> > > > http://joelsolr.blogspot.com/
>> > > >
>> > > > On Sun, May 1, 2016 at 4:02 AM, Ryan Yacyshyn <
>> ryan.yacys...@gmail.com
>> > >
>> > > > wrote:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > I'm exploring with parallel SQL queries and found something
>> strange
>> > > after
>> > > > > reloading the collection: the same query will return a
>> > > > > java.lang.NullPointerException error. Here are my steps on a fresh
>> > > > install
>> > > > > of Solr 6.0.0.
>> > > > >
>> > > > > *Start Solr in cloud mode with example*
>> > > > > bin/solr -e cloud -noprompt
>> > > > >
>> > > > > *Index some data*
>> > > > > bin/post -c gettingstarted example/exampledocs/*.xml
>> > > > >
>> > > > > *Send query, which works*
>> > > > > curl --data-urlencode 'stmt=select id,name from gettingstarted
>> where
>> > > > > inStock = true limit 2'
>> > http://localhost:8983/solr/gettingstarted/sql
>> > > > >
>> > > > > *Reload the collection*
>> > > > > curl '
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>> http://localhost:8983/solr/admin/collections?action=RELOAD=gettingstarted
>> > > > > '
>> > > > >
>> > > > > After reloading, running the exact query above will return the
>> null
>> > > > pointer
>> > > > > exception error. Any idea why?
>> > > > >
>> > > > > If I stop all Solr servers and restart, then it's fine.
>> > > > >
>> > > > > *java -version*
>> > > > > java version "1.8.0_25"
>> > > > > Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
>> > > > > Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
>> > > > >
>> > > > > Thanks,
>> > > > > Ryan
>> > > > >
>> > > >
>> > >
>> >
>>
>
>


Re: Include and exclude feature with multi valued fields

2016-05-03 Thread Ahmet Arslan
Can you provide us with example documents? Which ones do you want to match, and which don't you?



On Tuesday, May 3, 2016 3:15 PM, Anil  wrote:
Any inputs please ?


On 2 May 2016 at 18:18, Anil  wrote:

> HI,
>
> i have created a document with multi valued fields.
>
> Eg :
> An issue is impacting multiple customers, products, versions etc.
>
> In my issue document, i have created customers, products, versions as
> multi valued fields.
>
> how to find all issues that are impacting google (customer) but not
> facebook (customer) ?
>
> Google and facebook can be part of a single issue document.
>
> Please let me know if you have any questions. Thanks.
>
> Regards,
> Anil
>
>
>
>
>
>
>
>
>
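Assuming the multi-valued field is named `customers` as in Anil's example, the intent maps to a standard boolean query such as `q=customers:google AND -customers:facebook`. The matching semantics can be sketched with a few sample documents (the data below is hypothetical):

```python
# Documents with a multi-valued 'customers' field (hypothetical sample data).
docs = [
    {"id": "ISSUE-1", "customers": ["google", "facebook"]},
    {"id": "ISSUE-2", "customers": ["google"]},
    {"id": "ISSUE-3", "customers": ["facebook"]},
]

# Equivalent of q=customers:google AND -customers:facebook --
# keep docs containing 'google', drop any that also contain 'facebook'.
hits = [d["id"] for d in docs
        if "google" in d["customers"] and "facebook" not in d["customers"]]

assert hits == ["ISSUE-2"]  # ISSUE-1 is excluded despite matching 'google'
```

Note that ISSUE-1, where both customers appear in the same document, is excluded by the negative clause, which is exactly the case the question raises.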


Re: EmbeddedSolrServer Loading Core Containers Solr 4.3.1

2016-05-03 Thread Erick Erickson
Please explain more clearly what problem you're actually
facing. It _sounds_ like you're indexing data to Solr then
trying to search it and being unsuccessful. This is almost
always the result of a failure to commit after indexing.

Best,
Erick

On Mon, May 2, 2016 at 3:29 AM, SRINI SOLR  wrote:
> Hi Team -
> I am using Solr 4.3.1.
>
> We are using this EmbeddedSolrServer to load Core Containers in one of the
> Java applications.
>
> This is setup as a cron job for every 1 hour to load the new data on to the
> containers.
>
> Otherwise the new data does not get loaded into the containers when we
> access them from the Java application, even after re-indexing.
>
> Please help here to resolve the issue ...?


Re: Problem in Issuing a Command to Upload Configuration

2016-05-03 Thread Erick Erickson
If Solr is killed un-gracefully, it can leave lock files
in the index directory, and when Solr comes back up
it thinks some other Solr process is writing the files
and refuses to allow _this_ Solr process to write to
those files.

Probably this:
bq:  machines faced a forced
restart to install Windows Updates

shut the Solr instances down un-gracefully. So next time
Windows "helpfully" wants to kill all your processes,
don't let it until you've stopped Solr gracefully via the
bin\solr script.

BTW, I've seen some situations where the bin/solr script
will forcefully kill the Solr instance.

And to cure the situation, remove the write.lock file from the
index directory (when Solr is stopped) and then start
Solr.

Best,
Erick
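The manual recovery step above can be sketched as a small script (the data-directory path is an assumption; point it at the core named in the error, e.g. [Collection_name]_shard2_replica2):

```python
# Sketch: with Solr stopped, remove the stale write.lock left behind by an
# unclean shutdown. The path layout (data/index/write.lock) follows the
# usual Solr core data directory; adjust to your installation.
from pathlib import Path

def clear_stale_lock(core_data_dir):
    lock = Path(core_data_dir) / "index" / "write.lock"
    if lock.exists():
        lock.unlink()
        return True   # lock removed; safe to start Solr again
    return False      # nothing to do
```

Only run this while Solr is fully stopped: removing the lock from under a live writer risks corrupting the index.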

On Mon, May 2, 2016 at 2:36 AM, Salman Ansari  wrote:
> Well, that just happened! Solr and Zookeeper machines faced a forced
> restart to install Windows Updates. This caused Zookeeper ensemble and Solr
> instances to go down. When the machines came back up again. I tried the
> following
>
> 1) Started Zookeeper on all machines using the following command
> zkServer.cmd (on all three machines)
>
> 2) Started Solr on two of those machines using
>
> solr.cmd start -c -p 8983 -h [server1_name] -z
> "[server1_ip]:2181,[server2_name]:2181,[server3_name]:2181"
> solr.cmd start -c -p 8983 -h [server2_name] -z
> "[server2_ip]:2181,[server1_name]:2181,[server3_name]:2181"
> solr.cmd start -c -p 7574 -h [server1_name] -z
> "[server1_ip]:2181,[server2_name]:2181,[server3_name]:2181"
> solr.cmd start -c -p 7574 -h [server2_name] -z
> "[server2_ip]:2181,[server1_name]:2181,[server3_name]:2181"
>
> After several trials, it did start the solr on both machines but *non of
> the previous collections came back normally.* When I look at the admin
> page, it shows errors as follows
>
> *[Collection_name]_shard2_replica2:*
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Index locked for write for core '[Collection_name]_shard2_replica2'. Solr
> now longer supports forceful unlocking via 'unlockOnStartup'. Please verify
> locks manually!
>
> So probably I am doing something wrong or the simple scenario is not
> straight forward to recover from.
>
> Your comment/feedback is appreciated.
>
> Regards,
> Salman
>
>
>
> On Thu, Apr 7, 2016 at 3:56 PM, Shawn Heisey  wrote:
>
>> On 4/7/2016 5:40 AM, Salman Ansari wrote:
>> > Any comments regarding the issue I mentioned above "the proper procedure
>> of
>> > bringing old collections up after a restart of zookeeper ensemble and
>> Solr
>> > instances"?
>>
>> What precisely do you mean by "old collections"?  The simplest
>> interpretation of that is that you are trying to restart your servers
>> and have everything you already had in the cloud work properly.  An
>> alternate interpretation, which might be just as valid, is that you have
>> some collections on some old servers that you want to incorporate into a
>> new cloud.
>>
>> If it's the simple scenario: shut down solr, shut down zookeeper, start
>> zookeeper, start solr.  If it's the other scenario, that is not quite so
>> simple.
>>
>> Thanks,
>> Shawn
>>
>>


Re: Searching for term sequence including blank character using regex

2016-05-03 Thread Erick Erickson
Not quite sure. How is the field type defined? What
is the result of adding &debug=true to the query? Have
you looked at the actual terms indexed via the
admin UI/schema browser? Have you looked at the
admin/analysis page to see how the data is parsed by
the fieldType?

If this is a tokenized field you should be fine
with fq=content:"hello world", possibly with slop.

And I'm all confused. The mail came through with
what looks like regex expressions (slash period asterisk
hello world period asterisk slash).
But that's often an artifact of markup for bolding
which gets stripped by some e-mail programs. You've
been around long enough that I highly doubt you're trying
to do regex queries!

Best,
Erick
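The point about tokenized fields can be illustrated outside Solr: a regex query runs against each indexed term individually, and after tokenization no single term contains a space (simple whitespace tokenization assumed here for illustration):

```python
import re

# A tokenized field stores individual terms, not the original string
# (plain whitespace tokenization assumed for illustration).
terms = "say hello world again".split()  # ['say', 'hello', 'world', 'again']

# A regex containing a space can never match any single term...
assert not any(re.fullmatch(".*hello world.*", t) for t in terms)

# ...while a pattern without the space matches the term 'hello'.
assert any(re.fullmatch(".*hello.*", t) for t in terms)
```

This is why `fq=content:/.*hello.*/` matches while `fq=content:/.*hello world.*/` does not, and why a phrase query on the tokenized field is the right tool.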

On Sun, May 1, 2016 at 6:57 AM, Ali Nazemian  wrote:
> Dear Solr Users/Developers,
> Hi,
>
> I was wondering what is the correct query syntax for searching sequence of
> terms with blank character in the middle of sequence. Suppose I am looking
> for a query syntax with using fq parameter. For example suppose I want to
> search for all documents having "hello world" sequence using fq parameter.
> I am not sure why using fq=content:/.*hello world.*/ did not work for
> tokenized field in this situation. However, fq=content:/.*hello.*/ did work
> for the same field. Is there any possible fq query syntax for such
> searching requirement?
>
> Best regards.
>
>
> --
> A.Nazemian


Re: Parallel SQL Interface returns "java.lang.NullPointerException" after reloading collection

2016-05-03 Thread Joel Bernstein
I'll look into this today.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, May 3, 2016 at 9:22 AM, Kevin Risden 
wrote:

> What I think is happening is that, since the CloudSolrClient is from the
> SolrCache and the collection was reloaded, zkStateReader is actually null
> since there was no cloudSolrClient.connect() call after the reload. I think
> that would cause the NPE on anything that uses the zkStateReader like
> getClusterState().
>
> ZkStateReader zkStateReader = cloudSolrClient.getZkStateReader();
> ClusterState clusterState = zkStateReader.getClusterState();
>
>
> Kevin Risden
> Apache Lucene/Solr Committer
> Hadoop and Search Tech Lead | Avalon Consulting, LLC
> 
> M: 732 213 8417
> LinkedIn | Google+ | Twitter
> 
>
>
> -
> This message (including any attachments) contains confidential information
> intended for a specific individual and purpose, and is protected by law. If
> you are not the intended recipient, you should delete this message. Any
> disclosure, copying, or distribution of this message, or the taking of any
> action based on it, is strictly prohibited.
>
> On Mon, May 2, 2016 at 9:58 PM, Joel Bernstein  wrote:
>
> > Looks like the loop below is throwing a Null pointer. I suspect the
> > collection has not yet come back online. In theory this should be self
> > healing and when the collection comes back online it should start working
> > again. If not then that would be a bug.
> >
> > for(String col : clusterState.getCollections()) {
> >
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Mon, May 2, 2016 at 10:06 PM, Ryan Yacyshyn 
> > wrote:
> >
> > > Yes stack trace can be found here:
> > >
> > > http://pastie.org/10821638
> > >
> > >
> > >
> > > On Mon, 2 May 2016 at 01:05 Joel Bernstein  wrote:
> > >
> > > > Can you post your stack trace? I suspect this has to do with how the
> > > > Streaming API is interacting with SolrCloud. We can probably also
> > create
> > > a
> > > > jira ticket for this.
> > > >
> > > > Joel Bernstein
> > > > http://joelsolr.blogspot.com/
> > > >
> > > > On Sun, May 1, 2016 at 4:02 AM, Ryan Yacyshyn <
> ryan.yacys...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'm exploring with parallel SQL queries and found something strange
> > > after
> > > > > reloading the collection: the same query will return a
> > > > > java.lang.NullPointerException error. Here are my steps on a fresh
> > > > install
> > > > > of Solr 6.0.0.
> > > > >
> > > > > *Start Solr in cloud mode with example*
> > > > > bin/solr -e cloud -noprompt
> > > > >
> > > > > *Index some data*
> > > > > bin/post -c gettingstarted example/exampledocs/*.xml
> > > > >
> > > > > *Send query, which works*
> > > > > curl --data-urlencode 'stmt=select id,name from gettingstarted
> where
> > > > > inStock = true limit 2'
> > http://localhost:8983/solr/gettingstarted/sql
> > > > >
> > > > > *Reload the collection*
> > > > > curl '
> > > > >
> > > > >
> > > >
> > >
> >
> http://localhost:8983/solr/admin/collections?action=RELOAD=gettingstarted
> > > > > '
> > > > >
> > > > > After reloading, running the exact query above will return the null
> > > > pointer
> > > > > exception error. Any idea why?
> > > > >
> > > > > If I stop all Solr servers and restart, then it's fine.
> > > > >
> > > > > *java -version*
> > > > > java version "1.8.0_25"
> > > > > Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
> > > > > Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
> > > > >
> > > > > Thanks,
> > > > > Ryan
> > > > >
> > > >
> > >
> >
>


Re: Parallel SQL Interface returns "java.lang.NullPointerException" after reloading collection

2016-05-03 Thread Kevin Risden
What I think is happening is that, since the CloudSolrClient is from the
SolrCache and the collection was reloaded, zkStateReader is actually null
since there was no cloudSolrClient.connect() call after the reload. I think
that would cause the NPE on anything that uses the zkStateReader like
getClusterState().

ZkStateReader zkStateReader = cloudSolrClient.getZkStateReader();
ClusterState clusterState = zkStateReader.getClusterState();


Kevin Risden
Apache Lucene/Solr Committer
Hadoop and Search Tech Lead | Avalon Consulting, LLC

M: 732 213 8417
LinkedIn | Google+ | Twitter


-
This message (including any attachments) contains confidential information
intended for a specific individual and purpose, and is protected by law. If
you are not the intended recipient, you should delete this message. Any
disclosure, copying, or distribution of this message, or the taking of any
action based on it, is strictly prohibited.

On Mon, May 2, 2016 at 9:58 PM, Joel Bernstein  wrote:

> Looks like the loop below is throwing a Null pointer. I suspect the
> collection has not yet come back online. In theory this should be self
> healing and when the collection comes back online it should start working
> again. If not then that would be a bug.
>
> for(String col : clusterState.getCollections()) {
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, May 2, 2016 at 10:06 PM, Ryan Yacyshyn 
> wrote:
>
> > Yes stack trace can be found here:
> >
> > http://pastie.org/10821638
> >
> >
> >
> > On Mon, 2 May 2016 at 01:05 Joel Bernstein  wrote:
> >
> > > Can you post your stack trace? I suspect this has to do with how the
> > > Streaming API is interacting with SolrCloud. We can probably also
> create
> > a
> > > jira ticket for this.
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > > On Sun, May 1, 2016 at 4:02 AM, Ryan Yacyshyn  >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'm exploring with parallel SQL queries and found something strange
> > after
> > > > reloading the collection: the same query will return a
> > > > java.lang.NullPointerException error. Here are my steps on a fresh
> > > install
> > > > of Solr 6.0.0.
> > > >
> > > > *Start Solr in cloud mode with example*
> > > > bin/solr -e cloud -noprompt
> > > >
> > > > *Index some data*
> > > > bin/post -c gettingstarted example/exampledocs/*.xml
> > > >
> > > > *Send query, which works*
> > > > curl --data-urlencode 'stmt=select id,name from gettingstarted where
> > > > inStock = true limit 2'
> http://localhost:8983/solr/gettingstarted/sql
> > > >
> > > > *Reload the collection*
> > > > curl '
> > > >
> > > >
> > >
> >
> http://localhost:8983/solr/admin/collections?action=RELOAD=gettingstarted
> > > > '
> > > >
> > > > After reloading, running the exact query above will return the null
> > > pointer
> > > > exception error. Any idea why?
> > > >
> > > > If I stop all Solr servers and restart, then it's fine.
> > > >
> > > > *java -version*
> > > > java version "1.8.0_25"
> > > > Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
> > > > Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
> > > >
> > > > Thanks,
> > > > Ryan
> > > >
> > >
> >
>


Re: bf calculation

2016-05-03 Thread Jan Verweij - Reeleez
Hi Georg,

So obvious I totally forgot about that option.
Thanks.

Jan.






On 2 May 2016 at 11:15, Georg Sorst  wrote:

> Hi Jan,
>
> have you tried Solr's debug output? ie. add
> "...&debug=true&debugQuery=true" to your query. This should
> answer your question.
>
> Best,
> Georg
>
> Jan Verweij - Reeleez  wrote on Mon., 2 May 2016 at
> 09:47:
>
> > Hi,
> > I'm trying to understand the exact calculation that takes place when
> using
> > edismax and the bf parameter.
> > When searching I get a product returned with a score of 0.625
> > Now, I have a field called productranking with a value of 0.5 for this
> > specific
> > product. If I add &bf=field(productranking) to the request the score
> > becomes 0.7954515
> > How is this calculated?
> > Cheers,
> > Jan Verweij
>
> --
> *Georg M. Sorst I CTO*
> FINDOLOGIC GmbH
>
>
>
> Jakob-Haringer-Str. 5a | 5020 Salzburg I T.: +43 662 456708
> E.: g.so...@findologic.com
> www.findologic.com Follow us on: XING | facebook | Twitter
>
> See us at the *Shopware Community Day in Ahaus on 20.05.2016!* Make an
> appointment here!
> See us at the *dmexco in Cologne on 14.09. and 15.09.2016!* Make an
> appointment here!
>


Re: Query String Limit

2016-05-03 Thread Mugeesh Husain
hi,

Please share your query?

I think you should increase maxBooleanClauses in the solrconfig.xml file,
as below:

<maxBooleanClauses>1024</maxBooleanClauses>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-String-Limit-tp4274161p4274236.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Include and exclude feature with multi valued fields

2016-05-03 Thread Anil
Any inputs please ?

On 2 May 2016 at 18:18, Anil  wrote:

> HI,
>
> i have created a document with multi valued fields.
>
> Eg :
> An issue is impacting multiple customers, products, versions etc.
>
> In my issue document, i have created customers, products, versions as
> multi valued fields.
>
> how to find all issues that are impacting google (customer) but not
> facebook (customer) ?
>
> Google and facebook can be part of a single issue document.
>
> Please let me know if you have any questions. Thanks.
>
> Regards,
> Anil
>
>
>
>
>
>
>
>
>


Re: Results of facet differs with change in facet.limit.

2016-05-03 Thread Modassar Ather
I tried to reproduce the same issue with a field of the following type but
could not.


Please share your inputs.

Best,
Modassar

On Tue, May 3, 2016 at 10:32 AM, Modassar Ather 
wrote:

> Hi,
>
> Kindly share your inputs on this issue.
>
> Thanks,
> Modassar
>
> On Mon, May 2, 2016 at 3:53 PM, Modassar Ather 
> wrote:
>
>> Hi,
>>
>> I have a field f which is defined as follows on Solr 5.x. It is a 12-shard
>> cluster with no replicas.
>>
>> <field name="f" ... stored="false" indexed="false" docValues="true"/>
>>
>> When I facet on this field with different facet.limit I get different
>> facet count.
>>
>> E.g.
>> Query : q=text_field:term&facet.field=f&facet.limit=100
>> Result :
>> 1225
>> 1082
>> 1076
>>
>> Query : q=text_field:term&facet.field=f&facet.limit=200
>> 1366
>> 1321
>> 1315
>>
>> I am noticing lesser document in facets whereas the numFound during
>> search is more. Please refer to following query for details.
>>
>> Query : q=text_field:term&facet.field=f
>> Result :
>> 1225
>> 1082
>> 1076
>>
>> Query : q=text_field:term AND f:val1
>> Result: numFound=1366
>>
>> Kindly help me understand this behavior or let me know if it is an issue.
>>
>> Thanks,
>> Modassar
>>
>
>


Re: Does EML files with inline images affect the indexing speed

2016-05-03 Thread Alexandre Rafalovitch
This is an extract handler, right?

If so, this is a question better suited for the Apache Tika list. That's
what is doing the parsing.

Regards,
Alex
On 3 May 2016 7:53 pm, "Zheng Lin Edwin Yeo"  wrote:

> Hi,
>
> I would like to find out, if the presence of inline images in EML files
> will slow down the indexing speed significantly?
>
> Even though the content of the EML files are in Plain Text instead of HTML.
> but I still found that the indexing performance is not up to expectation
> yet. Average speed which I'm getting are around 0.3GB/hr.
>
> I'm using Solr 5.4.0 on SolrCloud.
>
> Regards,
> Edwin
>


Does EML files with inline images affect the indexing speed

2016-05-03 Thread Zheng Lin Edwin Yeo
Hi,

I would like to find out, if the presence of inline images in EML files
will slow down the indexing speed significantly?

Even though the content of the EML files is in plain text instead of HTML,
I still found that the indexing performance is not up to expectations yet.
The average speed I'm getting is around 0.3GB/hr.

I'm using Solr 5.4.0 on SolrCloud.

Regards,
Edwin


Re: Query String Limit

2016-05-03 Thread Ahmet Arslan
Hi,

Post method should do the trick.
Can you give an example of your query string, may be there is a way to shorten 
it?

ahmet



On Tuesday, May 3, 2016 9:30 AM, Prasanna S. Dhakephalkar 
 wrote:
Hi,



I have a solr 5.3.1 standalone installation.



When a query is fired in the browser,

if the number of characters in the URL is 6966, then I get the result,

but if it is increased by 1 character, making the URL 6967, then Solr does
not return any result and the screen goes blank.

I am not sure if these numbers mean anything.



We were trying through a PHP application using the Solr library. The query
string was long. We tried using both GET and POST methods, but were not
getting results.

The query string was about 16000+ characters.

Hence we tried it in the browser; that is when we found the character limit
of the query.



Is it due to reaching a memory limit? Can I increase the limit? If so, how?



Any help in resolving the issue is much appreciated.



Regards,



Prasanna.
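Ahmet's suggestion to use POST can be sketched as follows: with POST the parameters travel in the request body, so the URL stays short no matter how long the query grows (the long clause list and collection name below are synthetic, for illustration only):

```python
import urllib.parse

# A synthetic long query of the kind that overflows a GET URL.
long_query = " OR ".join("id:%d" % i for i in range(2000))
params = {"q": long_query, "wt": "json", "rows": "10"}

# With POST, the encoded parameters go into the request body
# (Content-Type: application/x-www-form-urlencoded), not the URL.
body = urllib.parse.urlencode(params)
url = "http://localhost:8983/solr/mycore/select"  # the URL stays this short

assert len(body) > 6967  # well past the observed GET limit
assert len(url) < 100
```

GET limits of this size usually come from the servlet container's request header/line buffer rather than Solr itself, so moving the parameters into a POST body sidesteps the problem entirely.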


Re: Solr comparsion between two columns

2016-05-03 Thread Alexandre Rafalovitch
The documentation link I gave has a bunch of examples. What is the specific
difficulty?
On 3 May 2016 4:30 pm, "kavurupavan"  wrote:

> Hi Alex,
>
> Please provide an example of comparison in Solr.
>
> Regards,
> Pavan.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-comparsion-between-two-columns-tp4274155p4274158.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: OOM script executed

2016-05-03 Thread Bastien Latard - MDPI AG

Hi Tomás,

Thanks for your answer.
How could I see what's using memory?
I tried to add "-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/var/solr/logs/OOM_Heap_dump/"

...but this doesn't seem to be really helpful...

Kind regards,
Bastien

On 02/05/2016 22:55, Tomás Fernández Löbbe wrote:

You could, but before that I'd try to see what's using your memory and see
if you can decrease that. Maybe identify why you are running OOM now and
not with your previous Solr version (assuming you weren't, and that you are
running with the same JVM settings). A bigger heap usually means more work
to the GC and less memory available for the OS cache.

Tomás

On Sun, May 1, 2016 at 11:20 PM, Bastien Latard - MDPI AG <
lat...@mdpi.com.invalid> wrote:


Hi Guys,

The OOM killer script has been executed several times since I upgraded to Solr 6.0:

$ cat solr_oom_killer-8983-2016-04-29_15_16_51.log
Running OOM killer script for process 26044 for Solr on port 8983

Does it mean that I need to increase my JAVA Heap?
Or should I do anything else?

Here are some further logs:
$ cat solr_gc_log_20160502_0730:
}
{Heap before GC invocations=1674 (full 91):
  par new generation   total 1747648K, used 1747135K [0x0005c000,
0x00064000, 0x00064000)
   eden space 1398144K, 100% used [0x0005c000, 0x00061556,
0x00061556)
   from space 349504K,  99% used [0x00061556, 0x00062aa2fc30,
0x00062aab)
   to   space 349504K,   0% used [0x00062aab, 0x00062aab,
0x00064000)
  concurrent mark-sweep generation total 6291456K, used 6291455K
[0x00064000, 0x0007c000, 0x0007c000)
  Metaspace   used 39845K, capacity 40346K, committed 40704K, reserved
1085440K
   class spaceused 4142K, capacity 4273K, committed 4368K, reserved
1048576K
2016-04-29T21:15:41.970+0200: 20356.359: [Full GC (Allocation Failure)
2016-04-29T21:15:41.970+0200: 20356.359: [CMS:
6291455K->6291456K(6291456K), 12.5694653 secs]
8038591K->8038590K(8039104K), [Metaspace: 39845K->39845K(1085440K)],
12.5695497 secs] [Times: user=12.57 sys=0.00, real=12.57 secs]


Kind regards,
Bastien




Kind regards,
Bastien Latard
Web engineer
--
MDPI AG
Postfach, CH-4005 Basel, Switzerland
Office: Klybeckstrasse 64, CH-4057
Tel. +41 61 683 77 35
Fax: +41 61 302 89 18
E-mail:
lat...@mdpi.com
http://www.mdpi.com/
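The CMS line in the GC log above already tells the story; pulling the numbers apart (the JVM prints them as `before->after(capacity)`) shows the old generation is still completely full even after a Full GC, which is why the OOM killer script fired:

```python
import re

# The CMS old-generation figures from the log quoted above.
line = "[CMS: 6291455K->6291456K(6291456K), 12.5694653 secs]"

before, after, capacity = map(int, re.search(
    r"(\d+)K->(\d+)K\((\d+)K\)", line).groups())

# After a Full GC the old generation is still at capacity: nothing could be
# freed, so the next allocation fails and the OOM handler runs.
assert after == capacity
utilization = after / capacity
assert utilization == 1.0
```

When a Full GC reclaims essentially nothing, the options are a larger heap or, as Tomás suggests, reducing what is held on the heap in the first place.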



Re: Solr comparsion between two columns

2016-05-03 Thread kavurupavan
Hi Alex,

Please provide an example of comparison in Solr.

Regards,
Pavan.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-comparsion-between-two-columns-tp4274155p4274158.html
Sent from the Solr - User mailing list archive at Nabble.com.


Query String Limit

2016-05-03 Thread Prasanna S. Dhakephalkar
Hi,

 

I have a solr 5.3.1 standalone installation.

 

When a query is fired in the browser,

if the number of characters in the URL is 6966, then I get the result,

but if it is increased by 1 character, making the URL 6967, then Solr does
not return any result and the screen goes blank.

I am not sure if these numbers mean anything.

 

We were trying through a PHP application using the Solr library. The query
string was long. We tried using both GET and POST methods, but were not
getting results.

The query string was about 16000+ characters.

Hence we tried it in the browser; that is when we found the character limit
of the query.

 

Is it due to reaching a memory limit? Can I increase the limit? If so, how?

 

Any help in resolving the issue is much appreciated.

 

Regards,

 

Prasanna.