Disadvantages of having Zookeeper instance and Solr instance in the same server

2018-05-29 Thread solr2020
Hi,

What are the pros and cons of running a ZooKeeper instance and a Solr
instance on the same VM/server in a production environment?

Thanks.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Find value in Parent doc fields OR Child doc fields

2018-05-29 Thread Asher Shih
unsubscribe

On Tue, May 29, 2018 at 6:01 PM, kristaclaire14
 wrote:
> Hi,
>
> I want to query/find a value that may match on parent document fields or
> child document fields. Is this possible using block join parent query
> parser? How can I do this with solr nested documents? Here is the example
> data:
>
> [{
>   "id": "1001",
>   "path": "1.Project",
>   "Project_Title": "Sample Project",
>   "_childDocuments_": [{
>     "id": "2001",
>     "path": "2.Project.Submission",
>     "Submission_No": "1234-QWE",
>     "_childDocuments_": [{
>       "id": "3001",
>       "path": "3.Project.Submission.Agency",
>       "Agency_Cd": "QWE"
>     }]
>   }]
> }, {
>   "id": "1002",
>   "path": "1.Project",
>   "Project_Title": "Test Project QWE",
>   "_childDocuments_": [{
>     "id": "2002",
>     "path": "2.Project.Submission",
>     "Submission_No": "4567-AGY",
>     "_childDocuments_": [{
>       "id": "3002",
>       "path": "3.Project.Submission.Agency",
>       "Agency_Cd": "AGY"
>     }]
>   }, {
>     "id": "2003",
>     "path": "2.Project.Submission",
>     "Submission_No": "7891-QWE",
>     "_childDocuments_": [{
>       "id": "3003",
>       "path": "3.Project.Submission.Agency",
>       "Agency_Cd": "QWE"
>     }]
>   }]
> }]
>
> I want to retrieve all Projects with Project_Title:*QWE* OR
> Submission_Submission_No:*QWE*. Thanks in advance.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Find value in Parent doc fields OR Child doc fields

2018-05-29 Thread kristaclaire14
Hi,

I want to query/find a value that may match on parent document fields or
child document fields. Is this possible using block join parent query
parser? How can I do this with solr nested documents? Here is the example
data:

[{
  "id": "1001",
  "path": "1.Project",
  "Project_Title": "Sample Project",
  "_childDocuments_": [{
    "id": "2001",
    "path": "2.Project.Submission",
    "Submission_No": "1234-QWE",
    "_childDocuments_": [{
      "id": "3001",
      "path": "3.Project.Submission.Agency",
      "Agency_Cd": "QWE"
    }]
  }]
}, {
  "id": "1002",
  "path": "1.Project",
  "Project_Title": "Test Project QWE",
  "_childDocuments_": [{
    "id": "2002",
    "path": "2.Project.Submission",
    "Submission_No": "4567-AGY",
    "_childDocuments_": [{
      "id": "3002",
      "path": "3.Project.Submission.Agency",
      "Agency_Cd": "AGY"
    }]
  }, {
    "id": "2003",
    "path": "2.Project.Submission",
    "Submission_No": "7891-QWE",
    "_childDocuments_": [{
      "id": "3003",
      "path": "3.Project.Submission.Agency",
      "Agency_Cd": "QWE"
    }]
  }]
}]

I want to retrieve all Projects with Project_Title:*QWE* OR
Submission_Submission_No:*QWE*. Thanks in advance.
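One commonly suggested approach (a sketch, not tested against this exact schema) is to combine a plain query on the parent fields with a {!parent} block join over the child fields, assuming parent documents can be identified by path:1.Project as in the example data:

```python
from urllib.parse import urlencode

# Sketch of a "match parent fields OR child fields" query. The {!parent}
# block join promotes child-level matches (Submission_No) up to their
# parent Project documents; plain Project_Title matches are OR-ed in.
# Assumes parents are identifiable by path:1.Project, as in the example.
child_clause = '{!parent which="path:1.Project" v=$childq}'
params = {
    "q": f"Project_Title:*QWE* OR {child_clause}",
    "childq": "Submission_No:*QWE*",
    "fl": "id,Project_Title",
}
query_string = urlencode(params)
print(query_string)
```

Note that leading-wildcard queries like *QWE* can be slow; an ngram-based field type is the usual workaround.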



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Weird behavioural differences between pf in dismax and edismax

2018-05-29 Thread Sambhav Kothari
Wouldn't all of this depend entirely on the tokenizers used? I was talking
about phrases in a multi-token sense.

Regardless, I still think dismax and edismax should behave consistently for
the parameters they have in common (either extend the edismax logic to
dismax, or vice versa).

Regards,
Sam

On Tue, May 29, 2018, 23:16 Elizabeth Haubert <
ehaub...@opensourceconnections.com> wrote:

> That would make sense.
> Multi-term synonyms get into a weird case too.  Should the single-term
> words that have multi-term synonyms expand out? Or should the multi-term
> synonyms that have single-term synonyms contract down and count as only a
> single clause for pf2 or pf3?
>
>
>
> On Tue, May 29, 2018 at 1:37 PM, Alessandro Benedetti <
> a.benede...@sease.io>
> wrote:
>
> > I don't have any hard position on this, It's ok to not build a phrase
> boost
> > if the input query is 1 term and it remains one term after the analysis
> for
> > one of the pf fields.
> >
> > But if the term produces multiple tokens after query time analysis, I do
> > believe that building a phrase boost should be the correct
> interpretation (
> > e.g. wi-fi with a query-time analyser which splits on - ).
> >
> > Cheers
> >
> >
> >
> >
> >
> >
> >
> > -
> > ---
> > Alessandro Benedetti
> > Search Consultant, R&D Software Engineer, Director
> > Sease Ltd. - www.sease.io
> > --
> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >
>


Re: sending empty request parameters to solr

2018-05-29 Thread Shawn Heisey

On 5/29/2018 5:10 AM, Riyaz wrote:

> We had come across a requirement to allow empty parameter values for the
> query string (q), start, and rows as part of a Solr search query.
>
> In Solr 3.4, we set defType to edismax and it allows empty params:
>
> http:///solr//select?=xml=true=="
> --> working fine in Solr 3.4


I would say that the behavior in 3.x is the problem here. Version 3.4.0 
is nearly seven years old.


An empty value for the "q" parameter is a situation that Solr is 
explicitly looking for.  When q is missing or empty, the value in q.alt 
is used instead, with the standard Lucene parser.


Behavior is undefined for an empty value on parameters like start and rows.


> While applying the same changes in Solr 5.3.1, an empty rows/start
> parameter causes "java.lang.NumberFormatException: For input string: """.


The newer version is validating that what you're sending is an actual 
valid number, and throwing an error early when it's not.  The latest 
version I have downloaded right now is 7.3.0, and it behaves the same as 
5.3.1.  The latest 3.x (3.6.2) accepts the empty parameters, and the 
latest 4.x (4.10.4) does not accept them.  As you have seen, 5.x doesn't 
accept them.  All three of these major versions are out of date and no 
longer receive updates.  I did not try any 6.x versions for this 
problem, but they probably behave the same as 5.x and 7.x.



> Can you please let us know, is there any way to do this?


If you want a query to use defaults for these parameters, don't include 
them on the query URL at all.  Solr will use the defaults that you have 
defined on the handler in solrconfig.xml.
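For reference, handler-level defaults are defined roughly like this in solrconfig.xml (the handler name and values below are illustrative, not taken from the thread):

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Used when the request omits the parameter entirely -->
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>
```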


I don't consider the behavior you're seeing to be a bug.  I think that 
it's correct to throw an error in this situation, and that 3.x behaves 
incorrectly by NOT throwing an error.


Thanks,
Shawn



Re: Field list vs getting everything

2018-05-29 Thread root23
Yes, I meant fl.
So essentially, asking in the fl list for 10 fields vs. all the fields has
no effect on the amount of work Solr has to do?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Solr Cloud 7.3.1 backups

2018-05-29 Thread Greg Roodt
Hi

What is the best way to perform a backup of a Solr Cloud cluster? Is there
a way to backup only the leader? From my tests with the collections admin
BACKUP command, all nodes in the cluster need to have access to a shared
filesystem. Surely that isn't necessary if you are backing up the leader or
TLOG replica?
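For reference, a Collections API BACKUP request looks roughly like this (the names and path are placeholders); the location must be reachable by every node hosting the collection, which is exactly the shared-filesystem requirement in question:

```python
from urllib.parse import urlencode

# Sketch of a Collections API BACKUP call. "mycollection" and the backup
# location are placeholders; the location must be a path (or repository)
# that every node hosting the collection can reach.
params = {
    "action": "BACKUP",
    "name": "nightly-backup",
    "collection": "mycollection",
    "location": "/mnt/solr-backups",
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```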

Kind Regards
Greg


Re: Index protected zip

2018-05-29 Thread Cassandra Targett
Someone needs to update the Ref Guide. That can be a patch submitted on a
JIRA issue, or a committer could forego a patch and make changes directly
with commits.

Otherwise, this wiki page is making a bad situation even worse.

On Tue, May 29, 2018 at 12:06 PM Tim Allison  wrote:

> I’m happy to contribute to this message in any way I can.  Let me know how
> I can help.
>
> On Tue, May 29, 2018 at 2:31 PM Cassandra Targett 
> wrote:
>
> > It's not as simple as a banner. Information was added to the wiki that
> does
> > not exist in the Ref Guide.
> >
> > Before you say "go look at the Ref Guide" you need to make sure it says
> > what you want it to say, and the creation of this page just 3 days ago
> > indicates to me that the Ref Guide is missing something.
> >
> > On Tue, May 29, 2018 at 1:04 PM Erick Erickson 
> > wrote:
> >
> > > On further reflection ,+1 to marking the Wiki page superseded by the
> > > reference guide. I'd be fine with putting a banner at the top of all
> > > the Wiki pages saying "check the Solr reference guide first" ;)
> > >
> > > On Tue, May 29, 2018 at 10:59 AM, Cassandra Targett
> > >  wrote:
> > > > Couldn't the same information on that page be put into the Solr Ref
> > > Guide?
> > > >
> > > > I mean, if that's what we recommend, it should be documented
> officially
> > > > that it's what we recommend.
> > > >
> > > > I mean, is anyone surprised people keep stumbling over this? Shawn's
> > wiki
> > > > page doesn't point to the Ref Guide (instead pointing at other wiki
> > pages
> > > > that are out of date) and the Ref Guide doesn't point to that page.
> So
> > > half
> > > > the info is in our "official" place but the real story is in another
> > > place,
> > > > one we alternately tell people to sometimes ignore but sometimes keep
> > up
> > > to
> > > > date? Even I'm confused.
> > > >
> > > > On Sat, May 26, 2018 at 6:41 PM Erick Erickson <
> > erickerick...@gmail.com>
> > > > wrote:
> > > >
> > > >> Thanks! now I can just record the URL and then paste it in ;)
> > > >>
> > > >> Who knows, maybe people will see it first too!
> > > >>
> > > >> On Sat, May 26, 2018 at 9:48 AM, Tim Allison 
> > > wrote:
> > > >> > W00t! Thank you, Shawn!
> > > >> >
> > > >> > The "don't use ERH in production" response comes up frequently
> > enough
> > > >> >> that I have created a wiki page we can use for responses:
> > > >> >>
> > > >> >> https://wiki.apache.org/solr/RecommendCustomIndexingWithTika
> > > >> >>
> > > >> >> Tim, you are extremely well-qualified to expand and correct this
> > > page.
> > > >> >> Erick may be interested in making adjustments also. The flow of
> the
> > > page
> > > >> >> feels a little bit awkward to me, but I'm not sure how to improve
> > it.
> > > >> >>
> > > >> >> If the page name is substandard, feel free to rename.  I've
> already
> > > >> >> renamed it once!  I searched for an existing page like this
> before
> > I
> > > >> >> started creating it.  I did put a link to the new page on the
> > > >> >> ExtractingRequestHandler page.
> > > >> >>
> > > >> >> Thanks,
> > > >> >> Shawn
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
>


Re: Field list vs getting everything

2018-05-29 Thread Alexandre Rafalovitch
'df' is Default Field parameter and does not affect the fields
returned. You probably meant 'fl'.

Just not listing fields in 'fl' will not have much effect, apart from
serialization time and network time, which may help your real users if
your middleware just passes the results to the browser.

However, there is also a setting for lazy loading of fields, which
is supposed to help:
https://lucene.apache.org/solr/guide/7_3/query-settings-in-solrconfig.html

The reason it is not on by default - I believe - is that there are
issues with caching partial documents in memory. I am not 100% sure on
that though.
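The setting referred to is enableLazyFieldLoading, in the <query> section of solrconfig.xml:

```xml
<query>
  <!-- With lazy loading on, stored fields not named in fl are only
       read from the index if something actually accesses them. -->
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
</query>
```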

Regards,
   Alex.

On 29 May 2018 at 15:00, root23  wrote:
> Hi all,
> We have in our Solr schema around 110 fields and around 200 dynamic
> fields.
> Until now, whenever we query Solr we just blindly get everything, even
> if our calling application doesn't need all the fields to send back to the
> client.
>
> So I thought that maybe if I ask for only a subset of all the fields from
> Solr using the df attribute, each query should be faster. So instead of
> getting everything I am now asking for around 65 fields and 100 dynamic
> fields.
> In theory I am asking for around 50% less from Solr.
>
> I was hoping that this would be faster, but it seems it didn't make much
> difference; the response time is pretty much the same.
>
> Does anyone know if my theory is correct? I was thinking that if I ask for
> less data from Solr, it's less work for Solr and also less data to
> transfer on the wire.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Field list vs getting everything

2018-05-29 Thread root23
Hi all,
We have in our Solr schema around 110 fields and around 200 dynamic
fields.
Until now, whenever we query Solr we just blindly get everything, even
if our calling application doesn't need all the fields to send back to the
client.

So I thought that maybe if I ask for only a subset of all the fields from
Solr using the df attribute, each query should be faster. So instead of
getting everything I am now asking for around 65 fields and 100 dynamic
fields.
In theory I am asking for around 50% less from Solr.

I was hoping that this would be faster, but it seems it didn't make much
difference; the response time is pretty much the same.

Does anyone know if my theory is correct? I was thinking that if I ask for
less data from Solr, it's less work for Solr and also less data to
transfer on the wire.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Index protected zip

2018-05-29 Thread Tim Allison
I’m happy to contribute to this message in any way I can.  Let me know how
I can help.

On Tue, May 29, 2018 at 2:31 PM Cassandra Targett 
wrote:

> It's not as simple as a banner. Information was added to the wiki that does
> not exist in the Ref Guide.
>
> Before you say "go look at the Ref Guide" you need to make sure it says
> what you want it to say, and the creation of this page just 3 days ago
> indicates to me that the Ref Guide is missing something.
>
> On Tue, May 29, 2018 at 1:04 PM Erick Erickson 
> wrote:
>
> > On further reflection ,+1 to marking the Wiki page superseded by the
> > reference guide. I'd be fine with putting a banner at the top of all
> > the Wiki pages saying "check the Solr reference guide first" ;)
> >
> > On Tue, May 29, 2018 at 10:59 AM, Cassandra Targett
> >  wrote:
> > > Couldn't the same information on that page be put into the Solr Ref
> > Guide?
> > >
> > > I mean, if that's what we recommend, it should be documented officially
> > > that it's what we recommend.
> > >
> > > I mean, is anyone surprised people keep stumbling over this? Shawn's
> wiki
> > > page doesn't point to the Ref Guide (instead pointing at other wiki
> pages
> > > that are out of date) and the Ref Guide doesn't point to that page. So
> > half
> > > the info is in our "official" place but the real story is in another
> > place,
> > > one we alternately tell people to sometimes ignore but sometimes keep
> up
> > to
> > > date? Even I'm confused.
> > >
> > > On Sat, May 26, 2018 at 6:41 PM Erick Erickson <
> erickerick...@gmail.com>
> > > wrote:
> > >
> > >> Thanks! now I can just record the URL and then paste it in ;)
> > >>
> > >> Who knows, maybe people will see it first too!
> > >>
> > >> On Sat, May 26, 2018 at 9:48 AM, Tim Allison 
> > wrote:
> > >> > W00t! Thank you, Shawn!
> > >> >
> > >> > The "don't use ERH in production" response comes up frequently
> enough
> > >> >> that I have created a wiki page we can use for responses:
> > >> >>
> > >> >> https://wiki.apache.org/solr/RecommendCustomIndexingWithTika
> > >> >>
> > >> >> Tim, you are extremely well-qualified to expand and correct this
> > page.
> > >> >> Erick may be interested in making adjustments also. The flow of the
> > page
> > >> >> feels a little bit awkward to me, but I'm not sure how to improve
> it.
> > >> >>
> > >> >> If the page name is substandard, feel free to rename.  I've already
> > >> >> renamed it once!  I searched for an existing page like this before
> I
> > >> >> started creating it.  I did put a link to the new page on the
> > >> >> ExtractingRequestHandler page.
> > >> >>
> > >> >> Thanks,
> > >> >> Shawn
> > >> >>
> > >> >>
> > >>
> >
>


Re: CURL command problem on Solr

2018-05-29 Thread Christopher Schultz

Roee,

On 5/29/18 11:02 AM, Roee Tarab wrote:
> I am having some troubles with pushing a features file to solr
> while building an LTR model. I'm trying to upload a JSON file on
> windows cmd executable from an already installed CURL folder, with
> the command:
> 
> curl -XPUT
> 'http://localhost:8983/solr/techproducts/schema/feature-store' 
> --data-binary "@/path/myFeatures.json" -H
> 'Content-type:application/json'.
> 
> I am receiving the following error message:
> 
> { "responseHeader":{ "status":500, "QTime":7}, "error":{ "msg":"Bad
> Request", "trace":"Bad Request (400) - Invalid content type 
> application/x-www-form-urlencoded; only application/json is 
> supported.\r\n\tat
> org.apache.solr.rest.RestManager$ManagedEndpoint. 
> parseJsonFromRequestBody(RestManager.java:407)\r\n\tat
> org.apache.solr.rest. 
> RestManager$ManagedEndpoint.put(RestManager.java:340) 
> 
> This is definitely a technical issue, and I have not been able to
> overcome it for 2 days.
> 
> Is there another option of uploading the file to our core? Is
> there something we are missing in our command?

What happens if you put the URL as the very last command-line option,
instead of the second one?

-chris


Re: Index protected zip

2018-05-29 Thread Cassandra Targett
It's not as simple as a banner. Information was added to the wiki that does
not exist in the Ref Guide.

Before you say "go look at the Ref Guide" you need to make sure it says
what you want it to say, and the creation of this page just 3 days ago
indicates to me that the Ref Guide is missing something.

On Tue, May 29, 2018 at 1:04 PM Erick Erickson 
wrote:

> On further reflection ,+1 to marking the Wiki page superseded by the
> reference guide. I'd be fine with putting a banner at the top of all
> the Wiki pages saying "check the Solr reference guide first" ;)
>
> On Tue, May 29, 2018 at 10:59 AM, Cassandra Targett
>  wrote:
> > Couldn't the same information on that page be put into the Solr Ref
> Guide?
> >
> > I mean, if that's what we recommend, it should be documented officially
> > that it's what we recommend.
> >
> > I mean, is anyone surprised people keep stumbling over this? Shawn's wiki
> > page doesn't point to the Ref Guide (instead pointing at other wiki pages
> > that are out of date) and the Ref Guide doesn't point to that page. So
> half
> > the info is in our "official" place but the real story is in another
> place,
> > one we alternately tell people to sometimes ignore but sometimes keep up
> to
> > date? Even I'm confused.
> >
> > On Sat, May 26, 2018 at 6:41 PM Erick Erickson 
> > wrote:
> >
> >> Thanks! now I can just record the URL and then paste it in ;)
> >>
> >> Who knows, maybe people will see it first too!
> >>
> >> On Sat, May 26, 2018 at 9:48 AM, Tim Allison 
> wrote:
> >> > W00t! Thank you, Shawn!
> >> >
> >> > The "don't use ERH in production" response comes up frequently enough
> >> >> that I have created a wiki page we can use for responses:
> >> >>
> >> >> https://wiki.apache.org/solr/RecommendCustomIndexingWithTika
> >> >>
> >> >> Tim, you are extremely well-qualified to expand and correct this
> page.
> >> >> Erick may be interested in making adjustments also. The flow of the
> page
> >> >> feels a little bit awkward to me, but I'm not sure how to improve it.
> >> >>
> >> >> If the page name is substandard, feel free to rename.  I've already
> >> >> renamed it once!  I searched for an existing page like this before I
> >> >> started creating it.  I did put a link to the new page on the
> >> >> ExtractingRequestHandler page.
> >> >>
> >> >> Thanks,
> >> >> Shawn
> >> >>
> >> >>
> >>
>


Re: Index protected zip

2018-05-29 Thread Erick Erickson
On further reflection ,+1 to marking the Wiki page superseded by the
reference guide. I'd be fine with putting a banner at the top of all
the Wiki pages saying "check the Solr reference guide first" ;)

On Tue, May 29, 2018 at 10:59 AM, Cassandra Targett
 wrote:
> Couldn't the same information on that page be put into the Solr Ref Guide?
>
> I mean, if that's what we recommend, it should be documented officially
> that it's what we recommend.
>
> I mean, is anyone surprised people keep stumbling over this? Shawn's wiki
> page doesn't point to the Ref Guide (instead pointing at other wiki pages
> that are out of date) and the Ref Guide doesn't point to that page. So half
> the info is in our "official" place but the real story is in another place,
> one we alternately tell people to sometimes ignore but sometimes keep up to
> date? Even I'm confused.
>
> On Sat, May 26, 2018 at 6:41 PM Erick Erickson 
> wrote:
>
>> Thanks! now I can just record the URL and then paste it in ;)
>>
>> Who knows, maybe people will see it first too!
>>
>> On Sat, May 26, 2018 at 9:48 AM, Tim Allison  wrote:
>> > W00t! Thank you, Shawn!
>> >
>> > The "don't use ERH in production" response comes up frequently enough
>> >> that I have created a wiki page we can use for responses:
>> >>
>> >> https://wiki.apache.org/solr/RecommendCustomIndexingWithTika
>> >>
>> >> Tim, you are extremely well-qualified to expand and correct this page.
>> >> Erick may be interested in making adjustments also. The flow of the page
>> >> feels a little bit awkward to me, but I'm not sure how to improve it.
>> >>
>> >> If the page name is substandard, feel free to rename.  I've already
>> >> renamed it once!  I searched for an existing page like this before I
>> >> started creating it.  I did put a link to the new page on the
>> >> ExtractingRequestHandler page.
>> >>
>> >> Thanks,
>> >> Shawn
>> >>
>> >>
>>


Re: Index protected zip

2018-05-29 Thread Cassandra Targett
Couldn't the same information on that page be put into the Solr Ref Guide?

I mean, if that's what we recommend, it should be documented officially
that it's what we recommend.

I mean, is anyone surprised people keep stumbling over this? Shawn's wiki
page doesn't point to the Ref Guide (instead pointing at other wiki pages
that are out of date) and the Ref Guide doesn't point to that page. So half
the info is in our "official" place but the real story is in another place,
one we alternately tell people to sometimes ignore but sometimes keep up to
date? Even I'm confused.

On Sat, May 26, 2018 at 6:41 PM Erick Erickson 
wrote:

> Thanks! now I can just record the URL and then paste it in ;)
>
> Who knows, maybe people will see it first too!
>
> On Sat, May 26, 2018 at 9:48 AM, Tim Allison  wrote:
> > W00t! Thank you, Shawn!
> >
> > The "don't use ERH in production" response comes up frequently enough
> >> that I have created a wiki page we can use for responses:
> >>
> >> https://wiki.apache.org/solr/RecommendCustomIndexingWithTika
> >>
> >> Tim, you are extremely well-qualified to expand and correct this page.
> >> Erick may be interested in making adjustments also. The flow of the page
> >> feels a little bit awkward to me, but I'm not sure how to improve it.
> >>
> >> If the page name is substandard, feel free to rename.  I've already
> >> renamed it once!  I searched for an existing page like this before I
> >> started creating it.  I did put a link to the new page on the
> >> ExtractingRequestHandler page.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
>


Re: Weird behavioural differences between pf in dismax and edismax

2018-05-29 Thread Elizabeth Haubert
That would make sense.
Multi-term synonyms get into a weird case too.  Should the single-term
words that have multi-term synonyms expand out? Or should the multi-term
synonyms that have single-term synonyms contract down and count as only a
single clause for pf2 or pf3?



On Tue, May 29, 2018 at 1:37 PM, Alessandro Benedetti 
wrote:

> I don't have any hard position on this, It's ok to not build a phrase boost
> if the input query is 1 term and it remains one term after the analysis for
> one of the pf fields.
>
> But if the term produces multiple tokens after query time analysis, I do
> believe that building a phrase boost should be the correct interpretation (
> e.g. wi-fi with a query-time analyser which splits on - ).
>
> Cheers
>
>
>
>
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Weird behavioural differences between pf in dismax and edismax

2018-05-29 Thread Alessandro Benedetti
I don't have any hard position on this; it's OK not to build a phrase boost
if the input query is one term and it remains one term after analysis for
one of the pf fields.

But if the term produces multiple tokens after query-time analysis, I do
believe that building a phrase boost would be the correct interpretation
(e.g. wi-fi with a query-time analyser which splits on - ).

Cheers

-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Throttling on replication after ADDREPLICA

2018-05-29 Thread Leonard, Carl
My setup is a 3-shard index with 20GB per shard.  I am migrating from
master/slave replication to TLOG replicas using Solr 7.1.

In my use case I have to recreate my index from scratch each day.  I have a
heterogeneous setup with a high-powered indexer and less powerful searchers
that answer queries.

My method is to create a new collection with a single TLOG instance and then
build the data.  When complete, I use ADDREPLICA to create the replicas on the
searcher instances.

My issue is that the replication after the ADDREPLICA seems to be limited to 
about 100Mbits per second.  I am on a 1Gbit network and I am able to rsync or 
scp at full speed.  This causes excess delays and it is a real problem.  I've 
looked for what is throttling the replication, but I have not found anything 
that makes any difference.  I've tried


  
1000
  


That did not change anything.
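For anyone landing here later: the throttle documented for the replication handler is maxWriteMBPerSec, configured roughly as below (reconstructed as a sketch, since the config above lost its tags in archiving; whether it applies to the SolrCloud recovery copy triggered by ADDREPLICA in 7.1 is exactly the open question in this thread):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="defaults">
    <!-- Cap replication transfer rate, in megabytes per second -->
    <str name="maxWriteMBPerSec">1000</str>
  </lst>
</requestHandler>
```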

Is there a config option that I have not found to control this?  I've looked in 
the source code, but have not found anything there either.

Thanks in advance if anyone could help point me in the correct direction in 
either config options or where this might be in the source.


solr-extracting features values

2018-05-29 Thread Roee Tarab
Hi,
I have a 8 Question core, and a feature file as well. I'm trying
to extract feature values for each Q couple in order to use them for
training an algorithm (in order to build a model).
Can you help me extract those feature values?
Thanks!
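Assuming the features are already uploaded to a feature store, the usual way to extract per-document values is the [features] document transformer in fl; the store name and efi.* parameters below are placeholders for your setup:

```python
from urllib.parse import urlencode

# Sketch: request LTR feature values for matching documents via the
# [features] transformer. "myFeatureStore" and the efi.* (external
# feature information) parameters are placeholders for your setup.
params = {
    "q": "*:*",
    "fl": "id,score,[features store=myFeatureStore efi.user_query=hello]",
    "rows": 10,
}
url = "http://localhost:8983/solr/techproducts/query?" + urlencode(params)
print(url)
```

Each returned document then carries a "[features]" field with name=value pairs that can be written out as training data.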


Re: CURL command problem on Solr

2018-05-29 Thread chris
HTTP header names are case-insensitive.





-------- Original message --------
From: simon
Date: 5/29/18 12:17 PM (GMT-05:00)
To: solr-user
Subject: Re: CURL command problem on Solr
Could it be that the header should be 'Content-Type' (which is what I see
in the relevant RFC) rather than 'Content-type' as shown in your email ? I
don't know if headers are case-sensitive, but it's worth checking.

-Simon

On Tue, May 29, 2018 at 11:02 AM, Roee Tarab  wrote:

> Hi ,
>
> I am having some troubles with pushing a features file to solr while
> building an LTR model. I'm trying to upload a JSON file on windows cmd
> executable from an already installed CURL folder, with the command:
>
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/feature-store'
> --data-binary "@/path/myFeatures.json" -H 'Content-type:application/json'.
>
> I am receiving the following error message:
>
> {
>   "responseHeader":{
> "status":500,
> "QTime":7},
>   "error":{
> "msg":"Bad Request",
> "trace":"Bad Request (400) - Invalid content type
> application/x-www-form-urlencoded; only application/json is
> supported.\r\n\tat org.apache.solr.rest.RestManager$ManagedEndpoint.
> parseJsonFromRequestBody(RestManager.java:407)\r\n\tat
> org.apache.solr.rest.
> RestManager$ManagedEndpoint.put(RestManager.java:340) 
>
> This is definitely a technical issue, and I have not been able to overcome
> it for 2 days.
>
> Is there another option of uploading the file to our core? Is there
> something we are missing in our command?
>
> Thank you in advance for any help,
>


Re: CURL command problem on Solr

2018-05-29 Thread simon
Could it be that the header should be 'Content-Type' (which is what I see
in the relevant RFC) rather than 'Content-type' as shown in your email ? I
don't know if headers are case-sensitive, but it's worth checking.

-Simon

On Tue, May 29, 2018 at 11:02 AM, Roee Tarab  wrote:

> Hi ,
>
> I am having some troubles with pushing a features file to solr while
> building an LTR model. I'm trying to upload a JSON file on windows cmd
> executable from an already installed CURL folder, with the command:
>
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/feature-store'
> --data-binary "@/path/myFeatures.json" -H 'Content-type:application/json'.
>
> I am receiving the following error message:
>
> {
>   "responseHeader":{
> "status":500,
> "QTime":7},
>   "error":{
> "msg":"Bad Request",
> "trace":"Bad Request (400) - Invalid content type
> application/x-www-form-urlencoded; only application/json is
> supported.\r\n\tat org.apache.solr.rest.RestManager$ManagedEndpoint.
> parseJsonFromRequestBody(RestManager.java:407)\r\n\tat
> org.apache.solr.rest.
> RestManager$ManagedEndpoint.put(RestManager.java:340) 
>
> This is definitely a technical issue, and I have not been able to overcome
> it for 2 days.
>
> Is there another option of uploading the file to our core? Is there
> something we are missing in our command?
>
> Thank you in advance for any help,
>


Re: Removed nodes still visible as gone in Solrcloud graph

2018-05-29 Thread Dominique Bejean
Hi,

I am replying to myself.

The solution is to edit the state.json file for all impacted collections.


   - Stop all Solr nodes


   - Download state.json file from ZK for collection "xx"

# server/scripts/cloud-scripts/zkcli.sh -z "xxx.xxx.xxx.xxx:2181" -cmd
getfile /collections/xx/state.json /tmp/-state-local.json


   - Edit the downloaded state.json and save it


   - Remove collection state.json from ZK

# server/scripts/cloud-scripts/zkcli.sh -z "xxx.xxx.xxx.xxx:2181" -cmd
clear /collections/xx/state.json


   - Upload modified state.json to ZK

# server/scripts/cloud-scripts/zkcli.sh -z "xxx.xxx.xxx.xxx:2181" -cmd
putfile /collections/xx/state.json /tmp/-state-local.json


   - Start all Solr nodes


Dominique


On Tue, May 29, 2018 at 14:19, Dominique Bejean  wrote:

> Hi,
>
> On a node, I accidentally changed the SOLR_HOST value from uppercase to
> lowercase and I restarted the node. After I fixed the error, I restarted
> again the node but the node name in lowercase is still visible as "gone".
> How do I definitively remove a gone node from the SolrCloud graph?
>
> Regards.
>
> Dominique
>
>
> --
> Dominique Béjean
> 06 08 46 12 43
>
-- 
Dominique Béjean
06 08 46 12 43


CURL command problem on Solr

2018-05-29 Thread Roee Tarab
Hi ,

I am having some troubles with pushing a features file to solr while
building an LTR model. I'm trying to upload a JSON file on windows cmd
executable from an already installed CURL folder, with the command:

curl -XPUT 'http://localhost:8983/solr/techproducts/schema/feature-store'
--data-binary "@/path/myFeatures.json" -H 'Content-type:application/json'.

I am receiving the following error message:

{
  "responseHeader":{
"status":500,
"QTime":7},
  "error":{
"msg":"Bad Request",
"trace":"Bad Request (400) - Invalid content type
application/x-www-form-urlencoded; only application/json is
supported.\r\n\tat org.apache.solr.rest.RestManager$ManagedEndpoint.
parseJsonFromRequestBody(RestManager.java:407)\r\n\tat org.apache.solr.rest.
RestManager$ManagedEndpoint.put(RestManager.java:340) 

This is definitely a technical issue, and I have not been able to overcome
it for 2 days.

Is there another option of uploading the file to our core? Is there
something we are missing in our command?

Thank you in advance for any help,
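
One likely cause (an assumption, but a common one on Windows): cmd does not treat single quotes as quoting characters, so the -H 'Content-type:application/json' argument gets mangled and curl falls back to application/x-www-form-urlencoded, exactly the content type Solr rejects above. Try the same curl command with double quotes around the URL and the header argument. Alternatively, build the request in a script; the sketch below uses only Python's stdlib, and the file path is hypothetical:

```python
import json
import urllib.request

def build_feature_put(url: str, features_path: str) -> urllib.request.Request:
    """Build a PUT request that uploads the LTR feature file as JSON.

    The explicit Content-Type header is the important part: without it
    the body is sent as application/x-www-form-urlencoded, which the
    Solr REST manager rejects with the 400 seen above.
    """
    with open(features_path, "rb") as f:
        body = f.read()
    json.loads(body)  # fail fast locally if the file is not valid JSON
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# With a running Solr:
# req = build_feature_put(
#     "http://localhost:8983/solr/techproducts/schema/feature-store",
#     "myFeatures.json")
# print(urllib.request.urlopen(req).read())
```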


Looking for folks to test out Solr 5 and 6 bindings for upcoming YCSB 0.14.0 release

2018-05-29 Thread Sean Busbey
Hi Solr users!

The YCSB project is currently testing out release candidates for our 0.14.0 
release.

This release updates our Solr 5 and 6 support to allow use with kerberized Solr.

If anyone has a spare 30 or so minutes, we could use help testing out things so 
that the Solr clients can stay in our "tested in a release and supported" 
category. For details on what's involved in testing and where to get the 
release candidate, please see the following issue:

https://github.com/brianfrankcooper/YCSB/issues/1117

-
busbey


Removed nodes still visible as gone in Solrcloud graph

2018-05-29 Thread Dominique Bejean
Hi,

On a node, I accidentally changed the SOLR_HOST value from uppercase to
lowercase, and I restarted the node. After I fixed the error, I restarted
the node again, but the node name in lowercase is still visible as "gone".
How do I definitively remove a gone node from the SolrCloud graph?

Regards.

Dominique


-- 
Dominique Béjean
06 08 46 12 43


Re: Weird behavioural differences between pf in dismax and edismax

2018-05-29 Thread Elizabeth Haubert
I disagree that a phrase of 1-word is just a phrase.  That is the core
difference between the qf and pf clauses.  Qf is collecting the terms; pf
is boosting the combinations.

For queries where the original query phrase has only a single term in it,
then it might be a moot point, unless the clauses are being pointed at
different fields or different boosts.

But for multi-term queries, pf (and pf2 and pf3) can be important
differentiators between documents that just happen to have enough words
from the user's original query, and documents that get closer to the user's
meaning. It balances documents that merely have enough terms per mm against
documents that have enough terms together in one field.
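
A concrete illustration (field names and boosts are hypothetical): with the parameters below, mm lets looser matches through, while pf/pf2/pf3 reward documents where the query terms stay adjacent in one field:

```text
q=apache solr cloud
defType=edismax
qf=title^2 body
pf=title^10 body^5
pf2=title^4 body^2
pf3=title^6
mm=2
```

A document with all three words scattered across body satisfies mm, but a document containing the exact run "apache solr cloud" in title additionally collects the pf and pf3 boosts.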

Elizabeth Haubert






On Tue, May 29, 2018 at 5:14 AM, Alessandro Benedetti 
wrote:

> In my opinion, given the definition of dismax and edismax query parsers,
> they
> should behave the same for parameters in common.
> To be a little bit extreme, I don't think we need the dismax query parser at
> all anymore (in the end, edismax only offers more than dismax does).
>
> Finally, I do believe that even if the query is a single term (before or
> after the analysis for a pf field) it should still boost the phrase.
> A phrase of one word is still a phrase, isn't it?
>
>
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Impact/Performance of maxDistErr

2018-05-29 Thread David Smiley
Hello Jens,
With solr.RptWithGeometrySpatialField, you always get an accurate result
thanks to the "WithGeometry" part.  The "Rpt" part is a grid index, and
most of the parameters pertain to that.  maxDistErr controls the highest
resolution grid.  No shape will be indexed to higher resolutions than this,
though shapes may be indexed at coarser resolutions depending on distErrPct.  The
configuration you chose initially (that turned out to be slow for you) was
a meter, and then you changed it to a kilometer and got fast indexing
results.  I figure the size of your indexed shapes are on average a
kilometer in size (give or take an order of magnitude).  It's hard to guess
how your query shapes compare to your indexed shapes as there are multiple
possibilities that could yield similar query performance when changing
maxDistErr so much.

The bottom line is that you should dial up maxDistErr as much as you can
get away with it -- which is as long as query performance is good.  So you
did the right thing :-).  That number will probably be a distance somewhat
less than the average indexed shape diameter, or average query shape
diameter, whichever is greater.  Perhaps 1/10th smaller; if I had to pick.
The default setting, I think a meter, is probably not a good default for
this field type.

Note you could also try increasing distErrPct some, maybe to as much as
.25, though I wouldn't go much higher, as it may yield gridded shapes that
are so coarse as to have no interior cells.  Depending on what your query
shapes typically look like and indexed shapes relative to each other, that
may be significant or may not be.  If the indexed shapes are often much
larger than your query shape then it's significant.
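
To make the dial concrete, a tuned definition along the lines above might look like this (the field name is hypothetical, and the 0.1 km value is an assumption to illustrate the dial; pick it relative to your own shape sizes):

```xml
<!-- Coarser grid for roughly kilometer-sized shapes; accuracy is still
     preserved by the stored-geometry check of RptWithGeometrySpatialField. -->
<fieldType name="location_rpt_geom" class="solr.RptWithGeometrySpatialField"
   spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
   geo="true" distErrPct="0.15" maxDistErr="0.1"
   distanceUnits="kilometers" />
```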

~ David

On Fri, May 25, 2018 at 6:59 AM Jens Viebig  wrote:

> Hello,
>
> we are indexing a polygon with 4 points (non-rectangular, field-of-view of
> a camera) in a RptWithGeometrySpatialField alongside some more fields, to
> perform searches that check if a point is within this polygon
>
> We started using the default configuration found in several examples
> online:
>
> <fieldType name="..." class="solr.RptWithGeometrySpatialField"
> spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
>geo="true" distErrPct="0.15" maxDistErr="0.001"
> distanceUnits="kilometers" />
>
> We discovered that with this setting the indexing (soft commit) speed is
> very slow
> For 1 documents it takes several minutes to finish the commit
>
> If we disable this field, indexing+soft commit is only 3 seconds for 1
> docs,
> if we set maxDistErr to 1, indexing speed is at around 5 seconds, so a
> huge performance gain against the several minutes we had before
>
> I tried to find out via the documentation whats the impact of "maxDistErr"
> on search results but didn't quite find an in-depth explanation
> From our tests we did, the search results still seem to be very accurate
> even if the covered space of the polygon is less then 1km and search speed
> did not suffer.
>
> So i would love to learn more about the differences on having
> maxDistErr="0.001" vs maxDistErr="1" on a RptWithGeometrySpatialField and
> what problems could we run into with the bigger value
>
> Thanks
> Jens
>
>
>
>
> *Jens Viebig*
>
> Software Development
>
> MAM Products
>
>
> T. +49-(0)4307-8358-0
>
> E. jens.vie...@vitec.com
>
> *http://www.vitec.com *
>
>
>
>
>
>
> --
>
> VITEC GmbH, 24223 Schwentinental
>
> Geschäftsführer/Managing Director: Philippe Wetzel
> HRB Plön 1584 / Steuernummer: 1929705211 / VATnumber: DE134878603
>
>
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


sending empty request parameters to solr

2018-05-29 Thread Riyaz
Hi,
We have come across a requirement to allow empty parameter values for the query
string (q), start, and rows as part of a Solr search query.

In Solr 3.4, we added defType=edismax and it allows empty params:

http:///solr//select?wt=xml&indent=true&q=&rows=  --> working fine in Solr 3.4

Configs:

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="q.alt">*:*</str>
    <str name="defType">edismax</str>
  </lst>
</requestHandler>


While applying the same changes in Solr 5.3.1, an empty rows/start parameter
causes "java.lang.NumberFormatException: For input string: """.

Can you please let us know if there is any way to do this?
Thanks
Riyaz
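
Until a server-side fix, a client-side workaround (a sketch, not a Solr feature) is to drop empty parameters before sending the request, so the handler defaults configured above apply instead:

```python
def clean_params(params: dict) -> dict:
    """Drop parameters with empty values before querying Solr.

    Solr 5+ parses start/rows strictly, so rows= with no value raises
    NumberFormatException server-side; omitting the parameter lets the
    request handler's configured default (e.g. rows=10) apply, and an
    omitted q falls back to q.alt.
    """
    return {k: v for k, v in params.items() if v not in (None, "")}

# clean_params({"q": "", "start": "", "rows": "", "wt": "xml"})
# keeps only {"wt": "xml"}
```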



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


RE: Solr 7.3, FunctionScoreQuery no longer displays debug output

2018-05-29 Thread Sadat Anwer
I have had the same issue, only I was comparing it to Solr 6.6, and when I saw
the rather brief debug output I initially thought it was because the boost was now
a FunctionQuery! I am glad others have the same issue. It's almost impossible to
make any sense of the debug output now.

Thanks,

Sadat

On May 29, 2018, 11:18 AM +0200, Markus Jelsma wrote:
> Anyway, i've created a ticket for reference:
> https://issues.apache.org/jira/browse/SOLR-12414
>
> Thanks,
> Markus
>
> -Original message-
> > From:Markus Jelsma 
> > Sent: Friday 18th May 2018 0:24
> > To: solr-user@lucene.apache.org
> > Subject: RE: Solr 7.3, FunctionScoreQuery no longer displays debug output
> >
> > Thanks Yonik,
> >
> > That is the suspect issue i stumbled upon when reading through the 
> > CHANGES.txt. Can you, or someone, please verify this? I need to know this 
> > before i can file a bug.
> >
> > There is a definitive difference in 7.2 and 7.3's respective outputs, i 
> > triple checked the debug output. But on one hand i can't believe that issue 
> > was committed with this flaw. And although we have a lot of custom code, we 
> > have nothing that should interfere this much with the debug prints, or that 
> > should be obvious in the change log.
> >
> > Please verify and let me open a ticket, or we'll change the discussion into 
> > what has changed in Solr/Lucene so much, for us to get back on track.
> >
> > Many thanks,
> > Markus
> >
> >
> > -Original message-
> > > From:Yonik Seeley 
> > > Sent: Friday 18th May 2018 0:04
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Solr 7.3, FunctionScoreQuery no longer displays debug output
> > >
> > > If this used to work, I wonder if it's something to do with changes to 
> > > boost:
> > > https://issues.apache.org/jira/browse/LUCENE-8099
> > >
> > > -Yonik
> > >
> > >
> > > On Thu, May 17, 2018 at 5:48 PM, Markus Jelsma
> > >  wrote:
> > > > Hello,
> > > >
> > > > Sorry to disturb. Is there anyone here able to reproduce and verify 
> > > > this issue?
> > > >
> > > > Many thanks,
> > > > Markus
> > > >
> > > >
> > > >
> > > > -Original message-
> > > > > From:Markus Jelsma 
> > > > > Sent: Wednesday 9th May 2018 18:25
> > > > > To: solr-user
> > > > > Subject: Solr 7.3, FunctionScoreQuery no longer displays debug output
> > > > >
> > > > > Hi,
> > > > >
> > > > > Is this a known problem? For example, the following query:
> > > > > q=australia&debugQuery=true&boost=if(exists(query($bqlang)),2,1)&bqlang=lang:en&defType=edismax&qf=content_en
> > > > >  content_ro
> > > > >
> > > > > returns the following toString for 7.2.1:
> > > > > boost(+(Synonym(content_en:australia content_en:australia) | 
> > > > > Synonym(content_ro:austral 
> > > > > content_ro:australia)),if(exists(query(lang:en,def=0.0)),const(2),const(1)))
> > > > >
> > > > > 7.3:
> > > > > FunctionScoreQuery(+(Synonym(content_en:australia 
> > > > > content_en:australia) | Synonym(content_ro:austral 
> > > > > content_ro:australia)), scored by 
> > > > > boost(if(exists(query(lang:en,def=0.0)),const(2),const(1
> > > > >
> > > > > and the following debug output for 7.2.1:
> > > > >
> > > > > 11.226025 = boost((Synonym(content_en:australia content_en:australia) 
> > > > > | Synonym(content_ro:austral 
> > > > > content_ro:australia)),if(exists(query(lang:en,def=0.0)),const(2),const(1))),
> > > > >  product of:
> > > > > 11.226025 = max of:
> > > > > 11.226025 = weight(Synonym(content_ro:austral content_ro:australia) 
> > > > > in 6761) [SchemaSimilarity], result of:
> > > > > 11.226025 = score(doc=6761,freq=18.0 = termFreq=18.0
> > > > > ), product of:
> > > > > 5.442921 = idf(docFreq=193, docCount=44720)
> > > > > 2.0625 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1) from:
> > > > > 18.0 = termFreq=18.0
> > > > > 1.2 = parameter k1
> > > > > 0.0 = parameter b (norms omitted for field)
> > > > > 1.0 = if(exists(query(lang:en,def=0.0)=0.0),const(2),const(1))
> > > > >
> > > > > but for 7.3 i get only:
> > > > >
> > > > > 11.226025 = product of:
> > > > > 1.0 = boost
> > > > > 11.226025 = 
> > > > > boost(if(exists(query(lang:en,def=0.0)),const(2),const(1)))
> > > > >
> > > > > The scores are still the same, but the debug output is useless. 
> > > > > Removing the boost fixes the problem of debug output immediately.
> > > > >
> > > > > Thanks,
> > > > > Markus
> > > > >
> > > > >
> > >
> >


RE: Solr 7.3, FunctionScoreQuery no longer displays debug output

2018-05-29 Thread Markus Jelsma
Anyway, i've created a ticket for reference:
https://issues.apache.org/jira/browse/SOLR-12414

Thanks,
Markus

-Original message-
> From:Markus Jelsma 
> Sent: Friday 18th May 2018 0:24
> To: solr-user@lucene.apache.org
> Subject: RE: Solr 7.3, FunctionScoreQuery no longer displays debug output
> 
> Thanks Yonik,
> 
> That is the suspect issue i stumbled upon when reading through the 
> CHANGES.txt. Can you, or someone, please verify this? I need to know this 
> before i can file a bug.
> 
> There is a definitive difference in 7.2 and 7.3's respective outputs, i 
> triple checked the debug output. But on one hand i can't believe that issue 
> was committed with this flaw. And although we have a lot of custom code, we 
> have nothing that should interfere this much with the debug prints, or that 
> should be obvious in the change log.
> 
> Please verify and let me open a ticket, or we'll change the discussion into 
> what has changed in Solr/Lucene so much, for us to get back on track.
> 
> Many thanks,
> Markus
>  
>  
> -Original message-
> > From:Yonik Seeley 
> > Sent: Friday 18th May 2018 0:04
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr 7.3, FunctionScoreQuery no longer displays debug output
> > 
> > If this used to work, I wonder if it's something to do with changes to 
> > boost:
> > https://issues.apache.org/jira/browse/LUCENE-8099
> > 
> > -Yonik
> > 
> > 
> > On Thu, May 17, 2018 at 5:48 PM, Markus Jelsma
> >  wrote:
> > > Hello,
> > >
> > > Sorry to disturb. Is there anyone here able to reproduce and verify this 
> > > issue?
> > >
> > > Many thanks,
> > > Markus
> > >
> > >
> > >
> > > -Original message-
> > >> From:Markus Jelsma 
> > >> Sent: Wednesday 9th May 2018 18:25
> > >> To: solr-user 
> > >> Subject: Solr 7.3, FunctionScoreQuery no longer displays debug output
> > >>
> > >> Hi,
> > >>
> > >> Is this a known problem? For example, the following query:
> > >> q=australia&debugQuery=true&boost=if(exists(query($bqlang)),2,1)&bqlang=lang:en&defType=edismax&qf=content_en
> > >>  content_ro
> > >>
> > >> returns the following toString for 7.2.1:
> > >> boost(+(Synonym(content_en:australia content_en:australia) | 
> > >> Synonym(content_ro:austral 
> > >> content_ro:australia)),if(exists(query(lang:en,def=0.0)),const(2),const(1)))
> > >>
> > >> 7.3:
> > >> FunctionScoreQuery(+(Synonym(content_en:australia content_en:australia) 
> > >> | Synonym(content_ro:austral content_ro:australia)), scored by 
> > >> boost(if(exists(query(lang:en,def=0.0)),const(2),const(1
> > >>
> > >> and the following debug output for 7.2.1:
> > >>
> > >> 11.226025 = boost((Synonym(content_en:australia content_en:australia) | 
> > >> Synonym(content_ro:austral 
> > >> content_ro:australia)),if(exists(query(lang:en,def=0.0)),const(2),const(1))),
> > >>  product of:
> > >>   11.226025 = max of:
> > >> 11.226025 = weight(Synonym(content_ro:austral content_ro:australia) 
> > >> in 6761) [SchemaSimilarity], result of:
> > >>   11.226025 = score(doc=6761,freq=18.0 = termFreq=18.0
> > >> ), product of:
> > >> 5.442921 = idf(docFreq=193, docCount=44720)
> > >> 2.0625 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1) 
> > >> from:
> > >>   18.0 = termFreq=18.0
> > >>   1.2 = parameter k1
> > >>   0.0 = parameter b (norms omitted for field)
> > >>   1.0 = if(exists(query(lang:en,def=0.0)=0.0),const(2),const(1))
> > >>
> > >> but for 7.3 i get only:
> > >>
> > >> 11.226025 = product of:
> > >>   1.0 = boost
> > >>   11.226025 = boost(if(exists(query(lang:en,def=0.0)),const(2),const(1)))
> > >>
> > >> The scores are still the same, but the debug output is useless. Removing 
> > >> the boost fixes the problem of debug output immediately.
> > >>
> > >> Thanks,
> > >> Markus
> > >>
> > >>
> > 
> 


Re: Weird behavioural differences between pf in dismax and edismax

2018-05-29 Thread Alessandro Benedetti
In my opinion, given the definition of dismax and edismax query parsers, they
should behave the same for parameters in common.
To be a little bit extreme, I don't think we need the dismax query parser at
all anymore (in the end, edismax only offers more than dismax does).

Finally, I do believe that even if the query is a single term (before or
after the analysis for a pf field) it should still boost the phrase.
A phrase of one word is still a phrase, isn't it?





-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Understanding SOLR Joins

2018-05-29 Thread Nancy Goyal
Hi,


I have implemented basic Solr joins between two collections. Currently in
my project implementation, we are getting data from multiple tables,
storing it as a single document in a view, and indexing that view. We got a
suggestion to implement the same with joins, but we are not sure whether the
same functionality can be achieved with joins or block joins (nested documents):

*Data*

There are multiple tables: one primary table with all the basic details
about the product (the primary key is Product ID), and then 7-8 other tables
with other details of the product; each also has a Product ID column but can
have multiple entries for a single Product ID.

*Can you please let me know if the below are possible-*

   1. Can we get data from multiple collections in the search results? The
   results should contain only one record per Product ID.
   2. Can we search across multiple collections in a single query and then
   combine the results, so that the final search results have a single result
   for each Product ID?
   3. Can we perform a join on more than 2 collections, as we need to search
   across 6-7 collections and then merge the data based on Product ID?
   4. Can we query parent and child in a nested index at the same time?
   For example, search on column1 from the parent and column2 from the child,
   and get the parent records with nested children in the search results.
   5. If we can perform fielded search across multiple collections in a
   single query, will the filters from different collections be returned in a
   single search response?


The examples I found on the internet join only two collections
and search only a single collection.
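
For reference, the two relevant query shapes look like this (collection and field names are hypothetical). Note that the query-time join only filters: the returned documents, and therefore the returned fields, always come from the collection you query, not from fromIndex; in SolrCloud the fromIndex collection must also be single-shard and replicated alongside the collection being queried:

```text
# Query-time join across collections (products filtered by a match in details):
q={!join fromIndex=details from=product_id to=id}color:red

# Block join on nested documents (parents whose descendants match):
q={!parent which="path:1.Project"}Agency_Cd:QWE
```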

Thanks & Regards,
Nancy Goyal


RE: Configuring aliases in ZooKeeper first

2018-05-29 Thread Gael Jourdan-Weil
Thanks for your feedback. I opened following issue: 
https://issues.apache.org/jira/browse/SOLR-12413.

From: Shalin Shekhar Mangar 
Sent: Monday, 28 May 2018 17:58
To: solr-user@lucene.apache.org
Subject: Re: Configuring aliases in ZooKeeper first
  

Thanks for the report. This sounds like a bug. At least on startup, we
should refresh the configuration from ZK without looking at local config
versions. Can you please open a Jira issue?
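
The version check described in the quoted report below amounts to this (a simplified model of the behavior, not the actual ZkStateReader code):

```python
def accepts_zk_aliases(local_version: int, zk_version: int) -> bool:
    """Model of the setIfNewer behavior: the copy of aliases.json read
    from ZooKeeper is only accepted when its version is strictly greater
    than the locally known version, which starts at 0 on a fresh Solr
    start. A freshly recreated aliases.json also has ZK version 0, so it
    is ignored.
    """
    return zk_version > local_version

# accepts_zk_aliases(0, 0) is False, which is exactly the reported bug.
```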

On Wed, May 23, 2018 at 5:35 PM, Gael Jourdan-Weil <
gael.jourdan-w...@kelkoogroup.com> wrote:

> Hello everyone,
>
> We are running a SolrCloud cluster with ZooKeeper.
> This SolrCloud cluster is down most of the time (backup environment) but
> the ZooKeeper instances are always up so that we can easily update
> configuration.
>
> This has been working fine for a long time with Solr 6.4.0 then 6.6.0, but
> since upgrading to 7.2.1, we ran into an issue where Solr ignores
> aliases.json stored in ZooKeeper.
>
> Steps to reproduce the problem:
> 1/ SolrCloud cluster is down
> 2/ Direct update of aliases.json file in ZooKeeper with Solr ZkCLI
> *without using Collections API* :
> java ... org.apache.solr.cloud.ZkCLI -zkhost ... -cmd clear /aliases.json
> java ... org.apache.solr.cloud.ZkCLI -zkhost ... -cmd put /aliases.json
> "new content"
> 3/ SolrCloud cluster is started => aliases.json not taken into account
>
> Digging a bit in the code, what is actually causing the issue is that,
> when starting, Solr now checks for the metadata of the aliases.json file
> and if the version metadata from ZooKeeper is lower or equal to local
> version, it keeps the local version.
> When it starts, Solr has a local version of 0 for the aliases but
> ZooKeeper also has a version of 0 of the file because we just recreated it.
> So Solr ignores ZooKeeper configuration and never has a chance to load
> aliases.
>
> Relevant parts of Solr code are:
> - https://github.com/apache/lucene-solr/blob/branch_7_2/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java
>   line 4562 : method setIfNewer
> - https://github.com/apache/lucene-solr/blob/branch_7_2/solr/solrj/src/java/org/apache/solr/common/cloud/Aliases.java
>   line 45 : the "empty" Aliases object with default version 0
>
> Obviously, a workaround is to force ZooKeeper to have a version greater
> than 0 for aliases.json file (for instance by not clearing the file and
> just overwriting it again and again).
>
>
> But we were wondering: is this the intended behavior for Solr?
>
> Thanks for reading,
>
> Gaël




-- 
Regards,
Shalin Shekhar Mangar.