Re: More Highlighting details

2019-07-25 Thread govind nitk
Hi Furkan KAMACI,

Thanks for your reply. I will look at the custom transformer options.
I was looking for any way (debug output would also do) to tell from the
highlighting response whether a snippet is an actual match or the defaultSummary.
Anyway, I will report back on what I find with custom transformers and whether
they solve this.

Best Regards,
Govind
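
A minimal sketch of the tag-based check discussed in this thread, in Python.
It assumes the default <em> and </em> values for hl.tag.pre and hl.tag.post,
and a highlighting section shaped as doc id -> field -> list of snippets. The
function name and the sample data are hypothetical. As Furkan notes in the
quoted reply, a snippet without tags does not prove that the field lacks the
term, because hl.maxAnalyzedChars can cause the fallback to be used.

```python
def classify_highlighting(highlighting, pre="<em>", post="</em>"):
    """Label every highlighted field as 'match' or 'fallback'.

    A field counts as 'match' when at least one snippet contains both
    the pre and post tag; otherwise it is treated as a default summary.
    """
    labels = {}
    for doc_id, fields in highlighting.items():
        labels[doc_id] = {}
        for field, snippets in fields.items():
            tagged = any(pre in s and post in s for s in snippets)
            labels[doc_id][field] = "match" if tagged else "fallback"
    return labels


# Hypothetical "highlighting" section of a Solr JSON response.
sample = {
    "doc1": {
        "title": ["Intro to <em>Solr</em> highlighting"],
        "paragraph": ["First sentence of the paragraph, no query term."],
    }
}

if __name__ == "__main__":
    print(classify_highlighting(sample))
```

Choosing tags that cannot occur in the stored text (for example in urls)
makes this check more reliable.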

On Thu, Jul 25, 2019 at 11:45 PM Furkan KAMACI 
wrote:

> Hi Govind,
>
> Highlighting is the easiest way to detect it. You can find a similar
> question at here:
>
> https://stackoverflow.com/questions/9629147/how-to-return-column-that-matched-the-query-in-solr
>
> Kind Regards,
> Furkan KAMACI
>
> On Wed, Jul 24, 2019 at 9:28 PM govind nitk  wrote:
>
> > Hi Furkan KAMACI,
> >
> > Thanks for your thoughts on maxAnalyzedChars.
> >
> > So, how can we get whether its matched or not? Is there any way to get
> such
> > data from extra payload in response from solr ?
> >
> > Thanks and regards
> > Govind
> >
> > On Wed, Jul 24, 2019 at 8:43 PM Furkan KAMACI 
> > wrote:
> >
> > > Hi Govind,
> > >
> > > Using *hl.tag.pre* and *hl.tag.post* may help you. However you should
> > keep
> > > in mind that even such term exists in desired field, highlighter can
> use
> > > fallback field due to *hl.maxAnalyzedChars* parameter.
> > >
> > > Kind Regards,
> > > Furkan KAMACI
> > >
> > > On Wed, Jul 24, 2019 at 8:24 AM govind nitk 
> > wrote:
> > >
> > > > Hi all,
> > > > How about using hl.tag pre and post. If these are present then there
> is
> > > > actually field match otherwise its default summary ?
> > > > Will it work or there are some cases where it will not ?
> > > >
> > > >
> > > > Thanks in advance.
> > > >
> > > >
> > > >
> > > > On Tue, Jul 23, 2019 at 5:48 PM govind nitk 
> > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > How to get more details for highlighting ?
> > > > >
> > > > > I am using
> > > > >
> > > >
> > >
> >
> hl.method=unified&=title,url,paragraph=true=true
> > > > >
> > > > > So, if query words not matched, I am getting defaultSummary, which
> is
> > > > > great. *Can we get more info saying whether it found matches or
> > default
> > > > > summary? How to get such information?*
> > > > > Also is it good idea to use highlighting on urls ? Will urls get
> > > > distorted
> > > > > by any chance ?
> > > > >
> > > > >
> > > > > Best Regards,
> > > > > Govind
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: More Highlighting details

2019-07-24 Thread govind nitk
Hi Furkan KAMACI,

Thanks for your thoughts on maxAnalyzedChars.

So how can we tell whether it matched or not? Is there any way to get this
information as an extra payload in the response from Solr?

Thanks and regards
Govind

On Wed, Jul 24, 2019 at 8:43 PM Furkan KAMACI 
wrote:

> Hi Govind,
>
> Using *hl.tag.pre* and *hl.tag.post* may help you. However you should keep
> in mind that even such term exists in desired field, highlighter can use
> fallback field due to *hl.maxAnalyzedChars* parameter.
>
> Kind Regards,
> Furkan KAMACI
>
> On Wed, Jul 24, 2019 at 8:24 AM govind nitk  wrote:
>
> > Hi all,
> > How about using hl.tag pre and post. If these are present then there is
> > actually field match otherwise its default summary ?
> > Will it work or there are some cases where it will not ?
> >
> >
> > Thanks in advance.
> >
> >
> >
> > On Tue, Jul 23, 2019 at 5:48 PM govind nitk 
> wrote:
> >
> > > Hi all,
> > >
> > > How to get more details for highlighting ?
> > >
> > > I am using
> > >
> >
> hl.method=unified&=title,url,paragraph=true=true
> > >
> > > So, if query words not matched, I am getting defaultSummary, which is
> > > great. *Can we get more info saying whether it found matches or default
> > > summary? How to get such information?*
> > > Also is it good idea to use highlighting on urls ? Will urls get
> > distorted
> > > by any chance ?
> > >
> > >
> > > Best Regards,
> > > Govind
> > >
> > >
> >
>


Re: More Highlighting details

2019-07-23 Thread govind nitk
Hi all,
How about using hl.tag.pre and hl.tag.post? If these are present, there is
an actual field match; otherwise it is the default summary.
Will this work, or are there cases where it will not?


Thanks in advance.



On Tue, Jul 23, 2019 at 5:48 PM govind nitk  wrote:

> Hi all,
>
> How to get more details for highlighting ?
>
> I am using
> hl.method=unified&=title,url,paragraph=true=true
>
> So, if query words not matched, I am getting defaultSummary, which is
> great. *Can we get more info saying whether it found matches or default
> summary? How to get such information?*
> Also is it good idea to use highlighting on urls ? Will urls get distorted
> by any chance ?
>
>
> Best Regards,
> Govind
>
>


More Highlighting details

2019-07-23 Thread govind nitk
Hi all,

How can I get more details about highlighting?

I am using
hl.method=unified&=title,url,paragraph=true=true

So if the query words are not matched, I get the defaultSummary, which is
great. *Can we get more information saying whether it found matches or used
the default summary? How can I get such information?*
Also, is it a good idea to use highlighting on URLs? Could the URLs get
distorted by any chance?


Best Regards,
Govind


How to get spellcheck results per field in solr ?

2018-10-27 Thread govind nitk
Hi,

I have implemented suggestions using the suggest component, and the results
are returned in this format:

suggest: { "cityname_suggest": { }, "location_suggest": {},
"area_suggest":{} }
where cityname_suggest, location_suggest and area_suggest are different
dictionary names.

Comparing this result structure to the spellcheck response, my question
is:
1. How can I build separate spellcheck results per dictionary?


What I have tried:
Copying the data from multiple fields into a "get_spell" field and building
spellcheck on top of it. But is there any way to get spellcheck results per
dictionary?

Thanks


Re: Unknown field "cache"

2018-09-02 Thread govind nitk
Please mention the schema definition of date.
If you edit the Solr schema manually, you need to reload the Solr core.



On Sat, Sep 1, 2018 at 3:38 AM kunhu0...@gmail.com 
wrote:

> Hello Team,
>
> Need suggestions on Solr Indexing. We are using Solr-6.6.3 and Nutch 1.14.
>
> I see an unknown field 'cache' error while indexing the data to Solr, so I
> added the entry below to the field section of schema.xml for Solr
>
> 
>
> Tried indexing the data again and this time error is unknown field 'date'.
> However i have the 
> Please suggest
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Logging Every document to particular core

2018-06-20 Thread govind nitk
Thanks a lot for your inputs, Alessandro and Mikhail.
@Alessandro, I tried the transaction log, but working around it took a bit
more effort (as it gets rolled over).

The hack I ended up with is a proxy in between, and now I have more control.

Regards,
Govind

On Thu, Jun 14, 2018 at 7:32 PM Mikhail Khludnev  wrote:

> You can enable DEBUG level for LogUpdateProcessorFactory category
>
>
> https://github.com/apache/lucene-solr/blob/228a84fd6db3ef5fc1624d69e1c82a1f02c51352/solr/core/src/java/org/apache/solr/update/processor/LogUpdateProcessorFactory.java#L100
>
>
>
> On Wed, Jun 13, 2018 at 5:00 PM, govind nitk 
> wrote:
>
> > Hi,
> >
> > Is there any way to log all the data getting indexed to a particular core
> > only ?
> >
> >
> > Regards,
> > govind
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Logging Every document to particular core

2018-06-13 Thread govind nitk
Hi,

Is there any way to log all the data getting indexed to a particular core
only ?


Regards,
govind


Re: FreeTextSuggester throwing error "token must not contain separator byte"

2017-07-24 Thread govind nitk
Hi Angel,
Please share the freesuggester defined in the config.

I guess you might have specified whitespace as the separator in the
freesuggester definition as:

 
which is creating the trouble.
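
The byte dump in the error message quoted below can be decoded to see what
the suggester received. The following Python sketch does that and counts the
grams, assuming (as guessed above) that the separator byte is an ASCII space
(0x20). The helper names are hypothetical and this only mirrors the count
reported in the error; it is not the suggester's own code.

```python
# Hex dump copied from the error message in this thread.
TOKEN_HEX = "30 20 30 20 32 20 72 20 61 6c 6c 65 6e 20 72"


def decode_token(hex_dump):
    """Turn a space-separated hex dump into readable ASCII text."""
    return bytes.fromhex(hex_dump).decode("ascii")


def count_grams(text, separator=" "):
    """Number of grams = number of separators + 1 (separator is assumed)."""
    return text.count(separator) + 1


if __name__ == "__main__":
    text = decode_token(TOKEN_HEX)
    print(repr(text), count_grams(text))
```

The decoded token contains five spaces, i.e. six grams, which matches the
gramCount=6 in the error and exceeds the configured maximum ngram size of 5.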




On Tue, Jul 25, 2017 at 9:01 AM, Erick Erickson 
wrote:

> The shingle filter may use space as the separator between shingles that it
> generates. The admin/ analysis page is your friend.
>
> On Jul 24, 2017 2:45 PM, "Angel Todorov"  wrote:
>
> > Hi Rick,
> >
> > Yep, that's really weird, because I am using the
> StandardTokenizerFactory,
> > which is supposed to remove whitespace. Also tried the
> > WhitespaceTokenizerFactory. I'll have a look at other analyzers or if
> > nothing works maybe implement my own.
> >
> > I am using a Shingle filter right after the StandardTokenizer, not sure
> if
> > that has anything to do with it.
> >
> >
> > Thanks,
> > Angel
> >
> >
> > On Tue, Jul 25, 2017 at 12:09 AM Rick Leir  wrote:
> >
> > > Angel,
> > > The 20 byte is an ASCII space character, which is a separator in most
> > > contexts. Breaking the buffer at spaces, you can see 6 non-space
> tokens.
> > >
> > > Have a look at your analysis chain and see why you are getting this.
> > > Cheers -- Rick
> > >
> > > On July 24, 2017 4:27:00 PM EDT, Angel Todorov 
> > > wrote:
> > > >Hi guys,
> > > >
> > > >I am trying to setup the FreeTextSuggester/ Lookup Factory in a
> > > >suggester
> > > >definition in SOLR. Unfortunately while the index is building, I am
> > > >encountering the following errors:
> > > >
> > > >*"msg":"tokens must not contain separator byte; got token=[30 20 30 20
> > > >32
> > > >20 72 20 61 6c 6c 65 6e 20 72] but gramCount=6, which is greater than
> > > >expected max ngram size=5","trace":"java.lang.
> IllegalArgumentException:
> > > >tokens must not contain separator byte; got token=[30 20 30 20 32 20
> 72
> > > >20
> > > >61 6c 6c 65 6e 20 72] but gramCount=6, which is greater than expected
> > > >max
> > > >ngram size=5\r\n\tat
> > >
> > > >org.apache.lucene.search.suggest.analyzing.FreeTextSuggester.build(
> > FreeTextSuggester.java:362)\r\n\tat
> > > >*
> > > >
> > > >I've also opened the following issue, because i don't think it's right
> > > >not
> > > >to handle this exception:
> > > >
> > > >https://issues.apache.org/jira/browse/SOLR-11139
> > > >
> > > >But my question is about the error in general - why is it occurring? I
> > > >only
> > > >have English text, nothing special.
> > > >
> > > >Thanks,
> > > >Angel
> > >
> > > --
> > > Sorry for being brief. Alternate email is rickleir at yahoo dot com
> >
>


Re: How to Debug Solr With Eclipse

2017-07-13 Thread govind nitk
Hi,

Solr has releases; kindly check out the one you need.


cheers

On Thu, Jul 13, 2017 at 11:20 PM, Rainer Gnan 
wrote:

> Hello community,
>
> my aim is to develop solr custom code (e.g. UpdateRequestProcessor)
> within Eclipse AND to test the code within a debuggable solr/lucene
> local instance - also within Eclipse.
> Searching the web led me to multiple instructions but for me no one
> works.
>
> The only relevant question I actually have to solve this problem is:
> Where can I download the source code for the version I want that
> includes the ANT build.xml for building an Eclipse-Project?
>
> The solr project page (http://archive.apache.org/dist/lucene/solr/)
> seems not to provide that.
>
> I appreciate any hint!
>
> Best regards
> Rainer
>
>


Re: Need domain configuration assistance

2017-07-13 Thread govind nitk
1. Run Solr on a port other than 8983.
2. If you are exposing it publicly, don't expose the Solr port directly.
Instead, put an Apache/nginx server in front to pass traffic to Solr.

cheers

On Thu, Jul 13, 2017 at 11:13 PM, Susheel Kumar 
wrote:

> But don't expose Solr outside to public...
>
> On Thu, Jul 13, 2017 at 1:41 PM, Susheel Kumar 
> wrote:
>
> > This window solr server must have a name and IP address associated with
> > it. Check from external content deliver servers if port 8983 to Solr
> server
> > is open and if so you can refer solr via http://:/
> solr.
> >   if port 8983 is not open then try to run solr 80/8080 or work with
> > network team to open the ports.  If you need to access solr server by
> > domain name then you would have to ask network team to create a DNS/VIP
> and
> > map it to solr IP address.
> >
> > Enjoy!!!
> >
> >
> >
> >
> >
> >
> > On Thu, Jul 13, 2017 at 1:23 PM, Bertini, Vickie <
> > vickie.bert...@bannerhealth.com> wrote:
> >
> >> I have installed Solr 6.5.1 as a service on our Windows Server 2012. It
> >> is up and running properly under localhost:8983, however, I have a
> domain
> >> name I want to assign to it so our external content delivery servers can
> >> reach it for our web search. However, I am very new to Solr and mostly
> only
> >> familiar with IIS, not jetty, so is there documentation on how I would
> >> configure solr to run on our new domain/IP instead of localhost? Web
> >> searches up to this point have been fairly unproductive and unhelpful.
> >>
> >>
> >>
> >> *Vickie Bertini*
> >> IT Sitecore Architect
> >>
> >> Digital Business Technology
> >> 602.747.7186 <(602)%20747-7186> office
> >>
> >>
> >>
> >> 
> >>
> >>
> >>
> >
> >
>


Re: suggestors on shingles

2017-07-13 Thread govind nitk
Hi Alessandro,

Currently the fuzzy lookup suggester does not support returning
suggestions generated from the shingles at index time.
As you suggested, I am using the FreeText suggester, but it does not
support fuzziness.

Let me know your thoughts on this edge case.


Regards,
Govind




On Thu, Jul 13, 2017 at 8:03 PM, alessandro.benedetti 
wrote:

> To do what ?
> If it is a use case, please explain us.
>
> If it is just to check that the analysis chain worked correctly, you can
> check the schema browser or use Luke.
>
> If you just want to test your analysis chain, you can use the analysis tool
> in the Solr admin.
>
> Cheers
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.nabble.com/suggestors-on-shingles-tp4345763p4345836.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: suggestors on shingles

2017-07-13 Thread govind nitk
Hi Alessandro,

Thanks a lot. I followed your blog and was able to get the suggestions.

But I am curious about Solr returning the tokenized data.
A lot of filtering happens and shingles are generated; is there any way
to get those final generated tokens?


Regards,
Govind


On Thu, Jul 13, 2017 at 4:00 PM, alessandro.benedetti 
wrote:

> I would recommend this blog of mine to get a better understanding of how
> tokenization and the suggester work together [1] .
>
> If you take a look to the FuzzyLookupFactory, you will see that it is one
> of
> the suggesters that return the entire content of the field.
>
> You may be interested to the FreeTextLookupFactory.
>
> Cheers
>
>
> [1] http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.nabble.com/suggestors-on-shingles-tp4345763p4345793.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


suggestors on shingles

2017-07-12 Thread govind nitk
Hi,

I have a fieldtype "suggestion" with definition as :


**
*  *
**
**
* *
*  *
*  *
**
**

*  *
**

I have a field named "mysuggestion" with definition as :

**

*I copy other fields (names, countries, short description) to
"mysuggestion".*


I am building a suggester on top of this field as:

**
*  fuzzySuggester*
*  FuzzyLookupFactory*
*  DocumentDictionaryFactory*
*  true*
*  true*
*  mysuggestion*
*  true*
*  suggestion *
**



Expectation: the returned suggestions should be shingles, not the entire line of
description or name.

1. Is it possible to pass a tokenized/analyzed field to suggesters?
2. Is it possible to retrieve tokenized values from Solr?


Regards,
Govind


Re: Boosting Documents using the field Value

2017-06-27 Thread govind nitk
Hi Erick,

Finally made it work.

bf=if(exists(query($qqone)),one_score,0)&qqone=one_query:\"google cloud\"

Thanks a lot for the guidance, and for the reminder that it is not URL escaping.

No analyzers are used.


Regards,
Govind
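
The working request above keeps the phrase out of the bf value and refers to
it through $qqone. A minimal Python sketch of building such a request follows.
The request parameter name qqone is inferred from the $qqone reference, the
helper name is hypothetical, and defType=edismax is an assumption (the thread
also uses dismax). As far as I understand, bf is split on whitespace, which is
the likely reason a quoted phrase placed directly inside the function failed.

```python
from urllib.parse import urlencode, parse_qs


def build_boost_params(user_query, match_field="one_query",
                       score_field="one_score"):
    """Build request parameters that boost on an exact field match.

    The phrase lives in its own parameter (qqone), so the bf value
    itself never contains whitespace or quotes.
    """
    # Escape backslashes and double quotes for the query parser.
    phrase = user_query.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "defType": "edismax",  # assumption, not stated in the thread
        "q": user_query,
        "bf": "if(exists(query($qqone)),%s,0)" % score_field,
        "qqone": '%s:"%s"' % (match_field, phrase),
    }


if __name__ == "__main__":
    params = build_boost_params("google cloud")
    # urlencode takes care of URL escaping; no manual %20 is needed.
    print(urlencode(params))
```

URL encoding is applied once, at the HTTP layer, and is separate from the
query-parser escaping done inside the helper.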



On Tue, Jun 27, 2017 at 11:01 AM, govind nitk <govind.n...@gmail.com> wrote:

> Hi Erick,
> I accept, I should have mentioned the what I was doing first.
>
> field types:
> one_query is "string",
> one_score is float.
>
> So No explicit analyzers.
>
> mentioned sow=false. and escaping as you mentioned. But still the error
> persist. - undefined field "cloud"
>
> Will get back.
>
> Regards,
> Givind
>
> On Tue, Jun 27, 2017 at 8:44 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> bq: So, ultimate goal is when the exact query matches in field
>> one_query, apply boost of one_score
>>
>> It would have been helpful to have made that statement in the first
>> place, would have saved some false paths.
>>
>> What is your analysis chain here? If it's anything like "text_general"
>> or the like then you're going to have some trouble. I'd think about an
>> analysis chain like KeywordTokenizerFactory and
>> LowercaseFilterFactory. That'll index the entire field as a single
>> token. The admin/analysis page is your friend.
>>
>> To search against it, you need to _escape_ the space (not "url
>> escape"). As in google\ cloud so that makes it through the query
>> parser as a single token.
>>
>> As of Solr 6.5 you can also specify sow=false (Split On Whitespace),
>> which may be a better option, see:
>> https://issues.apache.org/jira/browse/SOLR-9185
>>
>> Best,
>> Erick
>>
>> On Mon, Jun 26, 2017 at 7:32 PM, govind nitk <govind.n...@gmail.com>
>> wrote:
>> > Hi Developers, Erick
>> >
>> > I am able to add boost through function as below:
>> > bf=if(termfreq(one_query,"google"),one_score,0)
>> >
>> > Problem is when I say "google cloud" as query, it gives error:
>> > undefined field: \"cloud\""
>> >
>> > I tried encoding the query(%20, + for space), but not able to get it
>> > working.
>> >
>> > So, ultimate goal is when the exact query matches in field one_query,
>> apply
>> > boost of one_score.
>> >
>> > Is there any way to do this? Or a PR is needed.
>> >
>> >
>> > Regards,
>> > Govind
>> >
>> >
>> > On Mon, Jun 26, 2017 at 11:14 AM, govind nitk <govind.n...@gmail.com>
>> wrote:
>> >
>> >>
>> >> Hi Erick,
>> >>
>> >> Exactly this is what I was looking for.
>> >> Thanks a lot.
>> >>
>> >>
>> >> Regards,
>> >> Govind
>> >>
>> >> On Mon, Jun 26, 2017 at 12:03 AM, Erick Erickson <
>> erickerick...@gmail.com>
>> >> wrote:
>> >>
>> >>> Take a look at function queries. You're probably looking for "field",
>> >>> "termfreq" and "if" functions or some other combination like that.
>> >>>
>> >>> On Sun, Jun 25, 2017 at 9:01 AM, govind nitk <govind.n...@gmail.com>
>> >>> wrote:
>> >>> > Hi Erik, Thanks for the reply.
>> >>> >
>> >>> > My intention of using the domain_ct in the qf was, giving the weight
>> >>> > present in the that document.
>> >>> >
>> >>> > e.g
>> >>> > qf=category^domain_ct
>> >>> >
>> >>> > if the current query matched in the category, the boost given will
>> be
>> >>> > domain_ct, which is present in the current matched document.
>> >>> >
>> >>> >
>> >>> > So if I have category_1ct, category_2ct, category_3ct, category_4ct
>> as 4
>> >>> > indexed categories(text_general fields) and the same document has
>> >>> > domain_1ct, domain_2ct, domain_3ct, domain_4ct as 4 different count
>> >>> > fields(int), is there any way to achieve:
>> >>> >
>> >>> > qf=category_1ct^domain_1ct=category_2ct^domain_2ct=cat
>> >>> egory_3ct^domain_3ct=category_4ct^domain_4ct
>> >>> >   ?
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> > Regards
>> >>> >

Re: Boosting Documents using the field Value

2017-06-26 Thread govind nitk
Hi Erick,
I agree, I should have mentioned what I was doing first.

Field types:
one_query is "string",
one_score is float.

So there are no explicit analyzers.

I specified sow=false and escaped the space as you suggested, but the error
still persists: undefined field "cloud"

Will get back.

Regards,
Govind

On Tue, Jun 27, 2017 at 8:44 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> bq: So, ultimate goal is when the exact query matches in field
> one_query, apply boost of one_score
>
> It would have been helpful to have made that statement in the first
> place, would have saved some false paths.
>
> What is your analysis chain here? If it's anything like "text_general"
> or the like then you're going to have some trouble. I'd think about an
> analysis chain like KeywordTokenizerFactory and
> LowercaseFilterFactory. That'll index the entire field as a single
> token. The admin/analysis page is your friend.
>
> To search against it, you need to _escape_ the space (not "url
> escape"). As in google\ cloud so that makes it through the query
> parser as a single token.
>
> As of Solr 6.5 you can also specify sow=false (Split On Whitespace),
> which may be a better option, see:
> https://issues.apache.org/jira/browse/SOLR-9185
>
> Best,
> Erick
>
> On Mon, Jun 26, 2017 at 7:32 PM, govind nitk <govind.n...@gmail.com>
> wrote:
> > Hi Developers, Erick
> >
> > I am able to add boost through function as below:
> > bf=if(termfreq(one_query,"google"),one_score,0)
> >
> > Problem is when I say "google cloud" as query, it gives error:
> > undefined field: \"cloud\""
> >
> > I tried encoding the query(%20, + for space), but not able to get it
> > working.
> >
> > So, ultimate goal is when the exact query matches in field one_query,
> apply
> > boost of one_score.
> >
> > Is there any way to do this? Or a PR is needed.
> >
> >
> > Regards,
> > Govind
> >
> >
> > On Mon, Jun 26, 2017 at 11:14 AM, govind nitk <govind.n...@gmail.com>
> wrote:
> >
> >>
> >> Hi Erick,
> >>
> >> Exactly this is what I was looking for.
> >> Thanks a lot.
> >>
> >>
> >> Regards,
> >> Govind
> >>
> >> On Mon, Jun 26, 2017 at 12:03 AM, Erick Erickson <
> erickerick...@gmail.com>
> >> wrote:
> >>
> >>> Take a look at function queries. You're probably looking for "field",
> >>> "termfreq" and "if" functions or some other combination like that.
> >>>
> >>> On Sun, Jun 25, 2017 at 9:01 AM, govind nitk <govind.n...@gmail.com>
> >>> wrote:
> >>> > Hi Erik, Thanks for the reply.
> >>> >
> >>> > My intention of using the domain_ct in the qf was, giving the weight
> >>> > present in the that document.
> >>> >
> >>> > e.g
> >>> > qf=category^domain_ct
> >>> >
> >>> > if the current query matched in the category, the boost given will be
> >>> > domain_ct, which is present in the current matched document.
> >>> >
> >>> >
> >>> > So if I have category_1ct, category_2ct, category_3ct, category_4ct
> as 4
> >>> > indexed categories(text_general fields) and the same document has
> >>> > domain_1ct, domain_2ct, domain_3ct, domain_4ct as 4 different count
> >>> > fields(int), is there any way to achieve:
> >>> >
> >>> > qf=category_1ct^domain_1ct=category_2ct^domain_2ct=cat
> >>> egory_3ct^domain_3ct=category_4ct^domain_4ct
> >>> >   ?
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > Regards
> >>> >
> >>> >
> >>> >
> >>> >
> >>> > On Sat, Jun 24, 2017 at 3:42 PM, Erik Hatcher <
> erik.hatc...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> With dismax use bf=domain_ct. you can also use boost=domain_ct with
> >>> >> edismax.
> >>> >>
> >>> >> > On Jun 23, 2017, at 23:01, govind nitk <govind.n...@gmail.com>
> >>> wrote:
> >>> >> >
> >>> >> > Hi Solr,
> >>> >> >
> >>> >> > My Index Data:
> >>> >> >
> >>> >> > id name category domain domain_ct
> >>> >> > 1 Banana Fruits Home > Fruits > Banana 2
> >>> >> > 2 Orange Fruits Home > Fruits > Orange 4
> >>> >> > 3 Samsung Mobile Electronics > Mobile > Samsung 3
> >>> >> >
> >>> >> >
> >>> >> > I am able to retrieve the documents with dismax parser with the
> >>> weights
> >>> >> > mentioned as below.
> >>> >> >
> >>> >> > http://localhost:8983/solr/my_index/select?defType=dismax;
> >>> >> indent=on=fruits=category
> >>> >> > ^0.9=name^0.7=json
> >>> >> >
> >>> >> >
> >>> >> > Is it possible to retrieve the documents with weight taken from
> the
> >>> >> indexed
> >>> >> > field like:
> >>> >> >
> >>> >> > http://localhost:8983/solr/my_index/select?defType=dismax;
> >>> >> indent=on=fruits=category
> >>> >> > ^domain_ct=name^domain_ct=json
> >>> >> >
> >>> >> > Is this possible to give weight from an indexed field ? Am I doing
> >>> >> > something wrong?
> >>> >> > Is there any other way of doing this?
> >>> >> >
> >>> >> >
> >>> >> > Regards
> >>> >>
> >>>
> >>
> >>
>


Re: Boosting Documents using the field Value

2017-06-26 Thread govind nitk
Hi Developers, Erick

I am able to add a boost through a function, as below:
bf=if(termfreq(one_query,"google"),one_score,0)

The problem is that when the query is "google cloud", it gives the error:
undefined field: \"cloud\""

I tried encoding the space in the query (%20, +), but could not get it
working.

So the ultimate goal is: when the exact query matches in field one_query,
apply a boost of one_score.

Is there any way to do this, or is a PR needed?


Regards,
Govind


On Mon, Jun 26, 2017 at 11:14 AM, govind nitk <govind.n...@gmail.com> wrote:

>
> Hi Erick,
>
> Exactly this is what I was looking for.
> Thanks a lot.
>
>
> Regards,
> Govind
>
> On Mon, Jun 26, 2017 at 12:03 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> Take a look at function queries. You're probably looking for "field",
>> "termfreq" and "if" functions or some other combination like that.
>>
>> On Sun, Jun 25, 2017 at 9:01 AM, govind nitk <govind.n...@gmail.com>
>> wrote:
>> > Hi Erik, Thanks for the reply.
>> >
>> > My intention of using the domain_ct in the qf was, giving the weight
>> > present in the that document.
>> >
>> > e.g
>> > qf=category^domain_ct
>> >
>> > if the current query matched in the category, the boost given will be
>> > domain_ct, which is present in the current matched document.
>> >
>> >
>> > So if I have category_1ct, category_2ct, category_3ct, category_4ct as 4
>> > indexed categories(text_general fields) and the same document has
>> > domain_1ct, domain_2ct, domain_3ct, domain_4ct as 4 different count
>> > fields(int), is there any way to achieve:
>> >
>> > qf=category_1ct^domain_1ct=category_2ct^domain_2ct=cat
>> egory_3ct^domain_3ct=category_4ct^domain_4ct
>> >   ?
>> >
>> >
>> >
>> >
>> > Regards
>> >
>> >
>> >
>> >
>> > On Sat, Jun 24, 2017 at 3:42 PM, Erik Hatcher <erik.hatc...@gmail.com>
>> > wrote:
>> >
>> >> With dismax use bf=domain_ct. you can also use boost=domain_ct with
>> >> edismax.
>> >>
>> >> > On Jun 23, 2017, at 23:01, govind nitk <govind.n...@gmail.com>
>> wrote:
>> >> >
>> >> > Hi Solr,
>> >> >
>> >> > My Index Data:
>> >> >
>> >> > id name category domain domain_ct
>> >> > 1 Banana Fruits Home > Fruits > Banana 2
>> >> > 2 Orange Fruits Home > Fruits > Orange 4
>> >> > 3 Samsung Mobile Electronics > Mobile > Samsung 3
>> >> >
>> >> >
>> >> > I am able to retrieve the documents with dismax parser with the
>> weights
>> >> > mentioned as below.
>> >> >
>> >> > http://localhost:8983/solr/my_index/select?defType=dismax;
>> >> indent=on=fruits=category
>> >> > ^0.9=name^0.7=json
>> >> >
>> >> >
>> >> > Is it possible to retrieve the documents with weight taken from the
>> >> indexed
>> >> > field like:
>> >> >
>> >> > http://localhost:8983/solr/my_index/select?defType=dismax;
>> >> indent=on=fruits=category
>> >> > ^domain_ct=name^domain_ct=json
>> >> >
>> >> > Is this possible to give weight from an indexed field ? Am I doing
>> >> > something wrong?
>> >> > Is there any other way of doing this?
>> >> >
>> >> >
>> >> > Regards
>> >>
>>
>
>


Re: SOLR Suggester returns either the full field value or single terms only

2017-06-26 Thread govind nitk
Hi Alessandro,

Thanks for clarification.



On Mon, Jun 26, 2017 at 4:53 PM, alessandro.benedetti 
wrote:

> " Don't use an heavy Analyzers, the suggested terms will come from the
> index,
> so be sure they are meaningful tokens. A really basic analyser is
> suggested,
> stop words and stemming are not "
>
> This means that your suggestions will come from the index, so if you use
> heavy analysers you can get terms suggested which are not really useful :
>
> e.g.
>
> Solr is an amazing search engine
>
> If you have some stemmer in your analysis chain, you will have this
> behavior
> :
>
> q= ama
> result : amaz search engin
>
> So it is better to have this lookup strategy configured on top of a light
> analysed field ( or copyfield).
>
>
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Suggester-returns-either-the-full-field-value-or-single-terms-only-tp4342763p4342807.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: SOLR Suggester returns either the full field value or single terms only

2017-06-26 Thread govind nitk
Hi alessandro,
Really nice article.

Can you elaborate on "*Don't use an heavy Analyzers*"?


Regards,
Govind

On Mon, Jun 26, 2017 at 2:19 PM, alessandro.benedetti 
wrote:

> Hi Angel,
> your are looking for the Free Text lookup approach.
> You find more info in [1] and [2]
>
> [1] https://lucene.apache.org/solr/guide/6_6/suggester.html#Suggester-FreeTextLookupFactory
> [2] http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Suggester-returns-either-the-full-field-value-or-single-terms-only-tp4342763p4342790.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


strdist function gives error

2017-06-26 Thread govind nitk
Hi Team,

Solr 6.5.1 on Ubuntu 14.04:


The strdist function gives an error for the comparison below:

strdist(myfield,"google cloud","jw")


I am getting the following error:

"error": {"metadata": ["error-class","org.apache.solr.common.SolrException",
"root-error-class","org.apache.solr.search.SyntaxError"],"msg":
"org.apache.solr.search.SyntaxError:
Missing end quote for string at pos 21 str='if(gt(strdist(myfield,\"google'"
,"code": 400}

I tried encoding the space (%20, +), but it did not help, e.g.
strdist(myfield,"google%20cloud","jw")

Is there any way to do this?


Regards,
Govind


Re: Boosting Documents using the field Value

2017-06-25 Thread govind nitk
Hi Erick,

Exactly this is what I was looking for.
Thanks a lot.


Regards,
Govind
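
The two options Erik Hatcher mentions in the quoted reply below (bf with
dismax, boost with edismax) combine the field value with the relevance score
differently: bf adds the function value, boost multiplies by it. A simplified
Python illustration follows. The base relevance score is hypothetical and real
scoring involves more factors; the domain_ct values are taken from the index
data in this thread.

```python
def bf_style(base_score, field_value):
    """Additive boost, as with bf=domain_ct (simplified)."""
    return base_score + field_value


def boost_style(base_score, field_value):
    """Multiplicative boost, as with boost=domain_ct (simplified)."""
    return base_score * field_value


# domain_ct values from the thread's sample index.
DOMAIN_CT = {"Banana": 2, "Orange": 4, "Samsung": 3}

if __name__ == "__main__":
    base = 0.5  # hypothetical relevance score, same for every document
    for name, ct in DOMAIN_CT.items():
        print(name, bf_style(base, ct), boost_style(base, ct))
```

With an additive boost a large field value can dominate a weak text match,
while a multiplicative boost scales whatever relevance the document already
has.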

On Mon, Jun 26, 2017 at 12:03 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Take a look at function queries. You're probably looking for "field",
> "termfreq" and "if" functions or some other combination like that.
>
> On Sun, Jun 25, 2017 at 9:01 AM, govind nitk <govind.n...@gmail.com>
> wrote:
> > Hi Erik, Thanks for the reply.
> >
> > My intention of using the domain_ct in the qf was, giving the weight
> > present in the that document.
> >
> > e.g
> > qf=category^domain_ct
> >
> > if the current query matched in the category, the boost given will be
> > domain_ct, which is present in the current matched document.
> >
> >
> > So if I have category_1ct, category_2ct, category_3ct, category_4ct as 4
> > indexed categories(text_general fields) and the same document has
> > domain_1ct, domain_2ct, domain_3ct, domain_4ct as 4 different count
> > fields(int), is there any way to achieve:
> >
> > qf=category_1ct^domain_1ct=category_2ct^domain_2ct=
> category_3ct^domain_3ct=category_4ct^domain_4ct
> >   ?
> >
> >
> >
> >
> > Regards
> >
> >
> >
> >
> > On Sat, Jun 24, 2017 at 3:42 PM, Erik Hatcher <erik.hatc...@gmail.com>
> > wrote:
> >
> >> With dismax use bf=domain_ct. you can also use boost=domain_ct with
> >> edismax.
> >>
> >> > On Jun 23, 2017, at 23:01, govind nitk <govind.n...@gmail.com> wrote:
> >> >
> >> > Hi Solr,
> >> >
> >> > My Index Data:
> >> >
> >> > id name category domain domain_ct
> >> > 1 Banana Fruits Home > Fruits > Banana 2
> >> > 2 Orange Fruits Home > Fruits > Orange 4
> >> > 3 Samsung Mobile Electronics > Mobile > Samsung 3
> >> >
> >> >
> >> > I am able to retrieve the documents with dismax parser with the
> weights
> >> > mentioned as below.
> >> >
> >> > http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^0.9&qf=name^0.7&wt=json
> >> >
> >> >
> >> > Is it possible to retrieve the documents with weight taken from the
> >> indexed
> >> > field like:
> >> >
> >> > http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^domain_ct&qf=name^domain_ct&wt=json
> >> >
> >> > Is this possible to give weight from an indexed field ? Am I doing
> >> > something wrong?
> >> > Is there any other way of doing this?
> >> >
> >> >
> >> > Regards
> >>
>
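
The function-query idea Erick mentions can be sketched as follows. This is only an illustration under stated assumptions: it assumes the edismax parser and its `boost` parameter, the helper name `conditional_boost` is hypothetical, and it assumes `termfreq` looks up the raw indexed term, so the term has to be passed in its analyzed (here lowercased) form. Each clause contributes the document's own count field when the term occurs in the paired text field, and 1 otherwise.

```python
from urllib.parse import urlencode

def conditional_boost(pairs, term):
    """Build a Solr function string from (text_field, weight_field) pairs.

    Each pair becomes if(termfreq(text_field,'term'),weight_field,1);
    several pairs are multiplied together with product(...).
    """
    clauses = [
        f"if(termfreq({text},'{term}'),{weight},1)"
        for text, weight in pairs
    ]
    if len(clauses) == 1:
        return clauses[0]
    return "product(" + ",".join(clauses) + ")"

# Single field, as in the first question of the thread.
single = conditional_boost([("category", "domain_ct")], "fruits")

# Several category/count pairs, as in the follow-up question.
multi = conditional_boost(
    [("category_1ct", "domain_1ct"), ("category_2ct", "domain_2ct")],
    "fruits",
)

# Hypothetical request parameters; the core name my_index comes from the thread.
params = {"defType": "edismax", "q": "fruits", "qf": "category name",
          "boost": single, "wt": "json"}
url = "http://localhost:8983/solr/my_index/select?" + urlencode(params)
print(url)
```

Whether a multiplicative boost or an additive one suits the ranking is a separate choice; the string built here could equally be passed as `bf` with dismax.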


Re: SOLR Suggester returns either the full field value or single terms only

2017-06-25 Thread govind nitk
Hi Angel,

Please look at these documents:
1. https://home.apache.org/~ctargett/RefGuidePOC/jekyll-full/suggester.html
2. https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory


Regards,
Govind



On Mon, Jun 26, 2017 at 3:12 AM, Angel Todorov  wrote:

> Hi guys,
>
> I am trying to configure the Suggester in a way that i get google-style
> auto suggestions:
>
> - I don't want the suggestions to be the _whole_ field value
> - I don't want the suggestions to be single terms
>
> For example, if I have a field that has the value "The brown fox jumped
> over the fence"
>
> and I type "br" for example, I would like to get things like "Brown" and
> "brown fox", but not the whole sentence. Also, i don't want my results to
> be just single terms, but to also include phrases.
>
> I have tried a lot of configurations and have found out that if I use the
> Document Dictionary Factory, I get the whole field value as a result, If I
> use the Fuzzy Dictionary, I only get single terms as results. Nothing
> similar to my requirements.  My config is properly configured and my field
> type is stored, because i am getting results, it's just that the results
> are not what I'd expect.
>
> Would greatly appreciate if you can guide me to the right config.
>
> Thanks,
> Angel
>
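
To make the shingle suggestion concrete, here is a small standalone illustration of what word shingles look like and why they yield phrase-style suggestions. It is not Solr code and does not reproduce the ShingleFilterFactory implementation; the function names `shingles` and `suggest` and the maximum shingle size of 2 are assumptions chosen for the example.

```python
def shingles(text, max_size=2):
    """Return all word n-grams of the text, from 1 word up to max_size words."""
    words = text.lower().split()
    result = []
    for size in range(1, max_size + 1):
        for start in range(len(words) - size + 1):
            result.append(" ".join(words[start:start + size]))
    return result

def suggest(text, prefix, max_size=2):
    """Return the distinct shingles that start with the typed prefix."""
    prefix = prefix.lower()
    matches = []
    for candidate in shingles(text, max_size):
        if candidate.startswith(prefix) and candidate not in matches:
            matches.append(candidate)
    return matches

print(suggest("The brown fox jumped over the fence", "br"))
# ['brown', 'brown fox']
```

In Solr the equivalent effect would come from indexing the suggestion field with a shingle filter in its analysis chain, so that the suggester sees these short phrases as terms rather than the whole field value.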


Re: Boosting Documents using the field Value

2017-06-25 Thread govind nitk
Hi Erik, Thanks for the reply.

My intention in using domain_ct in the qf was to apply the weight
stored in that document.

e.g
qf=category^domain_ct

if the current query matched in the category, the boost given will be
domain_ct, which is present in the current matched document.


So if I have category_1ct, category_2ct, category_3ct, category_4ct as 4
indexed categories(text_general fields) and the same document has
domain_1ct, domain_2ct, domain_3ct, domain_4ct as 4 different count
fields(int), is there any way to achieve:

qf=category_1ct^domain_1ct&qf=category_2ct^domain_2ct&qf=category_3ct^domain_3ct&qf=category_4ct^domain_4ct
  ?




Regards




On Sat, Jun 24, 2017 at 3:42 PM, Erik Hatcher <erik.hatc...@gmail.com>
wrote:

> With dismax use bf=domain_ct. you can also use boost=domain_ct with
> edismax.
>
> > On Jun 23, 2017, at 23:01, govind nitk <govind.n...@gmail.com> wrote:
> >
> > Hi Solr,
> >
> > My Index Data:
> >
> > id name category domain domain_ct
> > 1 Banana Fruits Home > Fruits > Banana 2
> > 2 Orange Fruits Home > Fruits > Orange 4
> > 3 Samsung Mobile Electronics > Mobile > Samsung 3
> >
> >
> > I am able to retrieve the documents with dismax parser with the weights
> > mentioned as below.
> >
> > http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^0.9&qf=name^0.7&wt=json
> >
> >
> > Is it possible to retrieve the documents with weight taken from the
> indexed
> > field like:
> >
> > http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^domain_ct&qf=name^domain_ct&wt=json
> >
> > Is this possible to give weight from an indexed field ? Am I doing
> > something wrong?
> > Is there any other way of doing this?
> >
> >
> > Regards
>
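
Erik's two options can be written out as request URLs. This is a minimal sketch, assuming the `my_index` core and the field names from the thread; the helper `build_url` is hypothetical. With dismax, `bf` adds the value of `domain_ct` to the score; with edismax, `boost` multiplies the score by it.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/my_index/select"

def build_url(def_type, boost_param):
    """Build a select URL that boosts by the per-document domain_ct value."""
    params = [
        ("defType", def_type),
        ("q", "fruits"),
        ("qf", "category^0.9 name^0.7"),   # fixed per-field weights
        (boost_param, "domain_ct"),        # weight read from the document itself
        ("wt", "json"),
    ]
    return BASE + "?" + urlencode(params)

dismax_url = build_url("dismax", "bf")       # additive boost
edismax_url = build_url("edismax", "boost")  # multiplicative boost

print(dismax_url)
print(edismax_url)
```

Note that the weights in `qf` stay constant; the per-document value enters only through the boost function.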


Boosting Documents using the field Value

2017-06-23 Thread govind nitk
Hi Solr,

My Index Data:

id name category domain domain_ct
1 Banana Fruits Home > Fruits > Banana 2
2 Orange Fruits Home > Fruits > Orange 4
3 Samsung Mobile Electronics > Mobile > Samsung 3


I am able to retrieve the documents with dismax parser with the weights
mentioned as below.

http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^0.9&qf=name^0.7&wt=json


Is it possible to retrieve the documents with weight taken from the indexed
field like:

http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^domain_ct&qf=name^domain_ct&wt=json

Is this possible to give weight from an indexed field ? Am I doing
something wrong?
Is there any other way of doing this?


Regards


Fwd: Solr - search score and tf-idf vector from individual fields

2016-08-16 Thread govind nitk
Hi Developers,


This is a fundamental question that I could not get answered from the Solr
documentation or from related Stack Overflow questions.

I have a few hundred thousand documents, each with 12 fields to be indexed.
All of these fields contain text of varying length (anywhere from 10 to
5000 characters). For example, let's say these fields are named
A, B, ..., L (12 in all).

Now, when I search for documents, my query comes from 3 fields. X1 , X2 and
X3. Now X1 (conceptually) closely matches with fields C, D , and E. X2
(conceptually) closely matches with fields F, G and J. And X3 is basically
the same field as A. But X1 and X2 should be searched for, all over the
fields (including A). Just filtering against their conceptually matching
fields will not do.

So when designing the schema, my only criteria are ranking and search. I
would also like (if it is possible) to get scores of my query against
individual fields. Something like this:

Query : X1 , Score against C , E and over all score (for all returned
documents)

Query : X2 , Score against M , N , O and over all score (for all returned
documents)

Query : X1 + X2 , Score against C , E, M, N and O, and over all score (for
all returned documents)

The reason I want those individual scores is I want to further use those
scores for ML algorithms to further reshuffle/fit the rankings against a
training set.

I also want the tf-idf vector components of X1 and X2 against C, E and
M,N,O respectively.

Can anyone please let me know if this is possible ?
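
One possible way to ask for per-field scores is sketched below. It rests on several assumptions: that the edismax parser is used, that the `query()` function may be used as a pseudo-field in `fl` with parameter dereferencing, and that the field names C and E stand in for the real schema. The helper `per_field_params` and the parameter names `qq` and `q_<field>` are made up for the example. `debugQuery=true` is added because the explain output is where the tf and idf components of each field's score would be visible.

```python
from urllib.parse import urlencode

def per_field_params(query_text, fields):
    """Build request parameters that return the overall score plus one
    score per field, each computed by a sub-query limited to that field."""
    params = {
        "defType": "edismax",
        "q": query_text,
        "qf": " ".join(fields),
        "qq": query_text,          # shared user text, referenced by sub-queries
        "debugQuery": "true",      # explain output shows tf/idf per field
        "wt": "json",
    }
    field_list = ["id", "score"]
    for field in fields:
        # Sub-query that searches only this field.
        params[f"q_{field}"] = f"{{!edismax qf={field} v=$qq}}"
        # Pseudo-field: score of that sub-query, 0 when it does not match.
        field_list.append(f"score_{field}:query($q_{field},0)")
    params["fl"] = ",".join(field_list)
    return params

params = per_field_params("x1 text", ["C", "E"])
print(urlencode(params))
```

The returned per-field values could then be collected as features for a re-ranking model; whether they match the components of the overall score exactly depends on the similarity configuration and should be checked against the explain output.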