Re: Solr cloud in kubernetes

2017-11-19 Thread rajasaur
Hi Bjorn,

I'm trying a similar approach now (getting SolrCloud working on Kubernetes). I
run ZooKeeper as a StatefulSet, but not SolrCloud, which
causes problems when my pods get destroyed and restarted.
I will try the -h option so that SOLR_HOST is used when
a node connects to itself (and to ZooKeeper).

On another note, how do you store the indexes? I had an issue with my GCE
node (Node NotReady) whose kubelet had to be restarted. Because the
SolrCloud pods were restarted along with it, all the data got wiped out. I'm just
wondering how you have set up index storage in your SolrCloud Kubernetes
setup.

Thanks
Raja
 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: LTR training

2017-11-19 Thread Diego Ceccarelli
Hello Ilay,
Answers inline:

On Sat, Nov 18, 2017 at 2:22 PM, ilay wrote:
>
> 1. Does LTR only support phrase matching (the complete user query) from training
> data when extracting feature scores?
> For example,
> efi.user_query='tv+stand' matches the title feature only if the title contains
> "tv stand".
> After removing the quotes we can match at term level, but the behaviour is
> not consistent when we change the order of the terms in the query,
> i.e. efi.user_query=tv stand gives a different feature match score than
> efi.user_query=stand tv for the same title match.
>
> Are we supposed to always wrap efi.user_query in single quotes and do
> phrase matching? If we do so, we miss out on term matches.
> Which request handler does this query go through?

I'm not sure I understand the question properly. Could you post the
feature that you are using together with the efi parameter? I guess
you have a feature like this:


{
  "name" : "myFeature",
  "class" : "org.apache.solr.ltr.feature.SolrFeature",
  "params" : {
    "q" : "${user_query}"
  }
},

With efi.user_query='tv+stand' you get the score of the boolean query
with the terms in AND, while efi.user_query='tv OR stand' should give you the
score of the OR query (so the score will be different from zero even if
only one of the two terms is in the field). If you want to match
exactly the bigram in the text, I think efi.user_query='\"tv+stand\"'
should work. Let me know if that solves it.
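
To make the variants above concrete, here is a minimal sketch of how each efi.user_query value could be placed into an LTR rerank parameter and URL-encoded. The model name myModel and the class and method names are assumptions for illustration, not taken from the thread. It also shows that a space becomes '+' once URL-encoded, which is probably where the '+' in 'tv+stand' comes from.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class LtrEfiExamples {

    // Wraps a user query into an LTR rerank parameter.
    // "myModel" is a hypothetical model name.
    static String rerankParam(String userQuery) {
        return "{!ltr model=myModel efi.user_query='" + userQuery + "'}";
    }

    // URL-encodes a parameter value (this Charset overload needs Java 10+).
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] variants = {
            "tv stand",      // both terms, as typed by the user
            "tv OR stand",   // explicit OR between the terms
            "\"tv stand\""   // quoted phrase (bigram) match
        };
        for (String v : variants) {
            String rq = rerankParam(v);
            System.out.println("rq=" + rq);
            System.out.println("encoded: rq=" + encode(rq));
        }
    }
}
```

Comparing the feature values returned for each variant on the same document should show whether term-level or phrase-level matching is being applied.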

>
> 2. Generating training data using clickstream
>
> Please advise on the use of clickstream data for training (in the absence of
> human judgements).
> Can we expect LTR to do a good job in terms of the weights learned when we use
> click data (implicit feedback data)?

This really depends on your use case. I would suggest taking a look
at this survey by Chuklin et al.:
clickmodels.weebly.com/uploads/5/2/2/5/52257029/mc2015-clickmodels.pdf

>
> 3. Newness challenge
> Generally, click data is good for popular items. Learning newness seems to be a
> challenge with this approach. Any thoughts here?

Not really, I'm afraid. It's an open problem; crowdsourcing may help.

> 4. Original score feature weight is still zero

Did you take a look at the original scores? There was a bug that
made Solr always return zero instead of the original score, so maybe
that's your problem.
The bug was recently fixed in
https://issues.apache.org/jira/browse/SOLR-11180. It would explain
why your weight is zero.

Best,
Diego





Re: get all tokens from TokenStream in my custom filter

2017-11-19 Thread Ahmet Arslan
 
Hi Kumar,
I checked the code base and I couldn't find a peek method either. However, I
found LookaheadTokenFilter, which may be useful to you.
Since this is a Lucene question, you may receive more answers on
the Lucene user list.
Ahmet


On Sunday, November 19, 2017, 10:16:21 PM GMT+3, kumar gaurav wrote:
 
Hi friends,
Thank you very much for your replies.
I still could not solve the problem.
Emir, I need all tokens of the query in the incrementToken() function, not only the current
token.
Modassar, if I do not end or close the stream, all tokens are blank and
only the last token is indexed.
Ahmet, I could not find a peek or advance method :(

Please help me, guys.

On Fri, Nov 17, 2017 at 10:10 PM, Ahmet Arslan wrote:

Hi Kumar,
If I am not wrong, I think there is a method named something like peek(2) or
advance(2). Some filters access tokens ahead and perform some logic.
Ahmet

On Wednesday, November 15, 2017, 10:50:55 PM GMT+3, kumar gaurav wrote:
 
Hi,

I need to get the full field value from the TokenStream in my custom filter class.

I am using this:

stream.reset();
while (stream.incrementToken()) {
    // charTermAttr is the stream's CharTermAttribute
    term += " " + charTermAttr.toString();
}
stream.end();
stream.close();

This ends the stream: no tokens are produced if I use this.

I want to get the full string without disrupting token creation.

Eric! Are you there? :) Anyone, please help.
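
One way to see the whole field value without ending the stream for downstream consumers is to buffer the tokens inside the filter and replay them, which is essentially what Lucene's CachingTokenFilter does. The sketch below is a minimal illustration under that assumption, not a tested drop-in solution: the class name BufferAllTokensFilter and the fullValue() accessor are hypothetical, and it joins terms with single spaces, so the result is the concatenated token text rather than the exact original field value.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.AttributeSource;

// Hypothetical filter: reads every token from its input on the first call,
// remembers them, then replays them one by one to downstream consumers.
public final class BufferAllTokensFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private final List<AttributeSource.State> buffered = new ArrayList<>();
    private Iterator<AttributeSource.State> replay;  // null until the input is buffered
    private String fullValue;

    public BufferAllTokensFilter(TokenStream input) {
        super(input);
    }

    // Space-joined text of all tokens; available after the first incrementToken().
    public String fullValue() {
        return fullValue;
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (replay == null) {
            // First call: drain the input, capturing the state of each token.
            StringBuilder sb = new StringBuilder();
            while (input.incrementToken()) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(termAtt.buffer(), 0, termAtt.length());
                buffered.add(captureState());
            }
            fullValue = sb.toString();
            replay = buffered.iterator();
        }
        if (!replay.hasNext()) {
            return false;
        }
        // Restore the next buffered token; fullValue can be used here.
        restoreState(replay.next());
        return true;
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        buffered.clear();
        replay = null;
        fullValue = null;
    }
}
```

The filter never calls end() or close() on its input; the consumer of the analysis chain still does that once, as usual. Buffering keeps every token of the field in memory, so this may be unsuitable for very large fields.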