Re: Cache for percentiles facets

2015-08-16 Thread Håvard Wahl Kongsgård
Hi, just a general question as I was unable to find any old posts relating
to stats/percentile/facets performance/cache settings.

I have been using Solr since version 4.0 and am now on the latest release, 5.2.1.

What I have done:

- Increase heap memory to 30gb

- Experimented with the cache settings

- Merged segments

- Used docvalues as filter

- Tried with ramdrive for index as well

- The field I calculate the percentile on is of type int; there seems to be a
big performance difference between int and float/decimal etc.

The database consists of multiple sets with 5 million rows. I calculate facet
stats for a field, filtered by those sets. My fields are indexed but not stored.

The queries are basic

curl http://localhost:8983/solr/demo/query -d 'rows=0&fq=set_id:id_of_set&q=*:*&
json.facet={
  by_something:{terms:{
    field:myfield,
    facet:{
      median_value:"percentile(myvalue_field,50)"
    }
  }}
}'


As a quick fix I created a cache in redis ;)
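
A minimal sketch of what such an external cache might look like, in Python. This is an illustration only, not the code actually used: the names facet_params, cache_key and FacetCache are hypothetical, and a plain dict stands in for Redis so the example runs without a server. The idea is to derive a stable key from the request parameters and only hit Solr on a miss.

```python
import hashlib
import json


def facet_params(set_id, bucket_field="myfield", value_field="myvalue_field"):
    """Build the request parameters for the median-per-bucket facet query."""
    facet = {
        "by_something": {
            "terms": {
                "field": bucket_field,
                "facet": {"median_value": f"percentile({value_field},50)"},
            }
        }
    }
    return {
        "q": "*:*",
        "fq": f"set_id:{set_id}",
        "rows": "0",
        "json.facet": json.dumps(facet, sort_keys=True),
    }


def cache_key(params):
    """Derive a stable key from the parameters, independent of dict order."""
    canonical = json.dumps(params, sort_keys=True)
    return "solr:facet:" + hashlib.sha1(canonical.encode("utf-8")).hexdigest()


class FacetCache:
    """In-memory stand-in for Redis: stores serialized responses by key."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(params) -> response dict (assumed)
        self._store = {}
        self.misses = 0

    def get(self, params):
        key = cache_key(params)
        if key not in self._store:
            # Cache miss: run the (slow) query once and keep the result.
            self.misses += 1
            self._store[key] = json.dumps(self._fetch(params))
        return json.loads(self._store[key])
```

With a real Redis client the dict lookups would become GET/SET calls on the same key. Cached entries go stale when the index changes, so they would need an expiry or an explicit flush after each re-index.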

-Håvard


On Sat, Aug 15, 2015 at 10:26 PM, Erick Erickson erickerick...@gmail.com
wrote:

 You have to provide a lot more info about your problem, including
 what you've tried, what your data looks like, etc.

 You might review:
 http://wiki.apache.org/solr/UsingMailingLists

 Best,
 Erick

 On Sat, Aug 15, 2015 at 10:27 AM, Håvard Wahl Kongsgård
 haavard.kongsga...@gmail.com wrote:
  Hi, I have tried various options to speed up percentile calculation for
  facets, but the internal Solr cache only sped up my queries from 22 to 19
  sec.

  I am using the new JSON facets: http://yonik.com/json-facet-api/
 
  Any tips for caching stats?
 
 
  -Håvard



Cache for percentiles facets

2015-08-15 Thread Håvard Wahl Kongsgård
Hi, I have tried various options to speed up percentile calculation for
facets, but the internal Solr cache only sped up my queries from 22 to 19
sec.

I am using the new JSON facets: http://yonik.com/json-facet-api/

Any tips for caching stats?


-Håvard


Boosting on field-not-empty

2014-10-30 Thread Håvard Wahl Kongsgård
Hi, a simple question: how do I boost documents where a field is not empty?
For some reason Solr (4.6) returns rows with empty fields first (even though
the fields are not part of the search query).

I came across this old thread, but it offers no solution:
http://grokbase.com/t/lucene/solr-user/125e4yenha/boosting-on-field-empty-or-not



-- 
Håvard Wahl Kongsgård


Re: Automating Solr

2014-10-30 Thread Håvard Wahl Kongsgård
Then you have to run it again and again
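
Scheduling the repeated run is typically a one-line crontab entry. A small Python sketch that builds such an entry follows; the script path is hypothetical, and crontab_line is a helper invented for this illustration, not part of Solr or cron.

```python
import shlex


def crontab_line(command, hour=3, minute=0, every_days=2):
    """Return a crontab entry running `command` at hour:minute every N days.

    Uses the standard five-field cron format. Note that '*/N' in the
    day-of-month field restarts counting at the beginning of each month.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour or minute out of range")
    if not 1 <= every_days <= 31:
        raise ValueError("every_days must be between 1 and 31")
    day_field = "*" if every_days == 1 else f"*/{every_days}"
    return f"{minute} {hour} {day_field} * * {command}"


# Hypothetical loader script that pushes the MySQL data into Solr.
loader = shlex.join(["/usr/bin/python3", "/opt/scripts/load_mysql_into_solr.py"])
line = crontab_line(loader)
```

The resulting line can be pasted into `crontab -e`; every_days=2 roughly matches data that changes every few days.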
On 30 Oct 2014 at 19:18, Craig Hoffman mountain@gmail.com wrote:

 The data gets into Solr via MySQL script.
 --
 Craig Hoffman
 w: http://www.craighoffmanphotography.com
 FB: www.facebook.com/CraigHoffmanPhotography
 TW: https://twitter.com/craiglhoffman

  On Oct 30, 2014, at 12:11 PM, Craig Hoffman mountain@gmail.com
 wrote:
 
  Right, of course. The data changes every few days. According to this
  article, you can run a cron job to create a new index:
  http://www.finalconcept.com.au/article/view/apache-solr-hints-and-tips
 
  On Thu, Oct 30, 2014 at 12:04 PM, Alexandre Rafalovitch
  arafa...@gmail.com wrote:
  You don't reindex Solr. You reindex data into Solr. So, this depends
  on where your data is coming from and how often it changes. If the data
  does not change, there is no point re-indexing it. And how do you get the
  data into Solr in the first place?
 
  Regards,
 Alex.
  Personal: http://www.outerthoughts.com/ and @arafalov
  Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
  Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
 
 
  On 30 October 2014 13:58, Craig Hoffman mountain@gmail.com wrote:
   Simple question:
   What is the best way to automate re-indexing Solr? Set up a cron job / curl
   script?
  
   Thanks,
   Craig
   --
   Craig Hoffman
   w: http://www.craighoffmanphotography.com
   FB: www.facebook.com/CraigHoffmanPhotography
   TW: https://twitter.com/craiglhoffman

  --
  __
  Craig Hoffman
  iChat / AIM:mountain.do
  __




Re: Boosting on field-not-empty

2014-10-30 Thread Håvard Wahl Kongsgård
Thanks :)

On Thu, Oct 30, 2014 at 7:49 PM, Ramzi Alqrainy ramzi.alqra...@gmail.com
wrote:

 You can use a FunctionQuery, which lets you use the actual value of a
 field, and functions of those fields, in a relevancy score.

 Two functions will help you:

 *exists*

 exists(field|function) returns true if a value exists for a given document.

 Example use: exists(myField) will return true if myField has a value, while
 exists(query({!v='year:2012'})) will return true for docs with year=2012.

 *if*

 if(expression,trueValue,falseValue) emits trueValue if the expression is
 true, else falseValue. An expression can be any function which outputs
 boolean values, or even functions returning numeric values, in which case
 value 0 will be interpreted as false, or strings, in which case empty
 string is interpreted as false.

 Example use: if(exists(myField),100,0) returns 100 if myField exists

 *Solution:*

 Use the function in a parameter that is explicitly for specifying functions,
 such as the EDisMax query parser's boost param, or the DisMax query parser's
 bf (boost function) parameter. (Note that the bf parameter actually takes a
 list of function queries separated by white space, each with an optional
 boost. Make sure you eliminate any internal white space in single function
 queries when using bf.) For example:

 
 http://lucene.472066.n3.nabble.com/file/n4166709/Screen_Shot_2014-10-30_at_9.png
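
 Independently of the screenshot, here is a minimal Python sketch of how the two functions might be combined in a request. The helper name not_empty_boost_params, the qf field "text" and the query "solr" are assumptions for illustration; only defType, q, qf, bf, exists() and if() come from Solr itself.

```python
from urllib.parse import urlencode


def not_empty_boost_params(user_query, field, bonus=100, query_fields="text"):
    """Build parameters that add `bonus` to the score of documents
    which have a value in `field`, via the additive bf parameter."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": query_fields,  # hypothetical default search field
        # No internal white space: bf splits its value on white space.
        "bf": f"if(exists({field}),{bonus},0)",
    }


params = not_empty_boost_params("solr", "myField")
query_string = urlencode(params)  # ready to append to .../select?
```

 Since bf is additive, a falseValue of 0 leaves documents without the field unchanged. With EDisMax's multiplicative boost parameter, a falseValue of 0 would multiply their score by zero, so 1 is the safer choice there.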
 




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Boosting-on-field-not-empty-tp4166692p4166709.html
 Sent from the Solr - User mailing list archive at Nabble.com.