Re: paging size in SOLR

2011-08-19 Thread jame vaalet
1. What does this specify?

<queryResultCache class="solr.LRUCache"
                  size="${queryResultCacheSize:0}"
                  initialSize="${queryResultCacheInitialSize:0}"
                  autowarmCount="${queryResultCacheRows:0}"/>

2. When I say queryResultCacheSize:512, does it mean 512 queries can be
cached, or that 512 bytes are reserved for caching?

Can someone please give me an answer?



-- 

-JAME


Re: paging size in SOLR

2011-08-19 Thread Erick Erickson
1. I don't know; where is it coming from? It looks like you've done a stats
call on a freshly opened server.

2. 512 entries (i.e., results for 512 queries). Each entry is
queryResultWindowSize doc IDs.

Best
Erick
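
For what it's worth, the ${...} tokens in that snippet are Solr's property
substitution syntax: ${queryResultCacheSize:0} resolves to the system property
queryResultCacheSize, and falls back to 0 (no caching) when the property is
unset. A minimal sketch of the same cache with concrete, purely illustrative
values:

  <queryResultCache class="solr.LRUCache"
                    size="512"
                    initialSize="512"
                    autowarmCount="0"/>

Keeping the property-based form instead, starting the JVM with something like
-DqueryResultCacheSize=512 -DqueryResultCacheInitialSize=512 would have the
same effect.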



Re: paging size in SOLR

2011-08-14 Thread jame vaalet
Speaking about page sizes, what is the optimum page size that should be
retrieved each time?
I understand it depends upon the data you are fetching back from each hit
document. But let's say that whenever a document is hit, I am fetching back
100 bytes worth of data from each of those docs in the indexes (along with
the Solr response statements).
That will make 100*x bytes worth of data in each page, if x is the page size.
What is the optimum value of this x that Solr can return each time without
running into exceptions?

-- 

-JAME


Re: paging size in SOLR

2011-08-14 Thread Erick Erickson
There isn't an optimum page size that I know of, it'll vary with lots of
stuff, not the least of which is whatever servlet container limits there are.

But I suspect you can get quite a few (1000s) without
too much problem, and you can always use the JSON response
writer to pack in more pages with less overhead.

You pretty much have to try it and see.

Best
Erick
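
To make the experiment concrete: it is just a matter of varying the rows
parameter and watching response time and payload size. The host, port and
handler path below are assumptions for illustration, not anything from this
thread:

  http://localhost:8983/solr/select?q=*:*&start=0&rows=1000&wt=json
  http://localhost:8983/solr/select?q=*:*&start=0&rows=5000&wt=json

wt=json selects the JSON response writer mentioned above, which carries less
markup per document than the default XML writer.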



Re: paging size in SOLR

2011-08-14 Thread jame vaalet
Thanks Erick. That means it depends upon the memory allocated to the JVM.

Going back to the queryResultCache factor, I have got this doubt:
say I have got 10 threads with 10 different queries, and each of them in
parallel is searching the same index with millions of docs in it
(multi-sharded).
Now each of the queries has a large number of results, hence they have all
got to be paged.
Which threads' (queries') result sets will be cached, so that subsequent
pages can be retrieved quickly?

-- 

-JAME


Re: paging size in SOLR

2011-08-14 Thread Erick Erickson
As many results will be cached as you ask. See solrconfig.xml,
the queryResultCache. This cache is essentially a map of queries
and result document IDs. The number of doc IDs cached for
each query is controlled by queryResultWindowSize in
solrconfig.xml

Best
Erick
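
A sketch of the two relevant solrconfig.xml entries, with illustrative values
rather than recommendations:

  <queryResultCache class="solr.LRUCache"
                    size="512"
                    initialSize="512"
                    autowarmCount="0"/>
  <queryResultWindowSize>50</queryResultWindowSize>

With settings like these, results for up to 512 distinct queries are kept, and
a request for the first page of a query caches a window of the first 50 doc
IDs for that query, so the next few pages are answered from memory.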




Re: paging size in SOLR

2011-08-14 Thread jame vaalet
My queryResultCache size = 0 and queryResultWindowSize = 50.
Does this mean that I am not caching any results?

-- 

-JAME


Re: paging size in SOLR

2011-08-14 Thread Erick Erickson
Yep.

queryResultWindowSize in solrconfig.xml.

Best
Erick




Re: paging size in SOLR

2011-08-13 Thread Erick Erickson
Jame:

You control the number via settings in solrconfig.xml, so it's
up to you.

Jonathan:
Hmmm, that seems right; after all, the deep paging penalty is really
about keeping a large sorted array in memory (each request has to collect
the top start+rows documents internally), but at least you only
pay it once per 10,000, rather than 100 times (assuming page size is
100)...

Best
Erick




Re: paging size in SOLR

2011-08-10 Thread Erick Erickson
Well, if you really want to, you can specify start=0 and rows=10000 and
get them all back at once.

You can do page-by-page by incrementing the start parameter as you
indicated.

You can keep from re-executing the search by setting your queryResultCache
appropriately, but this affects all searches so might be an issue.

Best
Erick
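
A minimal sketch of that loop in Python (standard library only; the endpoint
URL, the query and the page size are assumptions for illustration):

  import json
  import urllib.request
  from urllib.parse import urlencode

  BASE = "http://localhost:8983/solr/select"  # assumed Solr endpoint
  PAGE_SIZE = 1000

  def fetch_page(start):
      # Request one page of results as JSON, starting at offset 'start'.
      params = urlencode({"q": "*:*", "start": start,
                          "rows": PAGE_SIZE, "wt": "json"})
      with urllib.request.urlopen(BASE + "?" + params) as resp:
          return json.load(resp)

  start = 0
  while True:
      page = fetch_page(start)
      docs = page["response"]["docs"]
      for doc in docs:
          pass  # process each document here
      start += PAGE_SIZE
      if not docs or start >= page["response"]["numFound"]:
          break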

On Wed, Aug 10, 2011 at 9:09 AM, jame vaalet jamevaa...@gmail.com wrote:
 hi,
 i want to retrieve all the data from Solr (say 10,000 ids) and my page size
 is 1000.
 how do I get back the data (pages) one after the other? Do I have to
 increment the start value each time by the page size, starting from 0, and
 iterate?
 In this case, am I querying the index 10 times instead of once, or after the
 first query will the result be cached somewhere for the subsequent pages?


 JAME VAALET



Re: paging size in SOLR

2011-08-10 Thread simon
Worth remembering there are some performance penalties with deep paging
if you use the page-by-page approach. It may not be too much of a problem
if you really are only looking to retrieve 10K docs.

-Simon




RE: paging size in SOLR

2011-08-10 Thread Jonathan Rochkind
I would imagine the performance penalties with deep paging will ALSO be there
if you just ask for 10,000 rows all at once, though, instead of in, say,
100-row paged batches. Yes? No?




Re: paging size in SOLR

2011-08-10 Thread jame vaalet
When you say queryResultCache, does it cache only n results for the last
query, or for more than one query?






-- 

-JAME