Mikhail,
Yes, +1.
This question comes up a few times a year. Grant created a JIRA issue
for this many moons ago.
https://issues.apache.org/jira/browse/LUCENE-2127
https://issues.apache.org/jira/browse/SOLR-1726
Otis
--
Solr ElasticSearch Support -- http://sematext.com/
Performance Monitoring
Roman,
Can you disclose how that streaming writer works? What does it stream,
a docList or a docSet?
Thanks
On Wed, Jul 24, 2013 at 5:57 AM, Roman Chyla roman.ch...@gmail.com wrote:
Hello Matt,
You can consider writing a batch processing handler, which receives a query
and, instead of sending results back, writes them into a file which is then
available for streaming ...
Mikhail,
It is a slightly hacked JSONWriter - actually, while poking around, I have
discovered that dumping big hitsets would be possible - the main hurdle
right now is that the writer expects to receive documents with fields
loaded, but if it received something that loads docs lazily, you could ...
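To illustrate the lazy-loading idea Roman hints at (a minimal self-contained sketch; the names and the `Iterator`/`IntFunction` shape are hypothetical, not Solr's actual `QueryResponseWriter` API): instead of handing the writer a fully materialized document list, hand it something that loads each document only at the moment it is serialized, so memory stays flat no matter how big the hitset is.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical sketch: a JSON-ish writer that pulls documents lazily
// instead of requiring them preloaded with all fields.
public class LazyJsonDump {

    // Writes one JSON array of {"id": n} objects, loading each doc
    // only when the writer reaches it.
    public static void dump(Writer out, Iterator<Integer> docIds,
                            IntFunction<Map<String, Object>> loadDoc) throws IOException {
        out.write("[");
        boolean first = true;
        while (docIds.hasNext()) {
            Map<String, Object> doc = loadDoc.apply(docIds.next()); // lazy load happens here
            if (!first) out.write(",");
            first = false;
            out.write("{\"id\":" + doc.get("id") + "}");
        }
        out.write("]");
    }

    public static String demo() throws IOException {
        StringWriter sw = new StringWriter();
        // The lambda stands in for a lazy index lookup (searcher.doc(id) in real life).
        dump(sw, List.of(1, 2, 3).iterator(), id -> Map.of("id", id));
        return sw.toString();
    }
}
```

The point of the sketch is only the shape of the contract: the writer never sees more than one loaded document at a time.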
On Tue, Jul 23, 2013 at 10:05 PM, Matt Lieber mlie...@impetus.com wrote:
That sounds like a satisfactory solution for the time being -
I am assuming you dump the data from Solr in a csv format?
JSON
How did you implement the streaming processor? (What tool did you use for
this? Not familiar with that.)
fwiw,
I did a prototype with the following differences:
- it streams straight to the socket output stream
- it streams on the fly during collection, without the need to store a
bitset.
It might be useful for some limited, extreme use cases. Is there anyone interested?
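The difference Mikhail describes can be sketched in a few lines (a toy model with hypothetical names, not Lucene's actual `Collector` API): one version buffers every hit in a bitset and writes only after collection finishes; the other pushes each doc id to the output stream the moment it is collected, retaining nothing between hits.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.util.BitSet;

// Toy contrast between buffering hits and streaming them during collection.
public class StreamingCollectDemo {

    // Buffered: remember every hit in a BitSet, emit only at the end.
    public static String buffered(int[] hits) {
        BitSet bits = new BitSet();
        for (int doc : hits) bits.set(doc);            // whole hitset held in memory
        StringBuilder sb = new StringBuilder();
        for (int d = bits.nextSetBit(0); d >= 0; d = bits.nextSetBit(d + 1)) {
            sb.append(d).append('\n');
        }
        return sb.toString();
    }

    // Streaming: push each doc id straight to the output during collection.
    public static String streaming(int[] hits) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // stands in for the socket
        PrintStream ps = new PrintStream(out);
        for (int doc : hits) ps.print(doc + "\n");     // emitted immediately, nothing retained
        ps.flush();
        return out.toString();
    }
}
```

One design consequence worth noting: the bitset version implicitly emits ids in sorted order, while the streaming version preserves collection order; the two agree only when docs are already collected in ascending id order.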
On Wed, Jul 24, 2013 at 7:19 PM, Roman
: Subject: Processing a lot of results in Solr
: Message-ID: d57c2b719b792f428beca7b0096c88e22c0...@mail1.impetus.co.in
: In-Reply-To: 1374612243070-4079869.p...@n3.nabble.com
https://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a ...
Hi Matt,
This feature is commonly known as deep paging, and Lucene and Solr have
issues with it ... take a look at
http://solr.pl/en/2011/07/18/deep-paging-problem/ as a potential
starting point; it uses filters to bucketize a result set into
sub result sets.
Cheers,
Tim
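To make the deep-paging cost Tim mentions concrete (a toy cost model for illustration, not Solr internals): a request with start=k&rows=r has to collect and rank the top k+r hits, so walking all N results page by page does quadratic work, while bucketizing the result set with filter queries keeps every individual request shallow.

```java
// Toy model of deep-paging cost (illustration only, not Solr internals).
public class DeepPagingCost {

    // Total docs ranked to read all n results page by page:
    // each request at offset k must rank k + rows docs, so earlier
    // results are re-ranked on every deeper page.
    public static long totalRankedPaging(long n, long rows) {
        long total = 0;
        for (long offset = 0; offset < n; offset += rows) {
            total += offset + rows;   // grows with depth -> quadratic overall
        }
        return total;
    }

    // With `buckets` equal filter buckets, each bucket is paged
    // independently and never gets deep.
    public static long totalRankedBucketed(long n, long rows, long buckets) {
        long perBucket = n / buckets;
        return buckets * totalRankedPaging(perBucket, rows);
    }
}
```

For 1000 results read 100 at a time, straight paging ranks 5500 docs in total, while splitting into 10 filter buckets ranks only 1000; the gap widens rapidly as the result set grows.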
On Tue, Jul 23,
Hello Matt,
You can consider writing a batch processing handler, which receives a query
and, instead of sending results back, writes them into a file which is
then available for streaming (it has its own UUID). I am dumping many GBs
of data from Solr in a few minutes - your query + streaming ...
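A hedged sketch of the flow Roman describes (hypothetical names; his actual handler code is not shown in the thread): run the query, write the results to a file keyed by a fresh UUID, hand the UUID back to the client, and let a later request stream that file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.UUID;

// Hypothetical sketch of a batch dump handler: results are written to a
// file named by a UUID, which the client uses later to stream the dump.
public class BatchDumpHandler {
    private final Path dir;

    public BatchDumpHandler(Path dir) { this.dir = dir; }

    // "Handle" a query: write each result line to <uuid>.json and
    // return the UUID so the caller can fetch the file when ready.
    public String dump(List<String> results) throws IOException {
        String uuid = UUID.randomUUID().toString();
        Files.write(dir.resolve(uuid + ".json"), results);
        return uuid;
    }

    // Later request: stream the previously written dump back.
    public List<String> stream(String uuid) throws IOException {
        return Files.readAllLines(dir.resolve(uuid + ".json"));
    }
}
```

Decoupling query execution from delivery this way means the expensive search happens once, and the client can pull the dump at its own pace (or retry) without re-running the query.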
You say it takes only a few minutes to dump the data - how long does it ...