Hello Raul.
Thanks for your help. I've had a look at the test code, and I understand how the batch size should make it possible to read the large result of a Mongo query. But unlike your example, in my case I have no idea of the total count of matching documents, so I cannot divide the reading process and loop over the paged results. I've tested the "count" operation, but just as the documentation says, it returns the total number of documents in the collection and doesn't use the query to restrict the count.

If I'm not wrong, there is not yet any EIP construct for a "<WHILE>" style block, and "<LOOP>" needs a predefined value, which requires the total number of documents to be evaluated beforehand...

Any idea how to retrieve that information?

Thanks again.

Regards.

Ephemeris Lappis

On 19/04/2014 03:27, Raul Kripalani [via Camel] wrote:
> There are unit tests that showcase this functionality of the component:
>
> https://github.com/apache/camel/blob/e7563a7611667fb9b449d8a7f8c3fa7e3a0524bd/components/camel-mongodb/src/test/java/org/apache/camel/component/mongodb/MongoDbFindOperationTest.java#L90
>
> I think we could enhance it anyway to enable returning the DBCursor
> directly to the route, so you can then handle the result in any manner you
> want, e.g. writing it to an OutputStream or whatever.
>
> I've created [1] to track this new feature.
>
> [1] https://issues.apache.org/jira/browse/CAMEL-7378
>
> Regards,
>
> *Raúl Kripalani*
> Apache Camel PMC Member & Committer | Enterprise Architect, Open Source
> Integration specialist
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> http://blog.raulkr.net | twitter: @raulvk
>
> On Fri, Apr 18, 2014 at 5:30 AM, Ephemeris Lappis <[hidden email]> wrote:
>
> > Hello.
> >
> > I have tried different options, like batch size, to evaluate some
> > scenarios and optimize some cases.
> > But for cases with a really big volume of data, retrieving them all in
> > memory always leads to an error.
> >
> > Our current case should be something as simple as:
> > A first route:
> > - receive a SOAP request from a web client with some kind of filter form
> >   to select data
> > - push the XML request to an active queue, and send back a simple SOAP
> >   response
> > A main route:
> > - get back the XML request from the queue
> > - make a JSON body to set the query from the XML request (10 or 15 lines
> >   of Groovy for example)
> > - set a header to select the needed collection's attributes
> > - call Mongo findAll
> > - marshal the result to CSV
> > - write the result into a file
> > - send a mail to the caller to inform them that the job is done
> >
> > This may be done with a very simple blueprint with very few lines and no
> > complexity at all.
> >
> > Do you mean that the only way to process a big volume of Mongo data is
> > to set up a "smarter" algorithm like:
> > - build a first request to count the data
> > - loop over the data set, reading batch parts using "skip" and "page size"
> > - write the paged results, appending them to the file
> > - etc.?
> >
> > Do you have an example of a paging process?
> >
> > Thanks for your help.
> >
> > Ephemeris Lappis
> >
> > On 18/04/2014 02:52, Raul Kripalani [via Camel] wrote:
> > > Hi,
> > >
> > > We use Mongo cursors to read from the DB. But a DBCursor is not
> > > something we can return to the route because not all technologies
> > > support Streams, Cursors, Chunking, etc. For example, how would you go
> > > about returning a DBCursor to a JMS endpoint?
> > >
> > > That's why we offer the skipping and limiting option so you can
> > > perform pagination in such scenarios. You can also specify a batch
> > > size. Take a look at the component page for further details.
> > >
> > > Hope that helps!
> > > Raúl.
> > >
> > > > On 17 Apr 2014, at 15:41, Ephemeris Lappis <[hidden email]> wrote:
> > > >
> > > > Hello.
> > > >
> > > > After some tests, it seems that the Camel MongoDB "findAll" operation
> > > > tries to load all the matching queried data into memory before
> > > > processing them. With collections containing tens of millions of
> > > > documents, this naturally leads to OutOfMemoryErrors...
> > > >
> > > > Can this component use cursors to read the input data and stream them?
> > > >
> > > > Any ideas?
> > > >
> > > > Thanks in advance.
> > > >
> > > > Regards.
> > > >
> > > > --
> > > > View this message in context:
> > > > http://camel.465427.n5.nabble.com/Does-Camel-MongoDB-use-cursors-on-findAll-tp5750352.html
> > > > Sent from the Camel - Users mailing list archive at Nabble.com.
> >
> > --
> > View this message in context:
> > http://camel.465427.n5.nabble.com/Does-Camel-MongoDB-use-cursors-on-findAll-tp5750352p5750357.html
> > Sent from the Camel - Users mailing list archive at Nabble.com.
--
View this message in context: http://camel.465427.n5.nabble.com/Does-Camel-MongoDB-use-cursors-on-findAll-tp5750352p5750369.html
Sent from the Camel - Users mailing list archive at Nabble.com.
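
A note on the paging question raised in this thread: a skip/limit loop does not strictly need the total count up front, because it can stop as soon as a page comes back shorter than the page size. The following is only a minimal sketch of that idea in plain Java, not Camel or MongoDB driver code. `PagedReader`, `readAll`, `fetchPage` and `slice` are hypothetical names introduced for illustration, and `fetchPage` stands in for one "findAll" call made with a given skip and limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

final class PagedReader {

    /**
     * Reads pages until one comes back shorter than pageSize, so the total
     * count is never needed. fetchPage(skip, limit) is a hypothetical
     * stand-in for a single paged query; sink receives each non-empty page
     * (for example to append it to a file). Returns the number of items read.
     */
    static <T> long readAll(BiFunction<Integer, Integer, List<T>> fetchPage,
                            int pageSize,
                            Consumer<List<T>> sink) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long total = 0;
        int skip = 0;
        while (true) {
            List<T> page = fetchPage.apply(skip, pageSize);
            if (!page.isEmpty()) {
                sink.accept(page);
                total += page.size();
            }
            // A short (or empty) page means the result set is exhausted.
            // If the total is an exact multiple of pageSize, this costs one
            // extra query that returns an empty page.
            if (page.size() < pageSize) {
                return total;
            }
            skip += pageSize;
        }
    }

    /** In-memory stand-in for a query executed with skip and limit. */
    static List<Integer> slice(List<Integer> data, int skip, int limit) {
        int from = Math.min(skip, data.size());
        int to = Math.min(from + limit, data.size());
        return new ArrayList<>(data.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            data.add(i);
        }
        List<Integer> written = new ArrayList<>();
        long count = PagedReader.<Integer>readAll(
                (skip, limit) -> slice(data, skip, limit), 10, written::addAll);
        System.out.println("read " + count + " items");
    }
}
```

Two caveats on this sketch: paging by skip only gives consistent pages if the query uses a stable sort order, and in MongoDB the cost of a skip grows with the number of skipped documents, so for very large result sets a cursor-based approach such as the one tracked in CAMEL-7378 may be preferable.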
