Colleen,

Is there work in progress, planned, or on the to-do list for the near future to make the collector behavior configurable in this respect? We did a bit of performance testing, and it showed that doing no transforms is always fastest, even though the collector then runs the transactions in sequence instead of in parallel.

Kind regards,
Geert

> -----Original message-----
> From: [email protected] [mailto:general-[email protected]] On behalf of Colleen Whitney
> Sent: Wednesday, October 24, 2012 15:21
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] info studio using CPU
>
> To avoid flooding the task queue, particularly in cases where transforms will also
> be happening. (It does use more than one thread except on tiny batches, but not
> many, and I've often wondered if the number of loading threads should be made
> controllable.) Ticket updates don't contend for a single document, but rather
> write small documents; ticket status is done by query across them.
>
> Sent from my iPhone
>
> On Oct 23, 2012, at 12:35 PM, "Geert Josten" <[email protected]> wrote:
>
> > Hi Colleen,
> >
> > Interesting. Why doesn't the collector transaction-manager spawn the
> > transactions, instead of invoking them? Afraid that updating the ticket
> > along the way from multiple threads will interfere with each other?
> >
> > Kind regards,
> > Geert
> >
> >> -----Original message-----
> >> From: [email protected] [mailto:general-[email protected]] On behalf of Colleen Whitney
> >> Sent: Tuesday, October 23, 2012 17:40
> >> To: MarkLogic Developer Discussion
> >> Subject: Re: [MarkLogic Dev General] info studio using CPU
> >>
> >> Yes, that's right. Transformations take advantage of the task server. The
> >> collector does not.
> >>
> >> ________________________________________
> >> From: [email protected] [general-[email protected]] On Behalf Of Steiner, David J. (LNG-DAY)
> >> [[email protected]]
> >> Sent: Tuesday, October 23, 2012 8:29 AM
> >> To: MarkLogic Developer Discussion
> >> Subject: Re: [MarkLogic Dev General] info studio using CPU
> >>
> >> Doesn't appear that the OS is swapping.
> >>
> >> It appears that there are 16 task server threads.
> >>
> >> Upon further "watching", it appears that just the collector may not utilize
> >> threads? It appears that once the transforming starts, all CPUs become
> >> engaged.
> >>
> >> David
> >>
> >> -----Original Message-----
> >> From: [email protected] [mailto:general-[email protected]] On Behalf Of Michael Blakeley
> >> Sent: Tuesday, October 23, 2012 11:23 AM
> >> To: MarkLogic Developer Discussion
> >> Cc: MarkLogic Developer Discussion
> >> Subject: Re: [MarkLogic Dev General] info studio using CPU
> >>
> >> Check the OS metrics. If RAM is maxed out, does that mean the OS is swapping?
> >> If so, it's the swap disk that is the bottleneck.
> >>
> >> If you can't find an OS bottleneck... How many task server threads are
> >> configured? I think the default is 4. Adding more threads won't help if the
> >> system is swapping or otherwise at its limits though.
> >>
> >> -- Mike
> >>
> >> On Oct 23, 2012, at 7:55, "Steiner, David J. (LNG-DAY)"
> >> <[email protected]> wrote:
> >>
> >>> Using ML 6.0-1.1.
> >>>
> >>> In Information Studio, I'm using a CSV collector, to process hundreds of CSV
> >>> files. I'm also doing a transform to pull each row out of the CSV and write it
> >>> as an individual document into another DB (actually, a naked property, but I
> >>> don't think that matters).
> >>>
> >>> The files are all under 50MB (wasn't sure if that 64MB limit still existed).
> >>>
> >>> It seems like only one CPU is being used and we have 8 available. RAM (24GB)
> >>> is maxed out. It took 72 minutes to process 20 files.
> >>>
> >>> Is Info Studio specifically not utilizing more CPU because all of the RAM is
> >>> already being used?
> >>>
> >>> Ideally, I guess, I'd like for Info Studio to be able to take advantage of all
> >>> CPUs while ingesting. I'm thinking the ingestion where CSV is being translated
> >>> to XML is the intense part. The "splitting" out and "document" (property)
> >>> insert shouldn't be as intense?
> >>>
> >>> Thanks,
> >>> David
> >>> _______________________________________________
> >>> General mailing list
> >>> [email protected]
> >>> http://developer.marklogic.com/mailman/listinfo/general
