At this point, nothing definite is planned, AFAIK.

Sent from my iPhone

On Oct 31, 2012, at 7:32 AM, "Geert Josten" <[email protected]> wrote:

> Colleen,
>
> Is there work in progress, planned, or on the to-do list for the near future to make
> the collector behavior configurable in this respect? We did a bit of
> performance testing, and that showed that doing no transforms is always
> fastest, even though it runs the transactions in sequence instead of in
> parallel.
>
> Kind regards,
> Geert
>
>> -----Original Message-----
>> From: [email protected] [mailto:general-[email protected]] On Behalf Of Colleen Whitney
>> Sent: Wednesday, October 24, 2012 15:21
>> To: MarkLogic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>>
>> To avoid flooding the task queue, particularly in cases where transforms will also
>> be happening. (It does use more than one thread except on tiny batches, but not
>> many, and I've often wondered if the number of loading threads should be made
>> controllable.) Ticket updates don't contend for a single document, but rather
>> write small documents; ticket status is done by query across them.
>>
>> Sent from my iPhone
>>
>> On Oct 23, 2012, at 12:35 PM, "Geert Josten" <[email protected]> wrote:
>>
>>> Hi Colleen,
>>>
>>> Interesting. Why doesn't the collector transaction-manager spawn the
>>> transactions, instead of invoking them? Afraid that updating the ticket
>>> along the way from multiple threads will interfere with each other?
>>>
>>> Kind regards,
>>> Geert
>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:general-[email protected]] On Behalf Of Colleen Whitney
>>>> Sent: Tuesday, October 23, 2012 17:40
>>>> To: MarkLogic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>>>>
>>>> Yes, that's right. Transformations take advantage of the task server. The
>>>> collector does not.
>>>>
>>>> ________________________________________
>>>> From: [email protected] [general-[email protected]] On Behalf Of Steiner, David J. (LNG-DAY)
>>>> [[email protected]]
>>>> Sent: Tuesday, October 23, 2012 8:29 AM
>>>> To: MarkLogic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>>>>
>>>> Doesn't appear that the OS is swapping.
>>>>
>>>> It appears that there are 16 task server threads.
>>>>
>>>> Upon further "watching", it appears that just the collector may not utilize
>>>> threads? It appears that once the transforming starts, all CPUs become
>>>> engaged.
>>>>
>>>> David
>>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:general-[email protected]] On Behalf Of Michael Blakeley
>>>> Sent: Tuesday, October 23, 2012 11:23 AM
>>>> To: MarkLogic Developer Discussion
>>>> Cc: MarkLogic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>>>>
>>>> Check the OS metrics. If RAM is maxed out, does that mean the OS is swapping?
>>>> If so, it's the swap disk that is the bottleneck.
>>>>
>>>> If you can't find an OS bottleneck... How many task server threads are
>>>> configured? I think the default is 4. Adding more threads won't help if the system
>>>> is swapping or otherwise at its limits though.
>>>>
>>>> -- Mike
>>>>
>>>> On Oct 23, 2012, at 7:55, "Steiner, David J. (LNG-DAY)"
>>>> <[email protected]> wrote:
>>>>
>>>>> Using ML 6.0-1.1.
>>>>>
>>>>> In Information Studio, I'm using a CSV collector to process hundreds of CSV
>>>>> files. I'm also doing a transform to pull each row out of the CSV and write it as
>>>>> an individual document into another DB (actually, a naked property, but I don't
>>>>> think that matters).
>>>>>
>>>>> The files are all under 50MB (wasn't sure if that 64MB limit still existed).
>>>>>
>>>>> It seems like only one CPU is being used and we have 8 available. RAM (24GB)
>>>>> is maxed out. It took 72 minutes to process 20 files.
>>>>>
>>>>> Is Info Studio specifically not utilizing more CPU because all of the RAM is
>>>>> already being used?
>>>>>
>>>>> Ideally, I guess, I'd like for Info Studio to be able to take advantage of all CPUs
>>>>> while ingesting. I'm thinking the ingestion where CSV is being translated to XML
>>>>> is the intense part. The "splitting" out and "document" (property) insert
>>>>> shouldn't be as intense?
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> _______________________________________________
>>>>> General mailing list
>>>>> [email protected]
>>>>> http://developer.marklogic.com/mailman/listinfo/general
