At this point, nothing definite is planned, AFAIK.

Sent from my iPhone

On Oct 31, 2012, at 7:32 AM, "Geert Josten" <[email protected]> wrote:

> Colleen,
> 
> Is there work in progress, planned, or on the to-do list for the near
> future to make the collector behavior configurable in this respect? We did
> a bit of performance testing, and it showed that doing no transforms is
> always fastest, even though it runs the transactions in sequence instead of
> in parallel.
> 
> Kind regards,
> Geert
> 
>> -----Original Message-----
>> From: [email protected] [mailto:general-
>> [email protected]] On Behalf Of Colleen Whitney
>> Sent: Wednesday, October 24, 2012 15:21
>> To: MarkLogic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>> 
>> To avoid flooding the task queue, particularly in cases where transforms
>> will also be happening. (It does use more than one thread except on tiny
>> batches, but not many, and I've often wondered if number of loading
>> threads should be made controllable.) Ticket updates don't contend for a
>> single document, but rather write small documents; ticket status is done
>> by query across them.
>> 
>> 
>> Sent from my iPhone
>> 
>> On Oct 23, 2012, at 12:35 PM, "Geert Josten" <[email protected]> wrote:
>> 
>>> Hi Colleen,
>>> 
>>> Interesting. Why doesn't the collector transaction-manager spawn the
>>> transactions, instead of invoking them? Are you afraid that ticket
>>> updates made along the way from multiple threads will interfere with
>>> each other?
>>> 
>>> Kind regards,
>>> Geert
>>> 
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:general-
>>>> [email protected]] On Behalf Of Colleen Whitney
>>>> Sent: Tuesday, October 23, 2012 17:40
>>>> To: MarkLogic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>>>> 
>>>> Yes, that's right.  Transformations take advantage of the task server.
>>>> The collector does not.
>>>> 
>>>> ________________________________________
>>>> From: [email protected] [general-
>>>> [email protected]] On Behalf Of Steiner, David J. (LNG-DAY)
>>>> [[email protected]]
>>>> Sent: Tuesday, October 23, 2012 8:29 AM
>>>> To: MarkLogic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>>>> 
>>>> Doesn't appear that the OS is swapping.
>>>> 
>>>> It appears that there are 16 task server threads.
>>>> 
>>>> Upon further "watching", it appears that just the collector may not
>>>> utilize threads?  It appears that once the transforming starts, all CPUs
>>>> become engaged.
>>>> 
>>>> David
>>>> 
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:general-
>>>> [email protected]] On Behalf Of Michael Blakeley
>>>> Sent: Tuesday, October 23, 2012 11:23 AM
>>>> To: MarkLogic Developer Discussion
>>>> Cc: MarkLogic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] info studio using CPU
>>>> 
>>>> Check the OS metrics. If RAM is maxed out, does that mean the OS is
>>>> swapping? If so, it's the swap disk that is the bottleneck.
>>>> 
>>>> If you can't find an OS bottleneck... How many task server threads are
>>>> configured? I think the default is 4. Adding more threads won't help if
>>>> the system is swapping or otherwise at its limits though.
>>>> 
>>>> -- Mike
>>>> 
>>>> On Oct 23, 2012, at 7:55, "Steiner, David J. (LNG-DAY)"
>>>> <[email protected]> wrote:
>>>> 
>>>>> Using ML 6.0-1.1.
>>>>> 
>>>>> In Information Studio, I'm using a CSV collector, to process hundreds
>>>>> of CSV files.  I'm also doing a transform to pull each row out of the
>>>>> CSV and write it as an individual document into another DB (actually, a
>>>>> naked property, but I don't think that matters).
>>>>> 
>>>>> The files are all under 50MB (wasn't sure if that 64MB limit still
>>>>> existed).
>>>>> 
>>>>> It seems like only one CPU is being used and we have 8 available. RAM
>>>>> (24GB) is maxed out.  It took 72 minutes to process 20 files.
>>>>> 
>>>>> Is Info Studio specifically not utilizing more CPU because all of the
>>>>> RAM is already being used?
>>>>> 
>>>>> Ideally, I guess, I'd like for Info Studio to be able to take advantage
>>>>> of all CPUs while ingesting.  I'm thinking the ingestion where CSV is
>>>>> being translated to XML is the intense part.  The "splitting" out and
>>>>> "document" (property) insert shouldn't be as intense?
>>>>> 
>>>>> Thanks,
>>>>> David
>>>>> _______________________________________________
>>>>> General mailing list
>>>>> [email protected]
>>>>> http://developer.marklogic.com/mailman/listinfo/general
>>>>> 