Re: [MarkLogic Dev General] question about logging

helen chen Tue, 09 Nov 2010 07:20:19 -0800

Hi Geert,

I think I got it.  Thanks a lot for your information.


Helen


On Nov 9, 2010, at 4:47 AM, Geert Josten wrote:

> Hi Helen,
> 
> There will probably be a slight overhead for each fragment in terms of 
> database size, but I wouldn't worry about that too much.
> 
> In a nutshell: the idea behind fragmentation is that you use it to optimize 
> your content for searching and processing purposes. The search indexes are 
> optimized for fragments, while documents are loaded into memory fragment by 
> fragment for processing. If you have small fragments, your tree cache can be 
> small, resulting in a smaller memory footprint for MarkLogic Server.
> 
> Kind regards,
> Geert
> 
>> 
> 
> 
> drs. G.P.H. (Geert) Josten
> Consultant
> 
> Daidalos BV
> Hoekeindsehof 1-4
> 2665 JZ Bleiswijk
> 
> T +31 (0)10 850 1200
> F +31 (0)10 850 1199
> 
> mailto:[email protected]
> http://www.daidalos.nl/
> 
> KvK 27164984
> 
> 
> The information sent in or with this e-mail message originates from
> Daidalos BV and is intended solely for the addressee. If you have received
> this message unintentionally, we ask that you delete it. No rights can be
> derived from this message.
> 
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of
>> helen chen
>> Sent: Monday, 8 November 2010 23:03
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] question about logging
>> 
>> Hi David,
>> 
>> So my understanding is: fragmentation is internal to MarkLogic and is
>> used to build the indexes for search and performance, like the logical
>> definition of what a row is in a table. Data storage inside MarkLogic
>> is separate from fragmentation. I hope I got some of it right.
>> 
>> At least I don't have to worry about it.
>> 
>> Thanks, Helen
>> 
>> 
>> 
>> On Nov 8, 2010, at 3:41 PM, Lee, David wrote:
>> 
>>> Fragmentation is not a per-database constant, so you do not waste any
>>> more or less space by putting a small document in one database with
>>> big documents or another with only small documents.
>>> 
>>> 
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of helen
>>> chen
>>> Sent: Monday, November 08, 2010 3:14 PM
>>> To: Jason Booth
>>> Cc: General Mark Logic Developer Discussion; helen chen
>>> Subject: Re: [MarkLogic Dev General] question about logging
>>> 
>>> Hi Jason,
>>> 
>>> I tested it. The logging part seems very fast and the results are very
>>> easy to search. I think I can use it.
>>> 
>>> I don't understand fragments well, so these may be silly questions, but
>>> I want to ask before I use this approach:
>>> 
>>> My picture of MarkLogic is this: I put each article's XML into the
>>> database, and that XML contains almost everything about the article.
>>> Depending on the article, the XML file can be very big; some are
>>> relatively small, but still much bigger than a tiny log document.
>>> Because it is hard to say whether these elements will fit in one
>>> fragment, we didn't configure fragmentation for articles.
>>> 
>>> Now I have these tiny documents in the same database, so my XML can be
>>> very big or tiny, and each tiny document will use one fragment (that is
>>> how I understand it; let me know if I'm wrong).
>>> 
>>> Do these tiny documents waste the fragment? Should I put the log
>>> documents in a separate database? Or have I totally mixed up the
>>> concept of fragmentation?
>>> 
>>> 
>>> Thanks, Helen
>>> 
>>> 
>>> 
>>> 
>>> On Nov 5, 2010, at 9:50 PM, Jason Booth wrote:
>>> 
>>>> 
>>>> Hi Helen,
>>>> 
>>>> As for xdmp:invoke, I would highly recommend it (instead of
>>>> xdmp:eval), as module caching gives it a performance advantage.
>>>> 
>>>> My suggestion of creating a document for each step will benefit you
>>>> in a few ways. First, you won't need to keep updating the same
>>>> document with more steps, which is unnecessary overhead. Second, if you
>>>> kept updating a single document with many steps, you could end up with
>>>> an extremely large document (say, 1GB or more), and you certainly don't
>>>> want that. Lastly, many small documents search more efficiently: many
>>>> step documents work better for searching your data than a single
>>>> document with many steps. Placing your smaller step documents into a
>>>> collection lets you organize them so that you can search against them
>>>> (e.g. perform a search on the "step-123" collection giving me back all
>>>> steps, in date-time order, descending). Think of it like this: if you
>>>> were using an RDBMS, your collection would be a table and the documents
>>>> would be rows in that table.
>>>> 
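A minimal sketch of the collection search described above, assuming step documents with a <step> root and a <step-datetime> child, stored in a "step-123" collection (the article id is a placeholder). Sorting in the FLWOR expression is fine for a sandbox; for large collections a range index on step-datetime would probably be worth considering.

```xquery
xquery version "1.0-ml";

(: Placeholder article id; the collection name follows the "step-<id>" idea. :)
let $article-id := "123"
for $step in fn:collection(fn:concat("step-", $article-id))/step
(: newest step first :)
order by xs:dateTime($step/step-datetime) descending
return $step
```
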
>>>> I recommend you experiment with the above approach in a sandbox
>>>> database to see what I mean. Then, once you feel comfortable with it,
>>>> move your code into your application.
>>>> 
>>>> Best Regards,
>>>> 
>>>> Jason
>>>> 
>>>> 
>>>> ________________________________________
>>>> From: [email protected]
>>> [[email protected]] On Behalf Of Helen Chen
>>> [[email protected]]
>>>> Sent: Friday, November 05, 2010 8:34 PM
>>>> To: [email protected]
>>>> Cc: Helen Chen
>>>> Subject: Re: [MarkLogic Dev General] question about logging
>>>> 
>>>> Hi Jason,
>>>> 
>>>> xdmp:invoke() needs a main module. I prefer not to use a main module,
>>>> so that I can call functions.
>>>> 
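One way to keep working with functions while still using xdmp:invoke() is a thin main module that only declares external variables and calls into a library. This is a sketch, and the module path, namespace, and logging:log-step() function are all hypothetical names:

```xquery
xquery version "1.0-ml";

(: /log-step.xqy -- hypothetical thin main module.
   The library module and its log-step() function are assumed to exist. :)
import module namespace logging = "http://example.com/logging"
  at "/lib/logging.xqy";

(: values supplied by the caller of xdmp:invoke() :)
declare variable $article-id as xs:string external;
declare variable $action as xs:string external;

logging:log-step($article-id, $action)
```
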
>>>> And if each step is a document, I feel that will create too many
>>>> documents, and these documents will be tiny. I prefer one document per
>>>> batch, which means I need to do a lot of updates to that document. I'm
>>>> not sure whether using eval for that sounds good.
>>>> 
>>>> Thanks, Helen
>>>> 
>>>> 
>>>>>>> Jason Booth <[email protected]> 11/05/10 4:39 PM >>>
>>>> 
>>>> I would use xdmp:invoke() rather than xdmp:eval() for better
>>>> performance, especially under heavy use, since you take advantage of
>>>> module caching.
>>>> 
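For comparison, a sketch of the two calls, assuming a hypothetical main module /log-step.xqy on the app server that declares the external variables $article-id and $action. The invoked module is the part that can benefit from module caching; with xdmp:eval() the query text is passed in as a string on every call.

```xquery
xquery version "1.0-ml";

(: External variables are passed as alternating QName/value pairs. :)
let $vars := (xs:QName("article-id"), "123",
              xs:QName("action"), "Published article")
return (
  (: invoke a stored main module (hypothetical path) :)
  xdmp:invoke("/log-step.xqy", $vars),

  (: the same kind of work through an ad hoc query string :)
  xdmp:eval('
    xquery version "1.0-ml";
    declare variable $article-id as xs:string external;
    declare variable $action as xs:string external;
    xdmp:log(fn:concat($article-id, ": ", $action))
  ', $vars)
)
```
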
>>>> To follow up on what Walter said, you could add each document to a
>>>> collection when you create it. Each "step" could be a document, and a
>>>> series of steps could belong to the same collection, say one based on
>>>> an article id. On top of that, they could belong to a general
>>>> collection as a bigger umbrella for searching across all "steps". Below
>>>> is some pseudo-code:
>>>> 
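>>>> (: example article id; in practice this comes from your process :)
>>>> let $article-id := "123"
>>>> let $step :=
>>>> <step>
>>>>  <step-datetime>{fn:current-dateTime()}</step-datetime>
>>>>  <article-id>{$article-id}</article-id>
>>>>  <action>Published article</action>
>>>> </step>
>>>> 
>>>> (: the URI is built from a hash of the serialized step; the third
>>>>    argument (permissions) is left empty; the collections are the
>>>>    per-article "step-123" one plus the general "steps" one :)
>>>> return xdmp:document-insert(
>>>>       fn:concat("/steps/", xdmp:hash64(xdmp:quote($step))), $step, (),
>>>>       (fn:concat("step-", $article-id), "steps"))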
>>>> 
>>>> 
>>>> ________________________________________
>>>> From: [email protected]
>>>> [[email protected]] On Behalf Of helen chen
>>>> [[email protected]]
>>>> Sent: Friday, November 05, 2010 4:36 PM
>>>> To: General Mark Logic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] question about logging
>>>> 
>>>> Hi Wunder,
>>>> 
>>>> This sounds reasonable to me and might work for me.
>>>> 
>>>> We have two databases, one for the original format and one for the
>>>> working format. During the publishing process I need to move data
>>>> around, and I might also need to insert some data back into both the
>>>> working version and the original version. Then I want to verify the
>>>> data after modification, which means I need to use xdmp:eval() for a
>>>> separate transaction. And depending on which database I'm in, or to
>>>> avoid locks, if I want to write the log message to a document, I think
>>>> I also need to use xdmp:eval().
>>>> 
>>>> I tested xdmp:eval() with some document changes, and it seems fast. I
>>>> want to ask: should xdmp:eval() be used whenever I need to guarantee
>>>> the data change and whenever I want to get the latest data? If I use
>>>> it a lot, is it going to have an impact?
>>>> 
>>>> Thanks, Helen
>>>> 
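A sketch of writing a log entry through xdmp:eval() into another database in its own transaction. The database name "working-db", the element names, and the URI scheme are assumptions for illustration; <database> and <isolation> are options in the xdmp:eval namespace.

```xquery
xquery version "1.0-ml";

(: "working-db" and the /logs/ URI prefix are hypothetical names. :)
let $entry :=
  <log-entry>
    <logged-at>{fn:current-dateTime()}</logged-at>
    <message>Moved article 123 to the working database</message>
  </log-entry>
return xdmp:eval('
    xquery version "1.0-ml";
    declare variable $uri as xs:string external;
    declare variable $entry as element() external;
    xdmp:document-insert($uri, $entry)
  ',
  (: external variables as QName/value pairs; random suffix keeps URIs distinct :)
  (xs:QName("uri"), fn:concat("/logs/", xdmp:random(), ".xml"),
   xs:QName("entry"), $entry),
  <options xmlns="xdmp:eval">
    <database>{xdmp:database("working-db")}</database>
    <isolation>different-transaction</isolation>
  </options>)
```

Because the evaluated code runs as a separate transaction, the entry should be committed independently of the calling request.
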
>>>> 
>>>> 
>>>> On Nov 5, 2010, at 4:12 PM, Walter Underwood wrote:
>>>> 
>>>>> You could log that information to a document in MarkLogic: one history
>>>>> document for each document that is processed.  --wunder
>>>>> 
>>>>> On Nov 5, 2010, at 12:46 PM, helen chen wrote:
>>>>> 
>>>>>> Hi David,
>>>>>> 
>>>>>> What happens here is: when an article gets published, I have to do
>>>>>> some processing and move data around. I want to log the important steps
>>>>>> and information to a separate file, and this file will be kept as a
>>>>>> reference in case something goes wrong. Publishing can involve a few
>>>>>> hundred articles at one time, which means I may create a lot of
>>>>>> messages at some point. If there are not many messages, I think a web
>>>>>> service can handle it. But when I have a lot, and those messages come
>>>>>> from each step rather than all at once, I have to call the web service
>>>>>> many, many times, and I'm not sure what impact that has on the
>>>>>> program. It sounds to me like it is going to slow things down a lot,
>>>>>> but I have never tried it this way.
>>>>>> 
>>>>>> In what situation did you use this approach?
>>>>>> 
>>>>>> Thanks,
>>>>>> Helen
>>>>>> 
>>>>>> On Nov 5, 2010, at 3:23 PM, Lee, David wrote:
>>>>>> 
>>>>>>> Another alternative is to log to a web service.
>>>>>>> That requires some substantial infrastructure, but it may be worth it
>>>>>>> depending on what you're doing.
>>>>>>> Technically it's quite easy to do, but it adds one more big piece to
>>>>>>> the puzzle, and depending on how frequent your logs are, it may slow
>>>>>>> things down a lot.
>>>>>>> 
>>>>>>> 
>>>>>>> ----------------------------------------
>>>>>>> David A. Lee
>>>>>>> Senior Principal Software Engineer Epocrates, Inc.
>>>>>>> [email protected]
>>>>>>> 812-482-5224
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: [email protected]
>>>>>>> [mailto:[email protected]] On Behalf Of
>>>>>>> helen
>>> chen
>>>>>>> Sent: Friday, November 05, 2010 3:18 PM
>>>>>>> To: General Mark Logic Developer Discussion
>>>>>>> Subject: Re: [MarkLogic Dev General] question about logging
>>>>>>> 
>>>>>>> Hi Wunder,
>>>>>>> 
>>>>>>> Thanks for this information. I have to study it and see if I can make
>>>>>>> it work.
>>>>>>> 
>>>>>>> Helen
>>>>>>> 
>>>>>>> 
>>>>>>> On Nov 5, 2010, at 2:58 PM, Walter Underwood wrote:
>>>>>>> 
>>>>>>>> Using the admin UI, navigate to the group your hosts are in. On that
>>>>>>>> page, near the bottom, there is a choice for "system log level". That
>>>>>>>> chooses what level of log events will be sent to the system log.
>>>>>>>> On Unix, that is syslog. There is a separate choice for "file log
>>>>>>>> level", which controls what is logged in ErrorLog.txt.
>>>>>>>> 
>>>>>>>> syslog-ng can use patterns to match log messages and route them to a
>>>>>>>> particular log.
>>>>>>>> 
>>>>>>>> wunder
>>>>>>>> 
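A small sketch of logging at an explicit level, so that the message clears whatever "system log level" is configured for the group. The "notice" level, the path value, and the message text are only examples; xdmp:log() takes the level as an optional second argument.

```xquery
xquery version "1.0-ml";

let $path := "/articles/123.xml"  (: placeholder value :)
return xdmp:log(fn:concat("publish-step path=", $path), "notice")
```

A distinctive prefix such as "publish-step" (an assumption, not a MarkLogic convention) would give a syslog-ng pattern something to match on.
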
>>>>>>>> On Nov 5, 2010, at 11:44 AM, helen chen wrote:
>>>>>>>> 
>>>>>>>>> Hi Walter,
>>>>>>>>> 
>>>>>>>>> When you say "configure MarkLogic so the system log level includes
>>>>>>>>> your extra log messages", where can I do that configuration? Does it
>>>>>>>>> mean that what is written to ErrorLog.txt will also be written to
>>>>>>>>> syslog? Is there any special format I need to use for the messages I
>>>>>>>>> want to log? If this is documented somewhere, can you point me to the
>>>>>>>>> document?
>>>>>>>>> 
>>>>>>>>> Thanks, Helen
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Nov 5, 2010, at 1:37 PM, Walter Underwood wrote:
>>>>>>>>> 
>>>>>>>>>> If you are on a system that uses syslog-ng, you can do this with
>>>>>>>>>> that tool.
>>>>>>>>>> 
>>>>>>>>>> Log messages normally, but configure MarkLogic so the system log
>>>>>>>>>> level includes your extra log messages. Configure syslog-ng to route
>>>>>>>>>> those log messages to the file you want.
>>>>>>>>>> 
>>>>>>>>>> wunder
>>>>>>>>>> ==
>>>>>>>>>> Walter Underwood
>>>>>>>>>> [email protected]
>>>>>>>>>> 
>>>>>>>>>> On Nov 5, 2010, at 8:33 AM, helen chen wrote:
>>>>>>>>>> 
>>>>>>>>>>> Maybe I didn't say it clearly.
>>>>>>>>>>> 
>>>>>>>>>>> fn:concat() is for the message part. I also want to write this
>>>>>>>>>>> message to a separate file on the file system, with the file name
>>>>>>>>>>> specified dynamically. If the file already exists on the file
>>>>>>>>>>> system, the message should be appended, not overwrite the file. It
>>>>>>>>>>> is similar to a Unix script where I write my log to whatever file I
>>>>>>>>>>> want.
>>>>>>>>>>> 
>>>>>>>>>>> In the meantime I don't want to stop xdmp:log(): if I use xdmp:log(),
>>>>>>>>>>> it should still write to the ErrorLog.txt file.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks, Helen
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Nov 5, 2010, at 11:19 AM, Tim Meagher wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> I just embed fn:concat() within the call to xdmp:log() and
>>>>>>>>>>>> concatenate the various message parts, e.g.
>>>>>>>>>>>> 
>>>>>>>>>>>> xdmp:log(fn:concat("Path: ", $path))
>>>>>>>>>>>> 
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: [email protected]
>>>>>>>>>>>> [mailto:[email protected]]
>> On Behalf Of
>>>>>>> helen chen
>>>>>>>>>>>> Sent: Friday, November 05, 2010 11:16 AM
>>>>>>>>>>>> To: General Mark Logic Developer Discussion
>>>>>>>>>>>> Subject: [MarkLogic Dev General] question about logging
>>>>>>>>>>>> 
>>>>>>>>>>>> Hello there,
>>>>>>>>>>>> 
>>>>>>>>>>>> In MarkLogic, I use xdmp:log() to log messages to the ErrorLog.txt
>>>>>>>>>>>> file. I want to do some logging the way a script would: I specify
>>>>>>>>>>>> the path and file name, write just the message I want to that file,
>>>>>>>>>>>> and then keep appending messages to it. This should not stop the
>>>>>>>>>>>> normal logging of xdmp:log().
>>>>>>>>>>>> 
>>>>>>>>>>>> Does anyone have a suggestion on how to do it?
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks, Helen
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> General mailing list
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>> http://developer.marklogic.com/mailman/listinfo/general
>>>>>>>>>>>> 
>>>>>>>> --
>>>>>>>> Walter Underwood
>>>>>>>> [email protected]
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>> --
>>>>> Walter Underwood
>>>>> [email protected]
>>>>> 
>>>>> 
>>>>> 

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
