Hi Helen,

There will probably be a slight overhead for each fragment in terms of database 
size, but I wouldn't worry about that too much.

In a nutshell: fragmentation is a way to optimize your content for searching 
and processing. The search indexes are optimized for fragments, and documents 
are loaded into memory fragment by fragment for processing. If your fragments 
are small, your tree cache can be small too, which gives MarkLogic Server a 
smaller memory footprint.
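
As a minimal sketch of what small fragments look like in practice: assuming each log step is stored as its own small document (and therefore its own fragment), shaped like the <step> example quoted further down in this thread and placed in the "step-123" collection used there, a query along these lines only has to load those small fragments:

```xquery
xquery version "1.0-ml";

(: Assumption: step documents look like the <step> example quoted later in
   this thread and were inserted into the "step-123" collection. :)
for $step in fn:collection("step-123")/step
(: Cast to xs:dateTime so the ordering is chronological, not textual. :)
order by xs:dateTime($step/step-datetime) descending
return $step
```

Treat this as a starting point rather than a tuned query; how well the ordering performs on a large collection depends on your own data and index configuration.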

Kind regards,
Geert


drs. G.P.H. (Geert) Josten
Consultant

Daidalos BV
Hoekeindsehof 1-4
2665 JZ Bleiswijk

T +31 (0)10 850 1200
F +31 (0)10 850 1199

mailto:[email protected]
http://www.daidalos.nl/

KvK 27164984


The information sent in or with this e-mail message originates from 
Daidalos BV and is intended solely for the addressee. If you have received 
this message unintentionally, we ask you to delete it. No rights can be 
derived from this message.

> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> helen chen
> Sent: Monday, 8 November 2010 23:03
> To: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] question about logging
>
> Hi David,
>
> So, to state my understanding: fragmentation is internal to
> MarkLogic and is used to build indexes for search and
> performance, like the logical definition of what a row is in a
> table. Data storage inside MarkLogic is separate from
> fragmentation. I hope I got at least some of it right.
>
> At least I don't have to worry about it.
>
> Thanks, Helen
>
>
>
> On Nov 8, 2010, at 3:41 PM, Lee, David wrote:
>
> > Fragmentation is not a per-database constant, so you do not waste any
> > more or less space by putting a small document in one database with
> > big documents or another with only small documents.
> >
> >
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of helen
> > chen
> > Sent: Monday, November 08, 2010 3:14 PM
> > To: Jason Booth
> > Cc: General Mark Logic Developer Discussion; helen chen
> > Subject: Re: [MarkLogic Dev General] question about logging
> >
> > Hi Jason,
> >
> > I tested it. The logging part seems very fast and very easy to
> > search, so I think I can use it.
> >
> > I don't understand fragments well, so these may be silly
> > questions, but I want to ask before I use it:
> >
> > My impression of MarkLogic is: I put each article's XML into the
> > database, and that XML contains almost everything for the article.
> > Depending on the article, the XML file can be very big; some are
> > relatively small, but still much bigger than this tiny log document.
> > Because it is hard to say whether these elements will fit in one
> > fragment or not, we didn't fragment the articles.
> >
> > Now I have these tiny documents in the same database, which means
> > my XML can be very big and can also be tiny. Each of these tiny
> > documents will use one fragment (that is how I think of it; let me
> > know if I'm wrong).
> >
> > Then do these tiny documents waste the fragment? Should I put these
> > log documents in a separate database? Or have I totally mixed up
> > the concept of fragmentation?
> >
> >
> > Thanks, Helen
> >
> >
> >
> >
> > On Nov 5, 2010, at 9:50 PM, Jason Booth wrote:
> >
> >>
> >> Hi Helen,
> >>
> >> As for xdmp:invoke, I would highly recommend it (instead of
> >> xdmp:eval), as it will prove beneficial for the performance gains
> >> from module caching.
> >>
> >> My suggestion of creating a document for each step will benefit you
> >> in a few ways. First, you won't need to keep updating the same
> >> document with more steps, which is unnecessary overhead. Second, if
> >> you kept updating a single document with many steps, you could end up
> >> with an extremely large document (say, 1GB or more), and you
> >> certainly don't want that. Lastly, many small documents give you a
> >> more efficient search: many step documents work better for searching
> >> your data than a single document with many steps. Placing your
> >> smaller step documents into a collection lets you organize them so
> >> that you can search against them (e.g. search the "step-123"
> >> collection to get back all steps, in date-time order, descending).
> >> Think of it like this: if you were using an RDBMS, your collection
> >> would be a table and the documents would be rows in that table.
> >>
> >> I recommend you experiment with the above approach in a sandbox
> >> database to see what I mean. Then, once you feel comfortable with it,
> >> move your code into your application.
> >>
> >> Best Regards,
> >>
> >> Jason
> >>
> >>
> >> ________________________________________
> >> From: [email protected]
> > [[email protected]] On Behalf Of Helen Chen
> > [[email protected]]
> >> Sent: Friday, November 05, 2010 8:34 PM
> >> To: [email protected]
> >> Cc: Helen Chen
> >> Subject: Re: [MarkLogic Dev General] question about logging
> >>
> >> Hi Jason,
> >>
> >> xdmp:invoke() needs a main module; I prefer not to use a main
> >> module so that I can call functions.
> >>
> >> And if each step is a document, I feel that will create too many
> >> documents, and these documents will be tiny. I prefer one batch per
> >> document, which means I need to do a lot of updates to the
> >> document. I'm not sure whether that sounds good with eval.
> >>
> >> Thanks, Helen
> >>
> >>
> >>>>> Jason Booth <[email protected]> 11/05/10 4:39 PM >>>
> >>
> >> I would use xdmp:invoke() rather than xdmp:eval() for better
> >> performance, especially with heavy use, since you take advantage of
> >> caching.
> >>
> >> To follow up on what Walter said, you could add each document to a
> >> collection when you create it. Each "step" could be a document, and a
> >> series of steps could belong to the same collection, say based on an
> >> article id. On top of that, they could belong to a general collection
> >> as a bigger umbrella to search on all "steps". Below is some
> >> pseudo-code:
> >>
> >> (: Build one small log document per step. :)
> >> let $article-id := "123"
> >> let $step :=
> >> <step>
> >>   <step-datetime>{fn:current-dateTime()}</step-datetime>
> >>   <article-id>{$article-id}</article-id>
> >>   <action>Published article</action>
> >> </step>
> >>
> >> (: Arguments: URI, document, permissions (empty here), collections. :)
> >> return xdmp:document-insert(
> >>        fn:concat("/steps/", xdmp:hash64(xdmp:quote($step))), $step, (),
> >>        (fn:concat("step-", $article-id), "steps"))
> >>
> >>
> >> ________________________________________
> >> From: [email protected]
> >> [[email protected]] On Behalf Of helen chen
> >> [[email protected]]
> >> Sent: Friday, November 05, 2010 4:36 PM
> >> To: General Mark Logic Developer Discussion
> >> Subject: Re: [MarkLogic Dev General] question about logging
> >>
> >> Hi Wunder,
> >>
> >> This sounds reasonable to me and might work for me.
> >>
> >> We have two databases, one for the original format and one for the
> >> working format. During the publishing process I need to move data
> >> around, and I might also need to insert some data back into both the
> >> working version and the original version. Then I want to verify the
> >> data after modification, which means I need to use xdmp:eval() for
> >> the transaction. And depending on which database I'm in, or to avoid
> >> locks, if I want to write the log message to a document, I think I
> >> also need to use xdmp:eval().
> >>
> >> I tested xdmp:eval() with some document changes, and it seems fast. I
> >> want to ask: should xdmp:eval() be used whenever I need to guarantee
> >> the data change and whenever I want to get the latest data? If I use
> >> it a lot, is it going to have an impact?
> >>
> >> Thanks, Helen
> >>
> >>
> >>
> >> On Nov 5, 2010, at 4:12 PM, Walter Underwood wrote:
> >>
> >>> You could log that information to a document in MarkLogic: one
> >>> history document for each document that is processed.  --wunder
> >>>
> >>> On Nov 5, 2010, at 12:46 PM, helen chen wrote:
> >>>
> >>>> Hi David,
> >>>>
> >>>> What happens here is: when an article gets published, I have to
> >>>> do some work and move data around. I want to log the important steps
> >>>> and information to a separate file, and this file will be kept as a
> >>>> reference in case something goes wrong. Publishing can involve a few
> >>>> hundred articles at one time, which means I may create a lot of
> >>>> messages at some point. If there are not many messages, I think a
> >>>> web service can do it. But when I have a lot, and the messages come
> >>>> from each step rather than all at once, I have to call the web
> >>>> service many, many times, and I'm not sure of the impact on the
> >>>> program. It sounds to me like it is going to slow things down a lot,
> >>>> but I have never tried it this way.
> >>>>
> >>>> In what situation did you use this approach?
> >>>>
> >>>> Thanks,
> >>>> Helen
> >>>>
> >>>> On Nov 5, 2010, at 3:23 PM, Lee, David wrote:
> >>>>
> >>>>> Another alternative is to log to a web service.
> >>>>> That requires some substantial infrastructure, but it may be worth
> >>>>> it depending on what you're doing.
> >>>>> Technically it's quite easy to do, but it adds one more big piece to
> >>>>> the puzzle, and depending on how frequent your logs are, it may slow
> >>>>> things down a lot.
> >>>>>
> >>>>>
> >>>>> ----------------------------------------
> >>>>> David A. Lee
> >>>>> Senior Principal Software Engineer Epocrates, Inc.
> >>>>> [email protected]
> >>>>> 812-482-5224
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: [email protected]
> >>>>> [mailto:[email protected]] On Behalf Of
> >>>>> helen
> > chen
> >>>>> Sent: Friday, November 05, 2010 3:18 PM
> >>>>> To: General Mark Logic Developer Discussion
> >>>>> Subject: Re: [MarkLogic Dev General] question about logging
> >>>>>
> >>>>> Hi Wunder,
> >>>>>
> >>>>> Thanks for this information. I have to study it and see if I can
> >>>>> make it work.
> >>>>>
> >>>>> Helen
> >>>>>
> >>>>>
> >>>>> On Nov 5, 2010, at 2:58 PM, Walter Underwood wrote:
> >>>>>
> >>>>>> Using the admin UI, navigate to the group your hosts are in. On that
> >>>>>> page, near the bottom, there is a choice for "system log level".
> >>>>>> That chooses what level of log events will be sent to the system
> >>>>>> log. On Unix, that is syslog. There is a separate choice for "file
> >>>>>> log level", which controls what is logged in ErrorLog.txt.
> >>>>>>
> >>>>>> syslog_ng can use patterns to match log messages and route them to a
> >>>>>> particular log.
> >>>>>>
> >>>>>> wunder
> >>>>>>
> >>>>>> On Nov 5, 2010, at 11:44 AM, helen chen wrote:
> >>>>>>
> >>>>>>> Hi Walter,
> >>>>>>>
> >>>>>>> When you say "configure MarkLogic so the system log level includes
> >>>>>>> your extra log messages", where can I do the configuration? Does
> >>>>>>> that mean the log messages written to ErrorLog.txt will also be
> >>>>>>> written to syslog? Is there any special format I need to use for
> >>>>>>> the message that I want to log? If this is described in some
> >>>>>>> document, can you point me to it?
> >>>>>>>
> >>>>>>> Thanks, Helen
> >>>>>>>
> >>>>>>>
> >>>>>>> On Nov 5, 2010, at 1:37 PM, Walter Underwood wrote:
> >>>>>>>
> >>>>>>>> If you are on a system that uses syslog_ng, you can do this with
> >>>>>>>> that tool.
> >>>>>>>>
> >>>>>>>> Log messages normally, but configure MarkLogic so the system log
> >>>>>>>> level includes your extra log messages. Configure syslog_ng to
> >>>>>>>> route those log messages to the file you want.
> >>>>>>>>
> >>>>>>>> wunder
> >>>>>>>> ==
> >>>>>>>> Walter Underwood
> >>>>>>>> [email protected]
> >>>>>>>>
> >>>>>>>> On Nov 5, 2010, at 8:33 AM, helen chen wrote:
> >>>>>>>>
> >>>>>>>>> Maybe I didn't say it clearly.
> >>>>>>>>>
> >>>>>>>>> fn:concat() is for the message part. I also want to write this
> >>>>>>>>> message to a separate file on the file system, with the file name
> >>>>>>>>> specified dynamically. If this file already exists on the file
> >>>>>>>>> system, the message should be appended, not overwrite the file.
> >>>>>>>>> It is similar to a Unix script where I write my log to whatever
> >>>>>>>>> file I want.
> >>>>>>>>>
> >>>>>>>>> In the meantime I don't want to stop xdmp:log(); if I use
> >>>>>>>>> xdmp:log(), it should still write to the ErrorLog.txt file.
> >>>>>>>>>
> >>>>>>>>> Thanks, Helen
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Nov 5, 2010, at 11:19 AM, Tim Meagher wrote:
> >>>>>>>>>
> >>>>>>>>>> I just embed fn:concat() within the call to xdmp:log() and
> >>>>>>>>>> concatenate the various message parts, e.g.
> >>>>>>>>>>
> >>>>>>>>>> xdmp:log(fn:concat("Path: ", $path))
> >>>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: [email protected]
> >>>>>>>>>> [mailto:[email protected]]
> On Behalf Of
> >>>>> helen chen
> >>>>>>>>>> Sent: Friday, November 05, 2010 11:16 AM
> >>>>>>>>>> To: General Mark Logic Developer Discussion
> >>>>>>>>>> Subject: [MarkLogic Dev General] question about logging
> >>>>>>>>>>
> >>>>>>>>>> Hello there,
> >>>>>>>>>>
> >>>>>>>>>> In MarkLogic, I use xdmp:log() to log messages to the ErrorLog.txt
> >>>>>>>>>> file. I want to do some logging similar to a script: I specify the
> >>>>>>>>>> path and file name, write just the message I want to this file,
> >>>>>>>>>> and then keep appending messages to it. I expect that this should
> >>>>>>>>>> not stop the normal logging of xdmp:log().
> >>>>>>>>>>
> >>>>>>>>>> Does anyone have a suggestion on how to do it?
> >>>>>>>>>>
> >>>>>>>>>> Thanks, Helen
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> General mailing list
> >>>>>>>>>> [email protected]
> >>>>>>>>>> http://developer.marklogic.com/mailman/listinfo/general
> >>>>>>>>>>
> >>>>>> --
> >>>>>> Walter Underwood
> >>>>>> [email protected]
> >>> --
> >>> Walter Underwood
> >>> [email protected]

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
