Hi Geert, I think I got it. Thanks a lot for your information.
Helen

On Nov 9, 2010, at 4:47 AM, Geert Josten wrote:

> Hi Helen,
>
> There will probably be a slight overhead for each fragment in terms of
> database size, but I wouldn't worry about that too much.
>
> In a nutshell: the idea behind fragmentation is that you use it to optimize
> your content for searching and processing purposes. The search indexes are
> optimized for fragments, while documents are loaded into memory fragment by
> fragment for processing. If you have small fragments, your tree cache can be
> small, resulting in a smaller memory footprint for MarkLogic Server.
>
> Kind regards,
> Geert
>
> drs. G.P.H. (Geert) Josten
> Consultant
>
> Daidalos BV
> Hoekeindsehof 1-4
> 2665 JZ Bleiswijk
>
> T +31 (0)10 850 1200
> F +31 (0)10 850 1199
>
> mailto:[email protected]
> http://www.daidalos.nl/
>
> KvK 27164984
>
> The information sent in or with this e-mail message originates from
> Daidalos BV and is intended solely for the addressee. If you have received
> this message unintentionally, we ask you to delete it. No rights can be
> derived from this message.
>
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of
>> helen chen
>> Sent: Monday, November 8, 2010 23:03
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] question about logging
>>
>> Hi David,
>>
>> So I'll try to say my understanding is: fragmentation is
>> internal to MarkLogic, used to build indexes for search
>> calculation and performance, like a logical definition of
>> what is a row in a table. As far as data storage inside
>> MarkLogic, that's separate from fragments. I hope I got some of it.
>>
>> At least I don't have to be concerned about it.
>>
>> Thanks, Helen
>>
>> On Nov 8, 2010, at 3:41 PM, Lee, David wrote:
>>
>>> Fragmentation is not a per-database constant so you do not waste any
>>> more or less space by putting a small document in one database with
>>> big documents or another with only small documents.
>>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of helen chen
>>> Sent: Monday, November 08, 2010 3:14 PM
>>> To: Jason Booth
>>> Cc: General Mark Logic Developer Discussion; helen chen
>>> Subject: Re: [MarkLogic Dev General] question about logging
>>>
>>> Hi Jason,
>>>
>>> I tested it, it seems very fast for the logging part, and very easy to
>>> search. I think I can use it.
>>>
>>> I don't have much understanding for fragment, maybe I'm asking some
>>> silly questions here, but I want to ask before I use it :
>>>
>>> My impression for marklogic is: I put articles' xml into database,
>>> this article's xml contains almost everything for this article.
>>> Depending on each article, the xml file can be very big, some are
>>> relatively small, but much bigger than this tiny log document. And
>>> because it is hard to say that these elements will fit in one fragment
>>> or not, we didn't do fragmentation for article.
>>>
>>> Now I have these tiny documents in the same database, that is pretty
>>> much to say: my xml can be very big and also can be tiny, each of
>>> these tiny document will use one fragment (I think this way, let me
>>> know if I'm wrong).
>>>
>>> Then do these tiny document waste the fragment? Should I put these log
>>> document in a separate database ? Or maybe I totally mixed the concept
>>> for fragmentation?
>>>
>>> Thanks, Helen
>>>
>>> On Nov 5, 2010, at 9:50 PM, Jason Booth wrote:
>>>
>>>> Hi Helen,
>>>>
>>>> As far as an xdmp:invoke I would highly recommend it (instead of
>>>> xdmp:eval) as it will prove beneficial for the performance gains with
>>>> module caching.
>>>>
>>>> My suggestion for creating a document for each step will benefit you
>>>> in a few ways. First, you won't need to keep updating the same
>>>> document with more steps - unnecessary overhead. Second, if you kept
>>>> updating a single document with many steps you could possibly get into
>>>> a situation where you have an extremely large document (say, 1GB or
>>>> more) and you certainly don't want that. Lastly, when you have many
>>>> small documents you will have a more efficient search - many step
>>>> documents vs a single document with many steps works best for a search
>>>> on your data. Placing your smaller step documents into a collection
>>>> allows you to organize them in a way that you can search against them
>>>> (e.g. perform a search on the "step-123" collection giving me back all
>>>> steps, in date-time order, descending). Think of it like this: if you
>>>> were using an RDBMS, your collection would be a table and the
>>>> documents would be rows in that table.
>>>>
>>>> I recommend you experiment with the above approach in a sandbox
>>>> database to see what I mean. Then, once you feel comfortable with it,
>>>> move your code into your application.
>>>>
>>>> Best Regards,
>>>>
>>>> Jason
>>>>
>>>> ________________________________________
>>>> From: [email protected]
>>>> [[email protected]] On Behalf Of Helen Chen
>>>> [[email protected]]
>>>> Sent: Friday, November 05, 2010 8:34 PM
>>>> To: [email protected]
>>>> Cc: Helen Chen
>>>> Subject: Re: [MarkLogic Dev General] question about logging
>>>>
>>>> Hi Jason,
>>>>
>>>> The invoke() needs to use main module, I prefer to not use main
>>>> module so I can call functions.
>>>>
>>>> And if each step is a document, I feel that will create too many
>>>> documents, and these documents will be tiny. I prefer one batch as a
>>>> document, that's kind of say I need to do a lot of update to the
>>>> document. Not sure if it sounds good using eval.
>>>>
>>>> Thanks, Helen
>>>>
>>>> >>> Jason Booth <[email protected]> 11/05/10 4:39 PM >>>
>>>>
>>>> I would use xdmp:invoke() vs xdmp:eval() for better performance, esp.
>>>> w/heavy use - you take advantage of caching.
>>>>
>>>> To follow up on what Walter said, you could add each document to a
>>>> collection when you create them. Each "step" could be a document, and
>>>> a series of steps could belong to the same collection, say based off
>>>> an article id. And, on top of that, they could belong to a general
>>>> collection as a bigger umbrella to search on all "steps". Below is
>>>> some pseudo-code:
>>>>
>>>> let $article-id := "123"
>>>> (: one small document per logged step :)
>>>> let $step :=
>>>>   <step>
>>>>     <step-datetime>{fn:current-dateTime()}</step-datetime>
>>>>     <article-id>{$article-id}</article-id>
>>>>     <action>Published article</action>
>>>>   </step>
>>>> return
>>>>   xdmp:document-insert(
>>>>     (: URI derived from a hash of the step's serialized content :)
>>>>     fn:concat("/steps/", xdmp:hash64(xdmp:quote($step))),
>>>>     $step,
>>>>     (),
>>>>     (: per-article collection plus the general "steps" collection :)
>>>>     (fn:concat("step-", $article-id), "steps"))
>>>>
>>>> ________________________________________
>>>> From: [email protected]
>>>> [[email protected]] On Behalf Of helen chen
>>>> [[email protected]]
>>>> Sent: Friday, November 05, 2010 4:36 PM
>>>> To: General Mark Logic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] question about logging
>>>>
>>>> Hi Wunder,
>>>>
>>>> This sounds reasonable to me and might work for me.
>>>>
>>>> We have two databases, one for original format, and one for working
>>>> format.
>>>> During publishing process, I need to move data around, and I might
>>>> also need to insert some data back to both working version and
>>>> original version, then I want to verify the data after modification,
>>>> that means I need to use xdmp:eval() for transaction. And depending on
>>>> which database I'm in or to avoid lock, if I want to write the log
>>>> message to a document, I think I also need to use xdmp:eval().
>>>>
>>>> I tested xdmp:eval() with some document change, it seems fast. I want
>>>> to ask: should xdmp:eval() be used whenever I need to guarantee the
>>>> data change and whenever I want get the latest data information? If I
>>>> use a lot is it going to have impact?
>>>>
>>>> Thanks, Helen
>>>>
>>>> On Nov 5, 2010, at 4:12 PM, Walter Underwood wrote:
>>>>
>>>>> You could log that information to a document in MarkLogic. One
>>>>> history document for each document that is processed. --wunder
>>>>>
>>>>> On Nov 5, 2010, at 12:46 PM, helen chen wrote:
>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> What happened here is: when the article get published, I have to do
>>>>>> something and move data around. I want to log the important steps
>>>>>> and information to a separate file, and this file will be kept as a
>>>>>> reference in case something wrong. This publishing can involve a
>>>>>> few hundred articles at one time, that means the message I'm going
>>>>>> to create can be a lot at some point. If the message is not much, I
>>>>>> think web service can do it, but when I have a lot, and those
>>>>>> messages are from each steps, not at one time, it means I have to
>>>>>> call web service many many times, I'm not sure the impact of this
>>>>>> to the program. It sounds to me that it is going to slow down a
>>>>>> lot. But I never tried this way.
>>>>>>
>>>>>> In what situation did you use this way?
>>>>>>
>>>>>> Thanks,
>>>>>> Helen
>>>>>>
>>>>>> On Nov 5, 2010, at 3:23 PM, Lee, David wrote:
>>>>>>
>>>>>>> Another alternative is you could log to a web service.
>>>>>>> That requires some substantial infrastructure but it may be worth
>>>>>>> it depending on what you're doing.
>>>>>>> Technically it's quite easy to do, but it adds one more big piece
>>>>>>> to the puzzle, and depending on how frequent your logs are, it may
>>>>>>> slow things down a lot.
>>>>>>>
>>>>>>> ----------------------------------------
>>>>>>> David A. Lee
>>>>>>> Senior Principal Software Engineer Epocrates, Inc.
>>>>>>> [email protected]
>>>>>>> 812-482-5224
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: [email protected]
>>>>>>> [mailto:[email protected]] On Behalf Of
>>>>>>> helen chen
>>>>>>> Sent: Friday, November 05, 2010 3:18 PM
>>>>>>> To: General Mark Logic Developer Discussion
>>>>>>> Subject: Re: [MarkLogic Dev General] question about logging
>>>>>>>
>>>>>>> Hi Wunder,
>>>>>>>
>>>>>>> Thanks for this information. I have to study it and see if I can
>>>>>>> make it work.
>>>>>>>
>>>>>>> Helen
>>>>>>>
>>>>>>> On Nov 5, 2010, at 2:58 PM, Walter Underwood wrote:
>>>>>>>
>>>>>>>> Using the admin UI, navigate to the group your hosts are in. On
>>>>>>>> that page, near the bottom, there is a choice for "system log
>>>>>>>> level". That chooses what level of log events will be sent to the
>>>>>>>> system log. On Unix, that is syslog. There is a separate choice
>>>>>>>> for "file log level", which controls what is logged in
>>>>>>>> ErrorLog.txt.
>>>>>>>>
>>>>>>>> syslog_ng can use patterns to match log messages and route them
>>>>>>>> to a particular log.
>>>>>>>>
>>>>>>>> wunder
>>>>>>>>
>>>>>>>> On Nov 5, 2010, at 11:44 AM, helen chen wrote:
>>>>>>>>
>>>>>>>>> Hi Walter,
>>>>>>>>>
>>>>>>>>> When you say "configure MarkLogic so the system log level
>>>>>>>>> includes your extra log messages", where can I do the
>>>>>>>>> configuration? Does that mean the log that write to ErrorLog.txt
>>>>>>>>> will also be written to syslog? is there any special format that
>>>>>>>>> I need to do for the message that I want to log? If it is in
>>>>>>>>> some document, can you point me which document I can find it?
>>>>>>>>>
>>>>>>>>> Thanks, Helen
>>>>>>>>>
>>>>>>>>> On Nov 5, 2010, at 1:37 PM, Walter Underwood wrote:
>>>>>>>>>
>>>>>>>>>> If you are on a system that uses syslog_ng, you can do this
>>>>>>>>>> with that tool.
>>>>>>>>>>
>>>>>>>>>> Log messages normally, but configure MarkLogic so the system
>>>>>>>>>> log level includes your extra log messages. Configure syslog_ng
>>>>>>>>>> to route those log messages to the file you want.
>>>>>>>>>>
>>>>>>>>>> wunder
>>>>>>>>>> ==
>>>>>>>>>> Walter Underwood
>>>>>>>>>> [email protected]
>>>>>>>>>>
>>>>>>>>>> On Nov 5, 2010, at 8:33 AM, helen chen wrote:
>>>>>>>>>>
>>>>>>>>>>> Maybe I didn't say it clearly.
>>>>>>>>>>>
>>>>>>>>>>> fn:concat() is for the message part. I also want to write this
>>>>>>>>>>> message to a separate file on the file system, the file name
>>>>>>>>>>> is specified dynamically. And if this file already exists on
>>>>>>>>>>> file system, it should be the append, not overwrite. It is
>>>>>>>>>>> similar to the unix script that I write my log to some file I
>>>>>>>>>>> want.
>>>>>>>>>>>
>>>>>>>>>>> In the meantime I don't want to stop the xdmp:log(), if I use
>>>>>>>>>>> xdmp:log, it should still write to ErrorLog.txt file.
>>>>>>>>>>>
>>>>>>>>>>> Thanks, Helen
>>>>>>>>>>>
>>>>>>>>>>> On Nov 5, 2010, at 11:19 AM, Tim Meagher wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I just embed fn:concat() within the call to xdmp:log() and
>>>>>>>>>>>> concatenate the various message parts, e.g.
>>>>>>>>>>>>
>>>>>>>>>>>> xdmp:log(fn:concat("Path: ", $path))
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: [email protected]
>>>>>>>>>>>> [mailto:[email protected]] On Behalf Of
>>>>>>>>>>>> helen chen
>>>>>>>>>>>> Sent: Friday, November 05, 2010 11:16 AM
>>>>>>>>>>>> To: General Mark Logic Developer Discussion
>>>>>>>>>>>> Subject: [MarkLogic Dev General] question about logging
>>>>>>>>>>>>
>>>>>>>>>>>> Hello there,
>>>>>>>>>>>>
>>>>>>>>>>>> In MarkLogic, I use xdmp:log() to log message to ErrorLog.txt
>>>>>>>>>>>> file. I want to do some logging similar to script, like I
>>>>>>>>>>>> specify the path and file name, then I write just the message
>>>>>>>>>>>> I want to this file and then keep appending message to this
>>>>>>>>>>>> file. I expect that this should not stop the normal logging
>>>>>>>>>>>> of xdmp:log().
>>>>>>>>>>>>
>>>>>>>>>>>> Does anyone have suggestion on how to do it?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, Helen
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> General mailing list
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>> http://developer.marklogic.com/mailman/listinfo/general
