Golden, thanks Mike. What about thousands of writes per second? Any differences?
Sent from my iPhone

On Feb 6, 2012, at 4:56 PM, "Michael Blakeley" <m...@blakeley.com> wrote:

> That doesn't sound too challenging. The points you've already raised are good, but you will need whatever indexing you need. You might try to avoid using property fragments, if possible (disable maintain-last-modified, for example). Depending on your queries, you may be able to disable some or all of the default full-text indexing, and rely on a combination of the built-in XPath indexes and application-specific range indexes.
>
> Think hard about your document URIs. You will want the URIs to be such that lock contention simply won't happen. For example you could use xdmp:random to generate URIs, or some combination of ids and timestamps that will guarantee uniqueness. Let's say you receive an update for each ticker symbol once per second, for example. You might structure your URIs as SYMBOL/TIMESTAMP, or as TIMESTAMP/SYMBOL. Put some thought into which of those might be more useful at query time.
>
> You may want to reduce the size of your in-memory stands. This may sound backward. Folks often try to optimize ingestion by using really large in-memory stands, but with small documents this can be counter-productive. With high-frequency updates and small documents, you may be better off limiting each in-memory stand to less than 32k fragments, and reducing the in-memory limits accordingly so that you can use that memory elsewhere.
>
> After that it will mostly be a question of keeping up with the demands on CPU, memory, and disk. Given modern Xeon CPUs and memory sizes, the disk is probably the hardest part. You want fast sequential writes for journaling and for saving in-memory stands as they fill up. You'll also need fairly good read performance for merges. As a rule of thumb, try to have 10 MB/sec of read-write capacity per 1 MB/sec of incoming XML.
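[Editor's sketch of the SYMBOL/TIMESTAMP URI scheme described above, in XQuery. The `tick` element, its `symbol` child, and the exact URI layout are illustrative assumptions, not from the thread; only xdmp:random and xdmp:document-insert come from Mike's advice.]

```xquery
xquery version "1.0-ml";

(: Sketch: build a contention-free URI from the ticker symbol, a
   timestamp, and a random suffix, then insert the tick document.
   Unique URIs mean each insert creates a new document, so writers
   never contend for the same document lock. :)
declare function local:insert-tick($tick as element(tick))
{
  let $symbol := fn:string($tick/symbol)
  let $stamp  := fn:current-dateTime()
  (: xdmp:random() guards against two ticks for the same symbol
     landing within the same timestamp granularity :)
  let $uri := fn:concat($symbol, "/", $stamp, "/", xdmp:random())
  return xdmp:document-insert($uri, $tick)
};
```

Leading with SYMBOL makes per-symbol directory queries cheap; leading with TIMESTAMP favors time-window scans, as the paragraph above notes.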
> You might also benefit from a little SSD storage configured as a fast data directory for your forests (requires MarkLogic 5). But I think you can hit your targets with spinning disks, as long as you configure them properly.
>
> You'll probably want to have 1-2 forests per filesystem, spread out across multiple block devices, rather than putting everything on one giant filesystem. Consider avoiding RAID entirely, and using forest replication instead. If you do use RAID, use RAID-1 or RAID-10. Avoid RAID-5 and RAID-6, because their write performance is likely to be a problem.
>
> -- Mike
>
> On 6 Feb 2012, at 14:11 , seme...@hotmail.com wrote:
>
>> Not sure, but let's say hundreds a second.
>>
>>> From: m...@blakeley.com
>>> Date: Mon, 6 Feb 2012 14:10:42 -0800
>>> To: general@developer.marklogic.com
>>> Subject: Re: [MarkLogic Dev General] Optimiziing for several writes
>>>
>>> How many inserts/sec do you think the database will need to sustain?
>>>
>>> -- Mike
>>>
>>> On 6 Feb 2012, at 13:57 , seme...@hotmail.com wrote:
>>>
>>>> So I've normally dealt with optimizing MarkLogic for few writes but many reads. In a situation where there are several writes and fewer reads (as with reports on stock ticks, for example), are there any pointers or tips for speeding up writes? I can imagine that reducing the number of indexes helps, as does always writing new files rather than updating existing ones, and keeping the files small. Anything else? I may need some indexes for reporting purposes. And I realize that it may be better to let another system write the data while MarkLogic ingests soon thereafter, but I am interested in truly realtime data views, not next-day or next-hour views into the data.
>>>> thanks,
>>>> Ryan

_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general
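[Editor's sketch of the in-memory stand tuning Mike suggests, using the MarkLogic Admin API. The database name "ticks", the specific limit values, and the choice of which in-memory sizes to shrink are all illustrative assumptions; verify the function names and sensible values against the Admin API documentation for your MarkLogic version before using.]

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $db     := admin:database-get-id($config, "ticks")  (: hypothetical database :)
(: cap each in-memory stand at 32k fragments, per the advice above :)
let $config := admin:database-set-in-memory-limit($config, $db, 32768)
(: shrink the per-stand memory buffers (values in MB are guesses),
   freeing that memory for caches elsewhere :)
let $config := admin:database-set-in-memory-list-size($config, $db, 128)
let $config := admin:database-set-in-memory-tree-size($config, $db, 128)
return admin:save-configuration($config)
```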