Thanks a bunch, Danny!

When I posted these questions, I was feeling overwhelmed... I feel a little
better! I really want to learn MarkLogic because I can see it's going to
help the government, especially the military, tremendously.

Thanks again for y'all's guidance!

Bob O.

On Tue, May 21, 2013 at 4:41 PM, Danny Sokolsky <
[email protected]> wrote:

>  OK, so it sounds like your forest sizes actually are in MB, not GB
> (unless you made the same mistake I did :), so that does not sound so bad.
> Given that you are using so many range indexes and that you only have a 6GB
> slice, you might be using up lots of memory.  Look to tools like top to
> help you figure that out.
>
> I would try and concentrate on understanding your loading issues.   You
> need to figure out things like what the code is doing, how it is
> structured, and be a little more concrete about how fast or slow it is.  For
> example, you might be locking more documents than you need to during load.
> So try and get a test case that shows your issues, then you can start to
> whittle it down.  That is where having your own sandbox can help.
>
> As to whether you will be able to see the errors if you turn off uncaught
> errors, the errors will still go back to the client, but not to the logs.
>
> -Danny
>
> *From:* [email protected] [mailto:
> [email protected]] *On Behalf Of *Bob O
> *Sent:* Tuesday, May 21, 2013 2:22 PM
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] ML Project Issues
>
> Danny,
>
> So here's what I gathered so far to answer your questions:
>
> ·         I am confused what version you are on – is it 5.0-4.1 for this
> project, or 4.0?
> We are currently on MarkLogic V5.0-4.1.
>
> ·         Is this production or development?  If it is production, you
> might consider contacting MarkLogic support.
> I am using a test bed. We currently have test, dev, and prod
> environments. We deploy VMs from prod.
>
> ·         Do you have a test cluster?  If not, I would make that a
> priority so you can try stuff easily.
> I believe that's what I'm using right now, but it is shared by developers
> and engineers. Did you mean for me to build one on my own box?
>
> ·         600MB forests sound very large.  The rule-of-thumb for size is
> max 200MB, so you are way off here.  But the important number is how many
> fragments are in the forests.  You should be able to get that number from
> the database status page (show forest info), or from the xdmp:forest-status
> function.
> We have 4 forests (I'll call them F1, F2, F3, F4). Their sizes are 362MB,
> 348MB, 340MB, and 345MB respectively. Each forest has over 48,000
> fragments, with "deleted fragments" ranging from 9,500 to 11,100.  Each
> forest averages 25,000+ documents.
>
> ·         How big are your VMs? (How much memory)
> The VMs have 6GB of memory with 2 CPUs and 5 hard disks (60GB for the 1st
> HD and 610GB for each of the remaining 4 HDs), for a total of 2,500GB of
> disk space (2.5TB).
>
> ·         How many Range indexes?
> I counted 50 Range Indexes and 6 Range Attribute Indexes.
>
> ·         How is your I/O rate on the system?  Ideally, it should be
> capable of roughly 20MB/sec per forest.
> I was told 200MB/sec... I haven't verified this. Where can I verify this
> number? Under the Database status page?
>
> ·         As far as the logs, you can turn off log uncaught errors on the
> App Server doing the loads (although you might need that info).  The more
> interesting question is why the loads are throwing errors.
> The "log uncaught errors" setting is turned ON on ALL HTTP servers. Will I
> still be able to see and diagnose the errors if I set it to "false"? I'm
> still looking into why the loads are failing.
>
> ·         How many nodes in this cluster? Without much info, my guess is
> that finding some decent disk is a high priority. That should give you a
> few things to scratch your head over.
> I believe there's only one node and it is not coupled. Is there a way to
> check how many nodes there are?
>
> On Tue, May 21, 2013 at 11:21 AM, Danny Sokolsky <
> [email protected]> wrote:
>
> Hi Bob,
>
> Here are a few questions and a few things I would focus on:
>
> ·         I am confused what version you are on – is it 5.0-4.1 for this
> project, or 4.0?
>
> ·         Is this production or development?  If it is production, you
> might consider contacting MarkLogic support.
>
> ·         Do you have a test cluster?  If not, I would make that a
> priority so you can try stuff easily.
>
> ·         600MB forests sound very large.  The rule-of-thumb for size is
> max 200MB, so you are way off here.  But the important number is how many
> fragments are in the forests.  You should be able to get that number from
> the database status page (show forest info), or from the xdmp:forest-status
> function.
>
> ·         How big are your VMs? (how much memory)
>
> ·         How many Range indexes?
>
> ·         How is your I/O rate on the system?  Ideally, it should be
> capable of roughly 20MB/sec per forest.
>
> ·         As far as the logs, you can turn off log uncaught errors on the
> App Server doing the loads (although you might need that info).  The more
> interesting question is why the loads are throwing errors.
>
> ·         How many nodes in this cluster?
>
> Without much info, my guess is that finding some decent disk is a high
> priority.
>
> That should give you a few things to scratch your head over.
>
> -Danny
>
> *From:* [email protected] [mailto:
> [email protected]] *On Behalf Of *Bob O
> *Sent:* Tuesday, May 21, 2013 8:39 AM
> *To:* [email protected]
> *Subject:* [MarkLogic Dev General] ML Project Issues
>
> Hello Everyone,
>
> I am taking over a new project that I would consider large scale. I was
> hired as an ML DBA, but I am really fairly new to MarkLogic. We were using
> ML 4.0, and on this project they are using ML v5.0-4.1 and deploying the
> product on VMs.
>
> They are running into a bunch of issues and I feel overwhelmed by them. I
> have seen some of them before. Some of the issues are these:
>
> 1) Logging issue: every time their ingestion errors out, it logs
> everything about it, which amounts to about 2MB each time. This happens
> quite often, and they are getting tons of logs in a short period of time.
> Is there a way to minimize what the logs spit out and cut down the extra
> unnecessary information?
>
> 2) Ingestion is slow: anything could be causing the ingestion to be so
> slow. Where should I look for the cause? I have contacted the SW developer
> on the ingestion process and am still waiting for his response. I am told
> that they are using an in-house app called DDMS that I am not familiar
> with.
>
> 3) Forest space: how do I check whether their forest space is enough? They
> have 4 forests, each around 600GB. Is there a formula to properly figure
> out the space allocation for each forest and to plan for future use?
>
> 4) Performance issues: they are experiencing some latency issues, CPU-IO
> scheduler, and they're fixing to buy NAS servers for their storage
> management.
>
> I apologize for dropping all of these issues at once, but I figure there
> are more brains out there than this one. I feel I have taken on a much
> bigger task and role than I can handle. I appreciate any assistance or
> direction anyone can give.
>
> --BobO
>
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general