Re: [MarkLogic Dev General] Reg: E-Node and D-Node configuration

Geert Josten Thu, 31 Oct 2013 07:59:09 -0700

A slight elaboration on the answer below. There are two functions for
acquiring locks: xdmp:lock-acquire and xdmp:lock-for-update. They are very
different in nature.




The first can be applied to documents and directories, and is meant for
long-term ‘locking’. It is tied to the current user, so it is not very
helpful if your code runs with app-level authentication. The functions
xdmp:document-locks, xdmp:directory-locks, and xdmp:collection-locks
operate on that first kind of lock.



Calling the latter explicitly should usually not be necessary. Cluster-wide
update/commit locks are acquired automatically as soon as an update on a
particular fragment is issued. They are also released automatically at the
end of the transaction, which is usually at the end of the request. This is
all part of ACID compliance.
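
To make the automatic behaviour concrete, here is a minimal Python sketch
of a per-URI lock table whose locks are dropped when the transaction ends.
It is a toy model, not MarkLogic's implementation: the names LockTable,
try_lock, release_all and transaction, and the document URIs, are all
hypothetical. A real server would normally make a conflicting transaction
wait; the toy simply reports False.

```python
import threading
from contextlib import contextmanager


class LockTable:
    """Toy per-URI write-lock table (hypothetical, for illustration only)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = {}  # uri -> id of the transaction holding the lock

    def try_lock(self, uri, txn_id):
        # Grant the lock if the URI is free or already held by this transaction.
        with self._guard:
            owner = self._held.setdefault(uri, txn_id)
            return owner == txn_id

    def release_all(self, txn_id):
        # Drop every lock held by the given transaction.
        with self._guard:
            for uri in [u for u, t in self._held.items() if t == txn_id]:
                del self._held[uri]


@contextmanager
def transaction(table, txn_id):
    """Locks taken inside the block are released when the block ends."""
    try:
        yield lambda uri: table.try_lock(uri, txn_id)
    finally:
        table.release_all(txn_id)


table = LockTable()
with transaction(table, "t1") as lock_t1:
    got_doc1 = lock_t1("/docs/document1.xml")
    with transaction(table, "t2") as lock_t2:
        t2_got_doc2 = lock_t2("/docs/document2.xml")  # different URI: granted
        t2_got_doc1 = lock_t2("/docs/document1.xml")  # held by t1: refused
# Both transactions have ended, so document1 is free again.
free_after_commit = table.try_lock("/docs/document1.xml", "t3")
```

In this model a lock on document1 never stops another transaction from
locking document2, and nothing has to be released by hand.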



Kind regards,

Geert



*From:* [email protected] [mailto:
[email protected]] *On Behalf Of* David Ennis
*Sent:* Thursday, 31 October 2013 15:21
*To:* [email protected]
*Subject:* Re: [MarkLogic Dev General] Reg: E-Node and D-Node
configuration



Locks can be obtained at the document (or documents), directory or
collection level.  If you are locking a single document, then only that
document is locked.  If you lock a directory, then there are controls on
what gets locked (depth).

The automatic lock used for updates is xdmp:lock-for-update, which takes a
*single* URI as its parameter.

Regards,
David

On 31/10/13 15:04, Michael Malgeri wrote:

Does a "cluster-wide lock" only pertain to the document that is being
updated?

In other words, if document1 is being updated, can document2 be updated
while the lock is still held on document1?





Michael Malgeri
Principal Technologist
MarkLogic Corporation
[email protected]
Cell: 1 310 704 6403
www.marklogic.com <http://www.marklogic.com>









On 10/29/13 12:39 PM, "Ron Hitchens" <[email protected]> wrote:





  You can think of a MarkLogic cluster as a single virtual
server.  A cluster is made up of nodes (E, D or E/D) but the
cluster should be thought of as an indivisible unit.

  D (data) nodes are MarkLogic processes that have forests
attached.  E (evaluator) nodes are those nodes which run
XQuery/XSLT requests on an appserver.  In a cluster, all nodes
share the same appserver configuration, so any node can be an
E node.  Typically, when configuring dedicated E and D nodes,
you configure things to send requests to only those nodes that
you want to act as E's, allowing the others to act only as D's.



  Communication between nodes in a cluster is basically this:



  For queries (read-only) no locks are needed (read up on MVCC).
Each search operation is fired in parallel to every D node
in the cluster (this is the "map" phase).  When the last D node
has responded, the E node can then merge the results (the "reduce").
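
  The map and reduce phases described above can be sketched in a few
lines of Python.  This only illustrates the pattern, not how MarkLogic
implements it: the node names, the canned result lists and the functions
search_node and scatter_gather are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical per-D-node results as (score, uri) pairs, each list already
# sorted by descending score, as a node-local search might return them.
D_NODE_RESULTS = {
    "d1": [(0.9, "/a.xml"), (0.4, "/c.xml")],
    "d2": [(0.8, "/b.xml"), (0.1, "/e.xml")],
    "d3": [(0.5, "/d.xml")],
}


def search_node(node):
    """Stand-in for the network call from the E node to one D node."""
    return D_NODE_RESULTS[node]


def scatter_gather(nodes, limit):
    # "Map": send the search to every D node in parallel.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(search_node, nodes))
    # "Reduce": merge the sorted partial lists once every node has answered.
    merged = heapq.merge(*partials, key=lambda hit: hit[0], reverse=True)
    return [uri for _, uri in list(merged)[:limit]]


top = scatter_gather(sorted(D_NODE_RESULTS), limit=3)
```

  The merge cannot start until the slowest call in the map phase has
returned, so one slow D node holds up the whole result.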



  So, the lower the latency in communication between nodes, the
better the overall throughput.  You really don't want any slow
links between nodes in the cluster because it can slow down all
the E nodes.



  For update (write), cluster-wide locks must be obtained for
documents that are, or might be, updated.  All nodes in the cluster
must acknowledge the lock(s) before the update(s) can proceed.  This
basically means that updates can't happen faster than the slowest
responding node in the cluster.  Oh, and the locks need to be
released as well, via inter-node communication.
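
  A back-of-the-envelope model of that bound, in Python.  It assumes
one round of messages to all nodes to acquire the locks and one more to
release them, which is a simplification; the latency figures and the
function names are invented for illustration.

```python
def lock_round_trip_us(node_latency_us):
    """Time until every node has acknowledged: the slowest node decides."""
    return max(node_latency_us.values())


def update_time_us(node_latency_us, work_us):
    # Assumed model: one round to acquire the locks, the update itself,
    # then one round to release the locks.
    round_trip = lock_round_trip_us(node_latency_us)
    return round_trip + work_us + round_trip


# Hypothetical round-trip latencies in microseconds.
same_switch = {"d1": 200, "d2": 300, "d3": 250}
one_wan_link = {"d1": 200, "d2": 300, "d3": 80_000}

lan_us = update_time_us(same_switch, work_us=5_000)   # 5_600
wan_us = update_time_us(one_wan_link, work_us=5_000)  # 165_000
```

  With these made-up numbers, a single node behind a slow link makes
the update roughly thirty times slower, even though the other nodes
respond just as quickly as before.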



  Again, bad for overall performance when communication links
between nodes slow down, even with super-fast, beefy hardware.



  As Mike pointed out, clusters are not database replication.
You cluster to improve performance by spreading the immediate
work across multiple CPUs and disks co-located together.  You
can add synchronous replication between nodes in a cluster to
provide for HA failover in the event a node fails.  This has a
latency cost, but makes the cluster more robust.  You replicate
databases asynchronously between clusters to provide for disaster
recovery if an entire cluster is lost or becomes unreachable.



  Hope that helps.



---

Ron Hitchens {[email protected]}  +44 7879 358212



On Oct 28, 2013, at 10:03 PM, Arindam3 B <[email protected]> wrote:





Thanks Mike for the great walkthrough. Just trying to understand more
about the XQDP protocol. Can you throw some light on how it operates
between E-nodes and D-nodes?



Thanks & Regards

Arindam



-----Michael Blakeley <[email protected]> wrote: -----

=======================
To: MarkLogic Developer Discussion <[email protected]>
From: Michael Blakeley <[email protected]>
Date: 10/28/2013 10:31PM
Subject: Re: [MarkLogic Dev General] Reg: E-Node and D-Node
configuration
=======================

  Hosts within a cluster should have low-latency communications:
gigabit ethernet or better. Ideally they should all be on the same
switch and/or VLAN, with no router hops between hosts. If you try to set
up a cluster across a WAN link you are likely to see poor performance
and poor reliability. You might be trying to handle high availability
(HA) and disaster recovery (DR) with a single cluster: that would be a
mistake.

For high availability, use a single cluster with low-latency
communications. Configure forest replication and host failover to
provide the desired degree of protection against host failures. The docs
at http://docs.marklogic.com/guide/cluster/failover talk about this as
"local-disk failover".

For disaster recovery - scenarios where an entire data center goes
offline - use database replication to a different cluster. This can use
higher-latency communications, such as a WAN link. The docs at
http://docs.marklogic.com/guide/database-replication describe this. The
DR replica cluster can also implement local-disk failover to provide its
own HA.



-- Mike



On 28 Oct 2013, at 06:41, Arindam3 B <[email protected]> wrote:



Hi,



I had a query regarding the E-Node and D-Node setup in MarkLogic.

In a distributed environment, if I plan to keep the E-Nodes and D-Nodes
separately in different physical locations over the LAN or WAN (across
geographies), what is the potential risk?
How does failover work in that scenario?
I have read that E-Nodes and D-Nodes communicate through the XQDP
protocol, so in this case will there be performance issues?

Does MarkLogic recommend having the E-Node and D-Node cluster in the same
physical box?
If so, then across the network if we have a set of E-D-Nodes, how is
the network latency reduced while syncing the data during replication?

If you can provide me with some information about the XQDP protocol it
would be great!!



Thanks & Regards

Arindam Bose






_______________________________________________

General mailing list

[email protected]

http://developer.marklogic.com/mailman/listinfo/general


