RE: [MarkLogic Dev General] Difference between eNode and Data Node

Danny Sokolsky Tue, 24 Mar 2009 09:52:24 -0700

Hi Saptarshi,


There is no requirement that an e-node must also have a forest attached
to it.  In fact, in large implementations, the norm is to configure
e-nodes to do only e-node work and d-nodes to do only d-node work.  That
is what Groups are for.  You might, for example, set up 2 groups, one
for e-nodes and one for d-nodes.  The d-node groups do not need to have
any app servers on them, and the e-node groups do not need any databases
or forests.  This means that each node can devote its entire life (and
all of its resources) to its role.  For example, if you have a group
that only has d-nodes, you do not need to allocate much expanded tree
cache (that is used for e-node processing).  Similarly, if a group is
only e-nodes, they do not need to allocate much list cache or compressed
tree cache.   Be extra careful when changing these values, however, and
make sure you know what role your hosts are playing.
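
To make the role split concrete, here is a minimal Python sketch. It is not MarkLogic configuration or any MarkLogic API: the `Group` class, the `role` and `caches_that_matter` helpers, and the server and forest names are all hypothetical. It only models the rule of thumb above, that a group's role follows from whether it has app servers, forests, or both, and that the role determines which caches deserve memory.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    """Hypothetical stand-in for a group of hosts (not a MarkLogic API)."""
    name: str
    app_servers: list = field(default_factory=list)  # e.g. HTTP, XDBC, WebDAV servers
    forests: list = field(default_factory=list)      # forests attached to the group's hosts

def role(group):
    """Classify a group by the work its hosts are configured to do."""
    evaluates = bool(group.app_servers)
    stores = bool(group.forests)
    if evaluates and stores:
        return "both"
    if evaluates:
        return "e-node"
    if stores:
        return "d-node"
    return "unused"

def caches_that_matter(group):
    """Caches worth sizing generously for the group's role, per the rule of thumb above."""
    by_role = {
        "e-node": ["expanded tree cache"],
        "d-node": ["list cache", "compressed tree cache"],
        "both": ["expanded tree cache", "list cache", "compressed tree cache"],
        "unused": [],
    }
    return by_role[role(group)]

evaluators = Group("evaluators", app_servers=["http-8010"])
data = Group("data", forests=["content-1", "content-2"])
single = Group("Default", app_servers=["http-8010"], forests=["content-1"])

for g in (evaluators, data, single):
    print(g.name, role(g), caches_that_matter(g))
```

A single-host installation falls into the "both" case, which is why changing cache sizes there needs the most care.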

 

Hosts in a MarkLogic Server cluster communicate via the xdqp protocol,
which is an internal communication mechanism.  Any changes to the
cluster are communicated to the other hosts via xdqp, and forest data is
transferred to the e-node via xdqp.  All of this communication happens
automatically.

 

Hope that helps,

-Danny

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Saptarshi Newyork
Sent: Monday, March 23, 2009 10:08 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] Difference between eNode and Data Node

 

Hi All,

Thanks for a great description and examples. I still have a couple of
questions to add:

 

1) I understand that the same node can work as both an eNode and a
dNode, but if I want to have separate eNodes and dNodes, is there any
difference in the host configuration for the two kinds of node?

 

2) In an architecture where both eNodes and dNodes exist, suppose a
request comes to an eNode and requires access to a forest. It is
written that the eNode will send the request to a dNode to access the
forest. But every evaluator node (eNode) is also attached to some
forests. How is this transfer of the request achieved? How can an eNode
make a call to a dNode?  Is there any configuration or coding required
to achieve this? Can an eNode access its own forest under any scenario?

 

Thanks in advance.

regards,

Saptarshi

--- On Mon, 3/23/09, Danny Sokolsky <[email protected]> wrote:

        
        From: Danny Sokolsky <[email protected]>
        Subject: RE: [MarkLogic Dev General] Difference between eNode and Data Node
        To: "General Mark Logic Developer Discussion" <[email protected]>
        Date: Monday, March 23, 2009, 4:18 PM

        Hi Geert,
        
        Thanks for the great description.  I will just add one thing to
        what you said:
        
        Whether a host acts as an e-node or a d-node depends on what it
        is doing at the time, and a given host in a MarkLogic cluster
        can behave as an e-node, a d-node, or both.  For example, if you
        have a single host instance of MarkLogic Server, that host acts
        as both the e-node (to evaluate XQuery) and as the d-node (to
        perform forest operations on content).
        
        -Danny
        
        -----Original Message-----
        From: [email protected]
        [mailto:[email protected]] On Behalf Of Geert Josten
        Sent: Monday, March 23, 2009 12:59 PM
        To: General Mark Logic Developer Discussion
        Subject: RE: [MarkLogic Dev General] Difference between eNode and Data Node
        
        Saptarshi,
        
        I am not an authority on this matter either, but I will try to
        explain as well as possible.
        
        1) MarkLogic Server is designed to operate with evaluator nodes
        and database nodes. The database nodes access content stored in
        forests and perform search queries over the forests. The
        evaluator nodes are responsible for executing the XQuery code,
        WebDAV requests, XDBC calls, etc. If the code to be executed
        doesn't access any content stored in the database (no cts:search
        calls, no doc statements, etc.), but relies purely on content
        constructed in memory, then database nodes are not accessed. It
        has nothing to do with caching of any kind; it is just that
        content can be constructed on the fly, for instance by
        incorporating it directly in the XQuery script. The example Eric
        supplied is valid.
        
        2) MarkLogic Server does not handle failover when filesystems
        crash. The documentation
        (http://developer.marklogic.com/pubs/4.0/books/cluster.pdf)
        explains that filesystem crashes should be handled by using a
        clustered filesystem. There are some suggestions in that
        document, but I can imagine that a RAID configuration might
        suffice for simple situations as well. Forest-level failover
        works as follows: you assign multiple hosts to one physically
        shared forest. These hosts are listed in order. If the 1st host
        drops out, the 2nd host takes that forest over. Replication of
        data is not necessary that way, making it more efficient and
        much more scalable. At the front end you also have the HTTP
        servers etc. on the hosts. You can have as many as you like. By
        putting a hardware or software load balancer in front, you can
        distribute calls coming in at a single port to all available
        'evaluator' nodes. Load balancing is not handled by MarkLogic
        Server itself; there are plenty of solutions readily available,
        so why bother. ;-)
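
As a rough illustration of the two ideas in point 2, here is a small Python sketch. It is not how MarkLogic Server implements failover, and it uses no MarkLogic API; the function names `active_host` and `round_robin` and the host names are made up for the example. The first function picks the first host in the configured order that is still online, and the second hands incoming requests to evaluator hosts in turn, the way an external load balancer might.

```python
from itertools import cycle

def active_host(failover_order, online):
    """Return the first host in the configured order that is online, or None."""
    for host in failover_order:
        if host in online:
            return host
    return None

def round_robin(evaluators):
    """Yield evaluator hosts in turn, forever, as a simple load balancer would."""
    return cycle(evaluators)

# Hosts assigned to one shared forest, listed in failover order (hypothetical names).
order = ["host-a", "host-b", "host-c"]

print(active_host(order, {"host-a", "host-b", "host-c"}))  # host-a
print(active_host(order, {"host-b", "host-c"}))            # host-b takes over
print(active_host(order, set()))                           # None: no host left

# Three requests spread over two evaluator hosts.
balancer = round_robin(["eval-1", "eval-2"])
print([next(balancer) for _ in range(3)])                  # ['eval-1', 'eval-2', 'eval-1']
```

Because the forest is physically shared, taking over here is only a change in which host serves it; no data is copied.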
        
        I am not sure whether an HTTP server is the actual evaluator
        node, but I don't think so. There is this Task Server
        configuration page within the MarkLogic Server Group
        Administration. This configures Task threads on all hosts within
        a single group. I have the impression these act as evaluator
        nodes and the Databases in the MarkLogic Server Administration
        correspond to the database nodes. Forest-level failover is
        configured at the Forest configuration pages.
        
        I hope this makes things clearer to you!
        
        Kind regards,
        Geert
        
        
        
        Drs. G.P.H. Josten
        Consultant
        
        
        http://www.daidalos.nl/
        Daidalos BV
        Source of Innovation
        Hoekeindsehof 1-4
        2665 JZ Bleiswijk
        Tel.: +31 (0) 10 850 1200
        Fax: +31 (0) 10 850 1199
        http://www.daidalos.nl/
        KvK 27164984
        
        
        > From: [email protected]
        > [mailto:[email protected]] On Behalf Of
        > Saptarshi Newyork
        > Sent: Monday, March 23, 2009 12:30
        > To: [email protected]
        > Subject: [MarkLogic Dev General] Difference between eNode and
        > Data Node
        >
        > Hi ,
        > I have a few questions:
        >
        > 1)  What is the difference between eNode and dNode? I have
        > read that e-nodes are required to evaluate XQuery programs,
        > XCC/XDBC requests, WebDAV requests, and other server
        > requests, and that dNodes are those which talk directly with
        > the database/forest. It is also said that if the request does
        > not need any forest data to complete, then an e-node request
        > is evaluated entirely on the e-node. I do not understand how
        > this is possible! If an eNode is meant for XQuery evaluation
        > and XQuery needs XML to process, then every eNode request
        > should talk to a dNode. Is there any caching mechanism? It
        > would be great if anybody could explain this to me.
        >
        >
        >
        > 2) There are two failover mechanisms explained in the
        > documentation: forest-level failover and eNode-level
        > failover. It seems that forest data-level failover is not
        > handled by MarkLogic. For example, if the filesystem crashes,
        > is there any way by which MarkLogic Server replicates the
        > forest to other hosts in the same or a different cluster? If
        > this feature is not presently supported, when can we expect
        > it on the roadmap?
        >
        >
        >
        > Thanks in advance.
        >
        >
        >
        > regards,
        >
        > Saptarshi
        >
        >
        
        _______________________________________________
        General mailing list
        [email protected]
        http://xqzone.com/mailman/listinfo/general

