subject:"\[jira\] \[Commented\] \(HADOOP\-10641\) Introduce Coordination Engine"

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-25 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14108989#comment-14108989
 ] 

Steve Loughran commented on HADOOP-10641:
-

Thank your for the really good TLA document —I do think it makes what's going 
on a lot clearer as now others can see strictly what implementations do. (It 
now places a requirement to me to do the same for my proposals, but I'm happy 
with that). 

It also places a requirement for me to look at the code alongside the spec, so 
here goes:

h3. Algorithm

# presumably usedIds aren't collected forever in common implementations, 
instead they use enough of a time marker in their IDs to be self-windowing.

h3. core code

# {{CoordinationEngine}} should extend Hadoop common's {{AbstractService}}; 
this makes it trivial to integrate with the lifecycle of YARN apps and when 
someone migrates the NN/DN to the lifecycle will hook to HDFS the same way. The 
init/start/stop operations all match. 
# {{ZKConfigKeys}} needs a name more tied to the coordination engine.
# {{ZKCoordinationEngine}} could use guava {{Precondition}} as a check on 
{{localNodeId}} and move it up to the init method.
# I don't like downgrading all ZK exceptions to IOE, as it hides things like 
security exceptions, missing parents c. Again this is something we could do 
that is more generic than just for the co-ord engine, as I've ended up doing 
some of this in my code.

# {{ZKCoordinationEngine.loopLearningUntilCaughtUp()}} 
# {{createOrGetGlobalSequenceNumber}}'s  {{while(true)}} loop appears to spin 
forever if the exception raised in its ZK actions is 
{{KeeperException.NoAuthException}}... that is, if starting on a secure cluster 
where it can't access the path. More filtering of exception types is needed 
with the unrecoverables thrown up ({{SessionExpiredException}} +maybe some 
others). 

# {{ZKCoordinationEngine.submitProposal()}}: needs better exception text, 
ideally including path and text of nested exception.
# I'm not sure I like the way {{ZKCoordinationEngine.processImpl()}} shuts 
itself down. I'd prefer some exception for the caller to process, so that owner 
of the engine is in sync.
# if an {{AgreementsThread}} fails its exception doesn't get picked up ... 
these need to be propagated back to the junit thread. Ideall also logged in 
that {{AgreementsThread}}  after {{Thread.currentThread().setName()}} has given 
it a name for the log statements. 
  
h3. tests
  
# MiniZKCluster ... I have the one from Twill converted to a YARN svc; this is 
is one i'd like to see this switch to *once the code is checked in*. It's 
lighter weight than the HBase one.
# there's always a delay for ZK startup. Could you make the test cases start 
one as a @BeforeClass and all share the same one?
# needs tests for failure conditions: no ZK cluster
# if there's a way to do this, tests for against a secure cluster, both 
succeeding and failing.
# add a test timeout via {{@Rule public final Timeout testTimeout = new 
Timeout(3);}}
  
  
h3. minor

# minor: needs formatting to style, use of {} in all conditional clauses.
# Can you use SLF4K everywhere -it's more efficient as can do string expansion 
only when needed  will stop people complaining that every log action needs to 
be wrapped by log-level checks. Then switch calls like {{LOG.info(Got watched 
event:  + watchedEvent)}} to {{LOG.info(Got watched event: {}, 
watchedEvent)}}



h2. What now?


I'm pretty happy with it —though more reviewers are needed. What now? 
# We commit {{hadoop-coordination}} with the implementation into Hadoop common, 
and work towards getting it into a version of the NN which can use it for 
committing operations; maybe later in YARN and downstream.
# We work with the curator team to get it into curator and pull that into 
server-side Hadoop. YARN-913 is going to need that in the RM anyway, though so 
far HDFS hasn't. Strengths: lives with the rest of their ZK work. Weaknesses: 
looser coupling to Hadoop  a different release cycle. 
# it goes into a hadoop-zookeeper module that depends on ZK  Curator, and on 
which things which need these can depend on, including client-side code that 
wants it (ZK trickles out somehow already, pulling it apart would be cleaner). 
For example, my mini-ZK cluster as YARN service could be anothe service to add.

Strategy 3 appeals to me : it's not that different from what is there today 
(indeed, we just fix the module name for now  add more features as/when needed)


 Introduce Coordination Engine interface
 ---

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch,

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-25 Thread Steve Loughran (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14108993#comment-14108993
]

Steve Loughran commented on HADOOP-10641:
-

oh, one more thing.. could you add the .tla file to the patch too. maybe we
could start having {{src/tla}} as the home for these files

Introduce Coordination Engine interface
---

Key: HADOOP-10641
URL: https://issues.apache.org/jira/browse/HADOOP-10641
Project: Hadoop Common
Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
Attachments: HADOOP-10641.patch, HADOOP-10641.patch,
HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch,
NNThroughputBenchmark Results.pdf, ce-tla.zip, hadoop-coordination.patch,
zkCEBenchmark.pdf, zkbench.pdf

Coordination Engine (CE) is a system, which allows to agree on a sequence of
events in a distributed system. In order to be reliable CE should be
distributed by itself.
Coordination Engine can be based on different algorithms (paxos, raft, 2PC,
zab) and have different implementations, depending on use cases, reliability,
availability, and performance requirements.
CE should have a common API, so that it could serve as a pluggable component
in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and
HBase (HBASE-10909).
First implementation is proposed to be based on ZooKeeper.

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-25 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109596#comment-14109596
 ] 

Konstantin Shvachko commented on HADOOP-10641:
--

Thanks Henry and Andrey for the benchmark results. To summarize this, we have 
three benchmarks
# NNThroughputBenchmark, which gives us the upper bound of NN throughput.
# ZK benchmark, which measures the performance of ZK itself.
# ZK-CE benchmark measuring performance of CE based on ZK.

Ideally we would like to see NNThroughput = ZK-CE throughput = ZK throughput.
Just looking at create operation 
- NNThroughput yields 13K ops/sec with 400 threads, which seems to be optimal 
for that hardware configuration.
- ZK throughput is substantially higher on SSD: 34K ops/sec
- ZK-CE runs at 8.4K ops/sec, which is slower than NNThroughput.

So there is work to do here. I think CE implementation can be optimized to get 
on par with or close to ZK performance.

 Introduce Coordination Engine interface
 ---

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, 
 NNThroughputBenchmark Results.pdf, ce-tla.zip, hadoop-coordination.patch, 
 zkCEBenchmark.pdf, zkbench.pdf


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-25 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110070#comment-14110070
]

Konstantin Shvachko commented on HADOOP-10641:
--

Steve, thanks for thorough code review.
The question about what's next has been discussed in this jira (see Aaron's
and others comments), on the meetup on July 15 and in person. The decision
everybody agreed on is to continue the CE and CNode development on a branch.
- I would be glad to commit it to hadoop-common as you propose. That way we can
see faster adoption.
- I don't think the interface should be tied to ZooKeeper itself or higher
level projects on top of it like Curator. Because CE is intended to be have
implementations for different consensus algorithms.

Introduce Coordination Engine interface
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-22 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107691#comment-14107691
]

Hadoop QA commented on HADOOP-10641:

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12663755/zkbench.pdf
against trunk revision .

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-HADOOP-Build/4535//console

This message is automatically generated.

Introduce Coordination Engine interface
---

Key: HADOOP-10641
URL: https://issues.apache.org/jira/browse/HADOOP-10641
Project: Hadoop Common
Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
Attachments: HADOOP-10641.patch, HADOOP-10641.patch,
HADOOP-10641.patch, HADOOP-10641.patch, NNThroughputBenchmark Results.pdf,
ce-tla.zip, hadoop-coordination.patch, zkbench.pdf

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-22 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107750#comment-14107750
 ] 

Plamen Jeliazkov commented on HADOOP-10641:
---

The test, testSimpleProposals, was also updated to validate that the 
CoordinateEngine's GlobalSequenceNumber is incrementing monotonically per 
Agreement reached.

 Introduce Coordination Engine interface
 ---

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, 
 NNThroughputBenchmark Results.pdf, ce-tla.zip, hadoop-coordination.patch, 
 zkbench.pdf


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107768#comment-14107768
 ] 

Hadoop QA commented on HADOOP-10641:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12663803/HADOOP-10641.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.
See 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4537//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-coordination.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4537//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4537//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4537//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-coordination.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4537//console

This message is automatically generated.

 Introduce Coordination Engine interface
 ---

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, 
 NNThroughputBenchmark Results.pdf, ce-tla.zip, hadoop-coordination.patch, 
 zkbench.pdf


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-19 Thread Alex Newman (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102447#comment-14102447
 ] 

Alex Newman commented on HADOOP-10641:
--

[~mparkin] this looks really neat. 

[~ste...@apache.org] I am curious if that helps at all? 

 Introduce Coordination Engine interface
 ---

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, HADOOP-10641.patch, ce-tla.zip, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085084#comment-14085084
 ] 

Hadoop QA commented on HADOOP-10641:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12659686/HADOOP-10641.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-coordination.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4420//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4420//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4420//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-coordination.html
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4420//console

This message is automatically generated.

 Introduce Coordination Engine interface
 ---

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-08-04 Thread Konstantin Boudnik (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085170#comment-14085170
 ] 

Konstantin Boudnik commented on HADOOP-10641:
-

Looks like new pom.xml file is missing the ASL boiler-plate.

 Introduce Coordination Engine interface
 ---

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-07-31 Thread Steve Loughran (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080800#comment-14080800
]

Steve Loughran commented on HADOOP-10641:
-

bq. It seems that you cite HADOOP-9361 as an example of a specification of an
API. Same as Andrew I could not find specifications or any documents linked
there. May be you can clarify. Is it a TLA+ spec of HDFS?

Look in [the site
docs|https://github.com/apache/hadoop-common/tree/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem].

I actually used the Z model with a python syntax in the hope it would be more
broadly understood, though now that I'm trying to integrate it with other work
I'm trying to think shall I go back and TLA+ it?...because then its possible
to start thinking about code using the public module definitions.

bq. If FileSystem specs are done and the tests developed according to specs,
should that be sufficient to safeguard FS behaviour of any internal changes
including introduction of CE? Which I assume is your main concern with this
jira.

My concern is slightly different: we're producing a plugin point for
co-ordination across the Hadoop stack, and we need to know what it is meant to
do.

bq. If I introduce an implementation of CE using ZK and it does not break the
tests and therefore does not alter FileSystem semantics, isn't that a
verification of the implementation.

The real HDFS tests are the full stack, with failure injection...those FS API
ones are limited to those API calls and how things like {{seek()}} work. It'll
be the full stack tests with failure injection that will highlight where a test
fails after the changes.

As for the tests in this JIRA so far, they're pretty minimal and verify that
the ZK implementation does something provided nothing appears to fail. They
don't seem designed to be run against any other implementation of the
interface, nor is there any failure injection. Even without the fault
injection, any tests for this should be designed to be targetable at any
implementation, to show consistent behaviour in the core actions.

Such tests won't verify robustness though...it looks to me that I could
implement this API using in-memory data structures, something would be utterly
lacking in the durability things need. Or I could try to use Gossip, which may
have the durability, but a different ordering guarantee which may show up in
production. Its things like the latter I'm hoping to catch, by spelling out to
implementors what they have to do.

bq. It looks like you propose to introduce a new requirement for Hadoop
contributions, that new features should be formally specified and a
mathematically proven. I think this should be discussed in a separate thread
before it can be enforced. Would be good to hear what people think.

Proven? Not a chance. What I would like to see is those critical co-ordination
points to be defined formally enough that there's no ambiguity about what they
can do. This proposal is for a plugin to define what HDFS, HBase c will expect
from a consensus service, so we have to ensure there's no ambiguity. Then we
can use those documents to derive the tests to break things.

bq. Generally great minds were thinking about disciplined software development
techniques way back, probably starting from Dijkstra and Donald Knuth. I found
them very useful dealing with complex algorithms, not sure about APIs.

+Parnas, who invented interfaces as the combination of {{(signature,
semantics)}} in 1972. What I'm trying to do here is get those semantics nailed
down.

Introduce Coordination Engine interface
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-07-29 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078724#comment-14078724
]

Konstantin Shvachko commented on HADOOP-10641:
--

Steve, I am glad we are talking about the same thing now, which is the
CoordinationEngine interface.
# It seems that you cite HADOOP-9361 as an example of a specification of an
API. Same as Andrew I could not find specifications or any documents linked
there. May be you can clarify. Is it a TLA+ spec of HDFS?
# If FileSystem specs are done and the tests developed according to specs,
should that be sufficient to safeguard FS behaviour of any internal changes
including introduction of CE? Which I assume is your main concern with this
jira.
If I introduce an implementation of CE using ZK and it does not break the tests
and therefore does not alter FileSystem semantics, isn't that a verification of
the implementation.
# It looks like you propose to introduce a new requirement for Hadoop
contributions, that new features should be _formally specified and
mathematically proven_. I think this should be discussed in a separate thread
before it can be enforced. Would be good to hear what people think.
# Generally great minds were thinking about disciplined software development
techniques way back, probably starting from Dijkstra and Donald Knuth. I found
them very useful dealing with complex algorithms, not sure about APIs.

Introduce Coordination Engine interface
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-07-29 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078733#comment-14078733
]

Konstantin Shvachko commented on HADOOP-10641:
--

I asked this question in the linked jira and want to post it here.
Do the points raised so far are objections to creating a branch and starting
the implementation of CoordinationEngine along with ConsensusNode on it? As it
was agreed on the July 15 meetup.

Introduce Coordination Engine interface
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface

2014-07-24 Thread Steve Loughran (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073075#comment-14073075
]

Steve Loughran commented on HADOOP-10641:
-

bq. this jira is not proposing new Consensus protocols, as stated in this
comment. CoordinationEngine here is an interface to be used with existing
consensus algorithms,

Exactly. This JIRA is proposing a plugin interface to co-ordination systems
using consensus algorithms, a plugin point intended for use by HDFS and others.
It is absolutely critical that all implementations of this plug in do exactly
what is expected of them -and we cannot do that without a clear definition of
what they are meant to do, what guarantees must be met and what failure modes
are expected.

The consensus node design document is not such a document. It's an outline of
what can be done, but it doesn't specify the API. The current patch for this
JIRA contains some interfaces, a ZK class and a single test case. Can we trust
this ZK class to do what is required? Not without a clear definition of what is
required. Can we trust the test case to verify that the ZK implementations does
what is required? Not now, no. What do we do if there is a difference between
what the ZK implementation does and the interface defines -is it the interface
at fault, or the ZK implementation? What if a third-party implementation does
something differently? Whose implementation is considered the correct one?

For the filesystems, HDFS defines the behavior; my '9361 JIRA was deriving a
specification from that implementation, generating more corner case tests, and
making the details of how (every) other filesystem behaves differently a
declarative bit of XML for each FS -now we can see how they differ. We've even
used it to bring the other filesystems (especially S3N) more in line with what
is expected.

This new plugin point is intended become a critical failure point for HDFS and
YARN, where the incorrect behaviour of an implementations potentially places
data at risk. Yet to date, all we have is a PDF file which, as Amazon describes
it conventional design documents consist of prose, static diagrams, and
perhaps pseudo-code in an ad hoc untestable language.

This is not a full consensus protocol; it will be straightforward to specify
strictly enough to derive tests, to tell implementors of consensus
protocol-based systems how to hook up their work to Hadoop. And, as those
implementors are expected to be experts in distributed systems and such topics,
we should be able to expect them to pick up basic specification languages just
as we expect submitters of all patches to be able to write JUnit tests.

Introduce Coordination Engine interface
---

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-21 Thread Steve Loughran (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068815#comment-14068815
]

Steve Loughran commented on HADOOP-10641:
-

I'm very much an =0 to the changes to HDFS, as that level is not an area of my
understanding. If something does go into HDFS, then as noted, hadoop-common
does seem an appropriate location - if it can't go into hadoop-hdfs itself.

Before that happens, consider this:

Consensus protocols are where CS-hard mathematics comes out of the textbooks
and into the codebase; it is a key place where you are expected to prove the
correctness of your algorithm before your peers will trust it. And, hopefully,
before you make the correctness of that algorithm a critical part of your own
application.

If Hadoop is going to provide a plug-in point for distributed co-ordination
systems -- which is what this proposal is -- then we need to specify what is
expected of an implementation strictly enough that it is possible to prove that
implementations meet the specification, and that downstream projects can
demonstrate that if an implementation meets this specification then their own
algorithms with be correct.

More succinctly: *I want a formal specification of the API, and what we have in
the current PDF design document is not it. I will also need evidence that the
reference ZK implementation is consistent with that specification, both by any
maths that can be provided, and the test cases derived from the specification.

This may seem a harsh requirement, but HADOOP-9361 shows that it is nothing I
would not impose on myself. It is [What Amazon is doing in their
stack|http://perspectives.mvdirona.com/CommentView,guid,5638e46e-eb91-4dbf-9f75-351afcb7a199.aspx],
and it has also been done for [Distributed File
Systems|https://birrell.org/andrew/papers/FileSysSpec-DCCS.pdf].

I would recommend using TLA+ here -and for any downstream uses. Once the
foundations are done, then we can move onto YARN, and then finally to the
applications which run on it.

I'm not going to comment on the code at all at this point, except to observe
that you should be making this a YARN service to integrate with the rest of the
services and workflow being built around them. The core classes are in
hadoop-common.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-21 Thread Andrew Purtell (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068857#comment-14068857
]

Andrew Purtell commented on HADOOP-10641:
-

bq. I want a formal specification of the API, and what we have in the current
PDF design document is not it. I will also need evidence that the reference ZK
implementation is consistent with that specification, both by any maths that
can be provided, and the test cases derived from the specification.

This is a good idea in the abstract, but the notion of applying Amazon's
process to a volunteer open source project is problematic. In terms of the
Hadoop contribution process, this is a novel requirement. It is up to the
Hadoop committership to determine commit criteria of course, but I humbly
suggest that the intersection of contributors able to mathematically prove the
correctness of a large code change while simultaneously being able to implement
production quality systems code is vanishingly small. In this case, the
contributors might be able to meet the challenge but going forward if
significant changes to Hadoop will require a team of engineers and
mathematicians, probably this marks the end of external contributions to the
project. Also, I looked at HADOOP-9361. The documentation updates there are
fantastic but I did not find any mathematical proofs of correctness.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-21 Thread Steve Loughran (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069344#comment-14069344
]

Steve Loughran commented on HADOOP-10641:
-

bq. This is a good idea in the abstract, but the notion of applying Amazon's
process to a volunteer open source project is problematic.

Consensus protocols are expected to provide proofs of the algorithms
correctness; anything derived from Paxos, Raft et al rely on those algorithms
being considered valid, and the implementors being able to understand the
algorithms. Open source consensus protocol *implementations* are expected to
publish their inner workings, else they can't be trusted. I will site Apache
Zookeeper's [ZAB protocol|http://web.stanford.edu/class/cs347/reading/zab.pdf],
and [Anubis's consistent T-space
model|http://www.hpl.hp.com/techreports/2005/HPL-2005-72.html], as examples of
two OSS products that I have used and implementations that I trust.

bq. In terms of the Hadoop contribution process, this is a novel requirement.

Implementations of distributed consensus protocols already a one place where
the team needs people who understands the maths. If a team implementing a
protocol aren't able to specify it formally in some form or other: run. And if
someone tries to submit changes to the core protocols of an OSS implementation
who can't prove that it works, I would hope that the patch will be rejected.

Which is why I believe this specific JIRA provide an API and reference
implementation of distributed updates is suitable for the criteria provide a
strict specification. I'm confident that someone in the WanDisco dev team will
be able to do this, and would make understand this specification a pre req
for anyone else doing their own implementation.

Even so, we can't expect complete proofs of correctness. Which is why I said
any maths that can be provided, and test cases.

For HADOOP-9361, the test cases were the main outcome: by enumerating
invariants and pre/post conditions, some places where we didn't have enough
tests became apparent. These were mostly failure modes of some operations (e.g.
what happens when preconditions aren't met).

Derived tests are great as:
# Jenkins can run them; you can't get mathematicians to prove things during
automated regression tests.
# It makes it easier to decide if a test failure is due to an error in the
test, or a failure of the code. If a specification-derived test fails, then it
is now due to either an error in the specification or the code.

I think we need to do the same here: from a specification of the API, build the
test cases which can verify the behavior as well as local tests can. Those
implementors of the back end now get those tests alongside a specification
which defines what they have to implement.

The next issue becomes can people implementing things understand the
specification?. It's why I used a notation that uses Python expressions and
data structures; one that should be easy to understand. It's also why users of
the TLA+ stuff in the Java C/C++ world tend to use the curly-braced form of
the language.

I'm sorry if this appears harsh or that I've suddenly added a new criteria to
what Hadoop patches have to do, but given this Coordination Manager is proposed
as a central part in a future HDFS and YARN RM, then yes, we do have to define
it properly.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-21 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069625#comment-14069625
 ] 

Konstantin Shvachko commented on HADOOP-10641:
--

??Consensus protocols are expected to provide proofs of the algorithms 
correctness??

Steve this jira is not proposing new Consensus protocols, as [stated in this 
comment|https://issues.apache.org/jira/browse/HADOOP-10641?focusedCommentId=14027111page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14027111].
CoordinationEngine here is an interface to be used with existing consensus 
algorithms, which indeed go through a rigorous math scrutiny before they become 
trustworthy.


 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-17 Thread Konstantin Boudnik (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064621#comment-14064621
 ] 

Konstantin Boudnik commented on HADOOP-10641:
-

As has been proposed above and agreed during the meet-up yesterday, I will go 
ahead and clear new branch {{ConsensusNode}} off the trunk, so we'll start 
adding the implementation there.

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-17 Thread Alex Newman (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064623#comment-14064623
 ] 

Alex Newman commented on HADOOP-10641:
--

Hey dude. Should we delay this a bit?

On Wed, Jul 16, 2014 at 11:11 PM, Konstantin Boudnik (JIRA)


 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-17 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064987#comment-14064987
 ] 

Allen Wittenauer commented on HADOOP-10641:
---

Did you mean ConsensusNameNode?

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-17 Thread Konstantin Boudnik (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065754#comment-14065754
 ] 

Konstantin Boudnik commented on HADOOP-10641:
-

I think in the code we refer to it as CNode or ConsensusNode, hence the name 
for the branch.

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-15 Thread Plamen Jeliazkov (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063127#comment-14063127
 ] 

Plamen Jeliazkov commented on HADOOP-10641:
---

We hosted a meet-up at the WANdisco office in San Ramon today. Thank you to 
everyone who came. I'd especially like to thank [~atm] and [~sanjay.radia] for 
taking their time to connect with us.

I took the liberty to record some of the comments / concerns people raised 
during our meet-up. I will list all of them here and provide a few responses.

* Is NoQuorumException and ProposalNotAcceptedException enough? Are there other 
exceptions CoordinationEngine might throw?
** My own feeling is that these two in particular were the most general and 
universal. We could always add IOException, if desired.  

* In submitProposal() there is ProposalReturnCode return value and possible 
Exception to be thrown. It is unclear which one we should use.
** I agree. Konstantin looked at me for an answer during this but I remained 
silent. The reason for this is for ProposalReturnCode to return a deterministic 
result (NoQuorum has a deterministic event; the Proposal was not sent), and to 
treat the Exception case as something wrong with the Proposal itself (i.e., 
doesn't implement equal() or hashcode() correctly, or cannot be serialized 
properly). I understand the confusion and we could do better with just the 
Exception case.

* ConsensusNode is non-specific. Consider renaming the project to 
ConsensusNameNode.
** This applies to HDFS-6469. I think ConsensusNameNode is a good name. I'll 
probably always continue to call them CNodes though. :)

* Concern for PAXOS to effectively load balance clients. Two round trips makes 
writes slow.

* CNodeProxyProvider should allow for deterministic host selection. Consider a 
round-robin approach.

* We are weakening read semantics to provide the fast read path. This makes 
stale reads possible.
** Konstantin discussed the 'coordinated read' mechanism and how we ensure 
clients talk to up-to-date NameNodes via Proposals.

* Sub-namespace WAN replication is highly desirable but double-journaling in 
the CoordinationEngine and the EditsLog is concerning.

* An address of the impact on write performance is desirable by the community.

* HBase coming up with WAL plugin for possible coordination. Wary of membership 
coordination (multiple Distributed State Machines) for HBase WALs.

* Small separate project might make it more likely for people to import CE into 
their own projects and build their own CoordinationEngines. Separate branch 
also possible.

Some of these clearly correspond to the HDFS and HBase projects and not just 
the CoordinationEngine itself. Apologies if I missed anyone's concern / point; 
pretty sure I captured everybody though.

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-14 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061357#comment-14061357
]

Aaron T. Myers commented on HADOOP-10641:
-

bq. There is no resistance. The plan has always been to build CNode on a
branch. I am just trying to optimize development of CNode and HBase region
replication, which is going on in parallel. My thinking was to commit the CE
interface to trunk and then branch off HDFS of it. That way both both HDFS and
HBase can use the interface.

I'm not comfortable with committing this to Hadoop trunk before it's actually
something that Hadop trunk will use. How about committing this to both HBase
and the HDFS-6469 development branch? Or, you could of course go the route I
originally suggested of making the CE interface and ZK reference implementation
an entirely separate project that both HBase and the HDFS-6469 branch could
depend on.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-14 Thread Konstantin Boudnik (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061397#comment-14061397
]

Konstantin Boudnik commented on HADOOP-10641:
-

bq. I'm not comfortable with committing this to Hadoop trunk before it's
actually something that Hadop trunk will use.
This is a chicken-n-egg problem, don't you think? You don't want to get this
piece into common before something in the trunk will use it. However, it isn't
possible to have _anything_ in the trunk to use the APO until it is committed.
Am I missing anything?

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-14 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061447#comment-14061447
]

Aaron T. Myers commented on HADOOP-10641:
-

I'm saying you should commit the Coordination Engine interface to the
ConsensusNode feature branch and use it on that branch, and then at some point
we may merge the whole branch to trunk, CE and CN simultaneously. This is
exactly what I said previously:

{quote}
I'm fine with you proceeding with this on a development branch. That will give
you an opportunity to commit the coordination engine interface and start making
progress on HDFS-6469. If and when that materializes as a stable system that
the community wants to adopt into Hadoop, then we'll merge it back to trunk
just like we've done with many large features that are better accomplished via
multiple JIRAs and doing the work piecemeal.
{quote}

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-12 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14059687#comment-14059687
 ] 

Eric Yang commented on HADOOP-10641:


Should executeAgreement updateGSN first or doExecute first?  It seems more 
reliable to do 2 phase commit, otherwise you can run into situations where GSN 
is updated but execution failed.

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-11 Thread Konstantin Boudnik (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14059098#comment-14059098
]

Konstantin Boudnik commented on HADOOP-10641:
-

bq. Please give me a little more credit than this. Even though it may not be
mentioned in the design doc, it's fairly transparent that the primary goal of
this work is to introduce a plugin point for WANdisco's coordination engine
implementation into Hadoop.
And as Konstantin said elsewhere: Hadoop has a number of features that are
targeted to a proprietary technologies, which doesn't seem to be bothering
anyone this far. So, I can't consider this as a real objection.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-11 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14059130#comment-14059130
]

Konstantin Shvachko commented on HADOOP-10641:
--

Sorry if it sounded as an overreaction, non intended.

??doing this work on a branch??
There is no resistance. The plan has always been to build CNode on a branch. I
am just trying to optimize development of CNode and HBase region replication,
which is going on in parallel. My thinking was to commit the CE interface to
trunk and then branch off HDFS of it. That way both both HDFS and HBase can use
the interface.

??the primary goal of this work is to introduce a plugin point for WANdisco's
coordination engine implementation??
I don't see anything bad with plugging in WANdisco CE into Hadoop, [as I argued
in the other jira
comment|https://issues.apache.org/jira/browse/HDFS-6469?focusedCommentId=14057235page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14057235].
But saying its a primary goal is not fair, you know me better than that.

Let me comment on the design. We actually looked at multiple consensus
algorithms and their implementations and came up with an abstractions that
suite the area in the most general way. Particularly, the call back from
agreement to update the application state is separated from the proposing
action is because it is more generic. With some implementations of Raft a
proposer can just wait when the agreement is made and then proceed with its
execution - synchronously. But with ZK you have to set a watcher and wait for a
callback acknowledging the event - asynchronously. So Asynchronous approach
wins as more generic.

If you take Bart we can organize pickup for participants from the near station.
Also we should have a dial up.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-10 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14057265#comment-14057265
]

Aaron T. Myers commented on HADOOP-10641:
-

Konst, you are overreacting here. As I said previously, I'm fine with you
proceeding with this on a development branch. That will give you an opportunity
to commit the coordination engine interface and start making progress on
HDFS-6469. If and when that materializes as a stable system that the community
wants to adopt into Hadoop, then we'll merge it back to trunk just like we've
done with many large features that are better accomplished via multiple JIRAs
and doing the work piecemeal. In particular, this comment doesn't make any
sense:

{quote}
Committing this to a development branch wouldn't make sense without you being
convinced or comfortable to have it merged to trunk once the work is done.
{quote}

The work on HDFS-6469 is not done, so how can I possibly know whether or not
I'll be convinced and comfortable to have it merged to trunk, when it is time
to merge the feature branch? My skepticism of this feature at this point is not
a good reason to commit it to trunk first without a development branch. I think
the exact opposite is the case: doing this on a feature branch is a way for you
to be able to demonstrate the benefits and prove out how low-risk the NN
changes are that are required for this work. I do not understand at all your
resistance to doing this work on a branch.

bq. Not sure which 3rd party system dependencies you see here. There are non
mentioned in the CNode design. And ZK is already a dependency for Hadoop HA.

Please give me a little more credit than this. Even though it may not be
mentioned in the design doc, it's fairly transparent that the primary goal of
this work is to introduce a plugin point for WANdisco's coordination engine
implementation into Hadoop.

I will try to make the meeting next week but at the moment my schedule does not
allow it. I will try to move things around, though. San Ramon is also a very
inconvenient location for me. Will there be a dial-in provided for those who
cannot attend in-person?

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-09 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14057049#comment-14057049
]

Konstantin Shvachko commented on HADOOP-10641:
--

??several unanswered or un-retracted objections??

* I did address the complexity issue in [the first paragraph of my reply to
Suresh|https://issues.apache.org/jira/browse/HDFS-6469?focusedCommentId=14021017page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14021017]
I cannot and probably should not address comfort levels of community members in
general. But I can and will gladly address technical issues should you raise
any.
These two jiras do introduce some concepts, which may be new to somebody (as
they were to me when I started the project). But distributed coordination is
the direction in which distributed systems are moving towards their maturity.
I'll just mention Google's Spanner and Facebook's HydraBase here as examples.
In my experience such concepts in fact simplify system architectures rather
than complicate them.
* I will address Todd's comment in HDFS-6469 in more details.

??the design does not add much (or perhaps any) benefit over a simpler solution
that builds on the current HA system in Hadoop??

* I discussed the alternative solution in [my reply to
Todd|https://issues.apache.org/jira/browse/HDFS-6469?focusedCommentId=14027179page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14027179],
see section on ActiveActive vs ActiveStanby HA.
This approach faces essentially the same problems as ConsensusNode, or opens
the same can of worms as the ConsensusNode in Todd's words. But CNode in the
end gives us all active NNs, rather than single active other RD-only standbys.
* Coordination opens an opportunity for geographically distributed HDFS, which
allows to scale file system across data centers.
* Coordination opens an opportunity for active-active Yarn.
* Coordination opens an opportunity for replicated regions in HBase.

??I'm concerned about baking in dependence on a proprietary 3rd party system
for HA capabilities??
Not sure which 3rd party system dependencies you see here. There are non
mentioned in the CNode design. And ZK is already a dependency for Hadoop HA.

??general agreement??
I really don't know how to answer to the rest of your comments, Aaron.
* You seem to have issues with the design of HDFS-6469, but did not present any
technical reasons there.
* You make HDFS-6469 a pre-condition for HADOOP-10641, but CNode implementation
cannot start without the CE interface.
* Committing this to a development branch wouldn't make sense without you being
convinced or comfortable to have it merged to trunk once the work is done.
* You do not give a clue on what would indicate a general agreement or what
would convince you that there is one.

We are hosting a [community meeting next
week|https://www.eventbrite.com/e/consensus-based-replication-in-hadoop-a-deep-dive-tickets-12158236613],
which was announce on the dev lists. The topics in the agenda include
technical discussion as well as the logistics of moving forward. Are you
available to talk about this issues at the meeting and potentially work out a
general agreement or a compromise?

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-08 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054589#comment-14054589
]

Konstantin Shvachko commented on HADOOP-10641:
--

I was thinking about ATM's suggestion of taking this as a separate project some
place other than Hadoop. The main problem with that is cross dependencies.
'Coordination Engine' project depends on types from Hadoop common, and HDFS
will depend on the CE project. Since Common and HDFS are packaged together this
will be hard to break unless CE project is a part of Common. Currently the
interface uses Configuration, logging, and looking forward it will be rather
tightly integrated with Hadoop RPC layer, since many RPC calls should trigger
coordination.

I think the patch looks good now. +1
I see how the interfaces can be the base for HDFS and HBase coordination. I
also see that ZKCoordinationEngine needs more work. I think once we start using
it we will see where it should evolve.
I propose to commit this upon Jenkins approval and if there are no objections.
This should free up the implementation of coordination for HBase and HDFS.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-08 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14055156#comment-14055156
]

Aaron T. Myers commented on HADOOP-10641:
-

I don't think we should be committing this to the trunk of Hadoop Common until
it's actually 100% clear that HDFS will have a dependency on it. As it stands,
there are a lot of objections in HDFS-6469 about adding this to the NN at all.

If you want to add it to a development branch in Hadoop then I won't stand in
the way of that, but I honestly believe you'll be able to make a lot more
headway more quickly if you go the route of putting this in a separate project.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-08 Thread Konstantin Boudnik (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14055383#comment-14055383
 ] 

Konstantin Boudnik commented on HADOOP-10641:
-

[~atm] HDFS-6469 doesn't have _a lot of objections_ really. It has some 
questions, but they are already pretty much covered, IMO, by [~shv]. So it 
doesn't looks like HDFS ticket is blocking the passage of this one.


 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-08 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14055505#comment-14055505
]

Aaron T. Myers commented on HADOOP-10641:
-

HDFS-6469 still has several unanswered or un-retracted objections, perhaps most
notably this one from Suresh:

{quote}
I am very uncomfortable about adding all this complexity into HDFS.
{quote}

and this one from Todd:

{quote}
Lastly, a fully usable solution would be available to the community at large,
whereas the design you're proposing seems like it will only be usably
implemented by a proprietary extension (I don't consider the ZK reference
implementation likely to actually work in a usable fashion).
{quote}

I personally share both of these concerns with Suresh and Todd, specifically
that this design seems overly-complex and does not add much (or perhaps any)
benefit over a simpler solution that builds on the current HA system in Hadoop,
and I'm concerned about baking in dependence on a proprietary 3rd party system
for HA capabilities.

My main point here is that this work (HADOOP-10641) will not be useful to
Hadoop except in the context of HDFS-6469. So, I'm not OK with committing this
to trunk, at least until there's general agreement that HDFS-6469 is a
reasonable design that we should move forward with in Hadoop. As it stands, I
don't think there is such agreement; I certainly have not been convinced of
this yet. I'm OK with you committing this to a development branch if you want
to try to make progress that way, though.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-04 Thread Rakesh R (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052297#comment-14052297
]

Rakesh R commented on HADOOP-10641:
---

Hi [~posix4e],
bq. zab or zk?
ZAB is an fresh project idea and this is in the initial phase. I'm not having
much details now. Please follow ZOOKEEPER-1931 to know more on this.
Motivation - there could be many use cases where you need a quorum based
replication. So fresh thought came up to define consensus algorithm (ZAB) more
cleanly so that the users can define their own data models and use ZAB to
replicate their own data.

Actually after seeing 'Coordination Engine' feature in HDFS, I thought of
introducing this new idea.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-03 Thread Alex Newman (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051959#comment-14051959
 ] 

Alex Newman commented on HADOOP-10641:
--

[~rakeshr] zab or zk?

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-27 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045813#comment-14045813
 ] 

Rakesh R commented on HADOOP-10641:
---

Thanks [~shv] for the interest on ZAB internship and if everything goes well 
will try use ZAB as a base for CoordinationEngine. I'll keep an eye on this 
JIRA to watch the progress.

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-13 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030312#comment-14030312
]

Konstantin Shvachko commented on HADOOP-10641:
--

Rakesh, thanks for the link. This is indeed in line with this effort.
The interface defined here is not dependent on the storage provided by
Zookeeper. It only needs the coordination piece in the form defined. So if you
implement stand-alone ZAB it could potentially be used as a base for
CoordinationEngine.
I see the work is just starting on github - hope the internship will be fun and
productive.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-12 Thread Andrey Stepachev (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029899#comment-14029899
]

Andrey Stepachev commented on HADOOP-10641:
---

[~zero45], good work.
just a couple of picky notes :)

org/apache/hadoop/coordination/zk/ZkCoordinationEngine.java:117
Not configured localNodeId can lead to uncontrolled creation of new zk
sessions.
Should code check that before creating ZooKeeper object?
'catch' should close zookeeper, if it was opened.

org/apache/hadoop/coordination/zk/ZkCoordinationEngine.java:132
split doesn’t take into account chrooted zk configurations.
it is much better to use org.apache.zookeeper.client.ConnectStringParser

org/apache/hadoop/coordination/zk/ZkCoordinationEngine.java:301
currentGSN is long, but treated as integer. easy to create overflow.

More serious problem, that zk sequential nodes are _integer_.
Code should guard sequence overflows over zero and handle that using
more then one parent znodes or zero crossing detection or other techniques
preventing integer seqence id overflow.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-11 Thread Rakesh R (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027601#comment-14027601
]

Rakesh R commented on HADOOP-10641:
---

bq.The intent of this jira is not to solve the general problem of distributed
consensus. That is, I do not propose to build an implementation of paxos or
other coordination algorithms here. This is only to introduce a common
interface, so that real implementations such as ZooKeeper could be plugged into
hadoop projects.

This sounds interesting. Thanks for the effort!. If I understood the discussion
correctly here, the idea is to build a quorum based replication. For example,
the events(I think this represents data) are submitted as proposals to a quorum
of nodes. In ZooKeeper terms, Leader proposes values to the Followers. Now
Leader wait for acknowledgements from a quorum of Followers before considering
a proposal committed. Also, Leader queues COMMIT(zxid) events to all Followers
so that all other nodes learn the events. This ensures that the events will be
reached to all nodes in the system. Adding one more point, in general ZK
provides strong ordering guarantees.

Sometime back ZooKeeper folks initiated discussions to decouple ZAB from
ZooKeeper, so that users can make use of this and can define their own models
and reliably replicate the data. There is a related JIRA ZOOKEEPER-1931 talks
similar feature, now this is in initial dev stage. Please have a look at this.
I hope this would help to define a common interface, also an opportunity for us
to know more about the use cases.

Regards,
Rakesh

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-10 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027111#comment-14027111
]

Konstantin Shvachko commented on HADOOP-10641:
--

there's not much expertise in Hadoop for the general problem of distributed
consensus

I just want to make sure we are on the same page here. The intent of this jira
is not to solve the general problem of distributed consensus. That is, I do not
propose to build an implementation of paxos or other coordination algorithms
here. This is only to introduce a common interface, so that real
implementations such as ZooKeeper could be plugged into hadoop projects.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-10 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027444#comment-14027444
]

Aaron T. Myers commented on HADOOP-10641:
-

bq. I just want to make sure we are on the same page here. The intent of this
jira is not to solve the general problem of distributed consensus. That is, I
do not propose to build an implementation of paxos or other coordination
algorithms here. This is only to introduce a common interface, so that real
implementations such as ZooKeeper could be plugged into hadoop projects.

Totally get that, but I think the point still remains that there's little
expertise for defining a common interface for coordination engines in general
in this project, and no real reason that the Hadoop project should necessarily
be the place where that interface is defined. The ZooKeeper project, a ZK
sub-project, or an entirely new TLP makes more sense to me.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-09 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025388#comment-14025388
]

Aaron T. Myers commented on HADOOP-10641:
-

bq. Aaron, a separate project is a great idea.

Cool, glad we agree. Shall we resolve this JIRA then and take this proposal to
the Apache Incubator? A standalone separate project seems like a much more
reasonable place to put this work to me.

bq. ZK is probably not the best home for the interface as ZKCoordinationEngine
is only one its implementations. People talked to me about possible
implementations using Raft, Bookeeper, Paxos, and two other engines. I think
the variety makes sense as there could be different performance, network,
reliability etc. requirements. Having said that, I am flexible if ZK community
wants to adopt it.

I was only suggesting ZK since that project's focus is about reliable
distributed coordination, which is not what Hadoop aims to do at all. If for
some reason you didn't want to try to make this a TLP (which, again, seems more
reasonable to me) then trying to contribute it to that project makes a lot more
sense to me.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-09 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026015#comment-14026015
]

Konstantin Shvachko commented on HADOOP-10641:
--

reliable distributed coordination, which is not what Hadoop aims to do at all.

Not sure where this leaves QJM. I thought it satisfies all these requirements
including the Hadoop's aim.
Anyways, we have it, and we also do distributed coordination between active and
standby NNs, RMs.

Hadoop common is chosen for the coordination engine interface as the lowest
common ancestor. It could be used for anything Hadoop from here: HDFS, HBase,
Yarn. I don't have other use cases for it in my mind at the moment. Did not
have enough experience with TLPs and Incubators, but thought it needs like
something bigger. Say, in this case, implementations for more than one CE.
Sorry if I misunderstood you, but if you want to take it to Apache Incubator
I'll be on your side.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-09 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026025#comment-14026025
]

Aaron T. Myers commented on HADOOP-10641:
-

bq. Not sure where this leaves QJM. I thought it satisfies all these
requirements including the Hadoop's aim.

The NameNode QuorumJournalManager and JNs are expressly for storing HDFS NN
edit logs. Not for general purpose consensus, not for use by other projects
like HBase, etc.

bq. Hadoop common is chosen for the coordination engine interface as the lowest
common ancestor.

But a separate project could just as well be a common ancestor, just like both
Hadoop and HBase separately depend on ZooKeeper. There's no actual need for it
to be in Hadoop Common if HBase is to use it.

bq. It could be used for anything Hadoop from here: HDFS, HBase, Yarn.

But seems like it could also be used for arbitrary, non-Hadoop things, correct?
If so, why put it in Hadoop?

bq. I don't have other use cases for it in my mind at the moment. Did not have
enough experience with TLPs and Incubators, but thought it needs like something
bigger. Say, in this case, implementations for more than one CE. Sorry if I
misunderstood you, but if you want to take it to Apache Incubator I'll be on
your side.

I personally don't think there's any good reason for this to start out as part
of a larger project, and honestly think there are several downsides. For
example, Hadoop's release cadence is too slow for a new project like this,
there's not much expertise in Hadoop for the general problem of distributed
consensus, possible desire for other non-Hadoop projects to want to use it, etc.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-08 Thread Lohit Vijayarenu (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021117#comment-14021117
]

Lohit Vijayarenu commented on HADOOP-10641:
---

Minor comments.
- It looks like checkQuorum is kind of noop for submitProposal in ZK based
implementation, since zooKeeper.create would fail if there is no quorum anyways?
- In ZK based Coordination Engine implementation, how are ZNodes cleaned up?
Looking at patch each proposal creates PERSISTENT_SEQUENTIAL, but no mention of
cleanup.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-08 Thread Plamen Jeliazkov (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021473#comment-14021473
]

Plamen Jeliazkov commented on HADOOP-10641:
---

Hi Lohit, thanks for your comments!
# checkQuorum is an optimization some coordination engines may choose to
implement in order to fail-fast to client requests. In the NameNode case, if
quorum loss was suspected, that NameNode could start issuing StandbyExceptions.
# You are correct that the ZKCoordinationEngine does not implement ZNode
clean-up currently. That is because it was made as a proof of concept for the
CoordinationEngine API. Nonetheless, proper clean-up can be implemented. All
one has to do is delete the ZNodes that everyone else has already learned about.
## Suppose you have Node A, B, and C, and Agreements 1, 2, 3, 4, and 5.
## Node A and B learn Agreement 1 first. Node C is a lagging node. A B
contain 1. C contains nothing.
## Node A and B continue onwards, learning up to Agreement 4. A B contain 1,
2, 3, and 4 now. C contains nothing.
## Node C finally learns Agreement 1. A B contain 1, 2, 3, and 4 now. C
contains 1.
## We can now discard Agreement 1 from persistence because we know that all the
Nodes, A, B, and C, have safely learned about and applied Agreement 1.
## We can apply this process for all other Agreements.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-07 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020942#comment-14020942
]

Konstantin Shvachko commented on HADOOP-10641:
--

I think that the latest patch is a better example on how the coordination
engine should work: it accepts a proposal, then agrees on its order, then
invokes Agreement.execute() to trigger the learner state update. This is an
initial patch and is a working progress. We should also introduce a generic
Proposer, which submits a proposal and then waits for agreement to be executed
in order to reply to the client.
Nicholas, good points:
- We chose Serializable interface to keep serialization as generic as possible,
as Plamen mentioned. We routinely define readObject(), writeObject() for our
proposals using protobuf. Actually in many cases we avoid extra serialization
by passing already serialized requests directly from RPC to CoordinationEngine.
- Not sure if it is exactly what you mean by the version of coordination
engine. I see two versions here:
-# CE software version. This should be tracked in the implementation of CE. We
assume there could be different implementations of CE. If one implementation
changes it should not effect other.
-# The version of a set of Proposals / Agreements. This should be tracked on
the application level. The set of proposals / agreements reflects a particular
application logic and is orthogonal to the engine implementation.
- Rolling upgrade is divided into two parts. Rolling upgrade of the application
(e.g. HDFS) and rolling upgrade of the engine itself. If both support it then
everything is good. If CE does not support RU, but the application does, then
we should still be able to upgrade the application in the rolling manner but
without upgrading CE. CE in this case will continue assigning GSNs and produce
agreements.
- Proposal and Agreement are made as generic as possible. Proposal is formally
empty, but it has everything it needs. One need to define equals(), hashCode(),
and serialization. We also routinely implement toString() so that CE could
print proposals in the logs.
Agreement has one method execute(), so that CE could make a call back to update
the application state. It also has two generic parameters: first defines the
application type, the second - the return type of the execute() method, which
is intended to be returned to the client.
ConsensusProposal from the patch combines Proposal and Agreement. This is
probably the most typical use case. But there could be Proposals that don't
assume Agreements, like control messages to the CE, and Agreements, which are
different from corresponding Proposals, or even agreements without proposals,
like control commands from CE to the application.
- ZKCoordinationEngine should not be abstract. Made the same comment to Plamen.
- Documentation for ZKCoordinationEngine makes sense.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-07 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020947#comment-14020947
]

Konstantin Shvachko commented on HADOOP-10641:
--

Aaron, a separate project is a great idea. Although I am not sure right now how
it will evolve. It may spin off from hadoop, like hadoop did from nutch. ZK is
probably not the best home for the interface as ZKCoordinationEngine is only
one its implementations. People talked to me about possible implementations
using Raft, Bookeeper, Paxos, and two other engines. I think the variety makes
sense as there could be different performance, network, reliability etc.
requirements.
Having said that, I am flexible if ZK community wants to adopt it.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-06 Thread Konstantin Shvachko (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020168#comment-14020168
]

Konstantin Shvachko commented on HADOOP-10641:
--

# NOTICE and LICENSE files are not needed as they are already contained in the
hadoop-common.
# MiniZooKeeperCluster should use [ClientBase
methods|http://svn.apache.org/viewvc/zookeeper/branches/branch-3.4/src/java/test/org/apache/zookeeper/test/ClientBase.java?view=markup]
(like setupTestEnv() and waitForServer*()) instead of copying them.
# Your SampleLearner is a good example of how it should not be used.
SampleLearner represents the state of the server, like namespace or a region.
Learner should be the first generic parameter of the Agreement, so that
Agreement.execute() could call the update methods of the Learner. Not the other
way around as in the example provided.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-06 Thread Tsz Wo Nicholas Sze (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020284#comment-14020284
]

Tsz Wo Nicholas Sze commented on HADOOP-10641:
--

Looked at the patch. Some questions:
- For serialization, I think we should use ProtoBuf but not
java.io.Serializable. Then, it is easier to support compatibility.
- It seems that there is no version check between CoordinationEngine(s). How
to detect if the engines are compatible with each other? Also, how to support
rolling upgrade?
- Do we really need Proposal and Agreement as two interfaces? Currently,
Proposal extending Serializable is an empty interface and Agreement also
extending Serializable only has one method.
- I also don't understand why Agreement.execute(..) needs a callBackObject
(learner?) parameter. The test does not make use of it.
- ZkCoordinationEngine also has a doExecute(..) method. Could it be combined
with Agreement.execute(..)? I suspect we can get rid of
ZkCoordinationEngine.doExecute(..) in order to make ZkCoordinationEngine
non-abstract. Then, applications like namenode only have to use the engine but
not extend it.
- In the CNode design doc, the Coordination Engine section describes some
general concepts and some details for the Paxos algorithm. Could you also add
some details for the ZkCoordinationEngine? Also, what are the advantages and
disadvantages of the ZkCoordinationEngine compared with the Paxos algorithm?

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-06 Thread Aaron T. Myers (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020308#comment-14020308
]

Aaron T. Myers commented on HADOOP-10641:
-

Hey Konstantin and Plamen, have y'all given any thought to contributing the
Coordination Engine somewhere other than the Hadoop project? Sounds like it's a
pretty general-purpose system that other projects that have nothing to do with
Hadoop might want to use. Seems to me like it might reasonably be a separate
top-level Apache project, which Hadoop and HBase (and perhaps others) could
then depend on. It might also make sense for it to be a sub-project of the
ZooKeeper project, much like BookKeeper is.

With something this new that you presumably want to iterate on quickly, seems
like a shame to have to wait around for a Hadoop release to be able to pick up
an updated Coordination Engine.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-06 Thread Konstantin Boudnik (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020341#comment-14020341
 ] 

Konstantin Boudnik commented on HADOOP-10641:
-

Good point [~atm]! While other projects might take an advantage of coordinated 
intent approach, the HDFS and HBase are the only two right now that are 
focusing on real HA with low MTTR. I am just a bit hesitant of 
over-generalizing things ;)

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch


 Coordination Engine (CE) is a system, which allows to agree on a sequence of 
 events in a distributed system. In order to be reliable CE should be 
 distributed by itself.
 Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
 zab) and have different implementations, depending on use cases, reliability, 
 availability, and performance requirements.
 CE should have a common API, so that it could serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-06 Thread Plamen Jeliazkov (JIRA)

[
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020426#comment-14020426
]

Plamen Jeliazkov commented on HADOOP-10641:
---

[~szetszwo], thanks for the review! It was nice to finally see a face at the
Summit as well. :)
[~atm], thanks for the comments! I think I am outside of that discussion, most
likely [~cos] or [~shv] can comment better on where to take the project.

I posted a new patch around the same time your review came in; there were
mistakes in the way agreement executions work.
* ProtoBuf is certainly a nice choice for serialization. However, we shouldn't
need to bind ourselves to any one serialization format. This is why we use
Serializable. It is certainly possible to have the writeObject call write out a
ProtoBuf of the proposal itself, for example, and read the values back using
ProtoBuf as well. This is feasible with the current interfaces.
* Good point on version compatibility. AFAIK, version compatibility would take
place once the quorum is established as prior to that there is no communication
between the engines. So the coordination engine, as part of bootstrap, should
perform a version check against its quorum peers. Perhaps this means extending
the API, or making it part of a larger interface?
(VersionedCoordinationEngine)? [~shv] might be able to comment better.
* Please see my new patch. The idea is indeed to make the agreement execute on
some callBack object, SampleLearner in this case. The new patch should show the
test making use of it.
* Yes we can probably do some refactoring here. I'll work on a new patch.
* Yes we can add details for ZkCoordinationEngine. Unsure of any clear
advantages and disadvantages. The only thing that comes to my mind right away
is that it may be possible to build Paxos directly into the CoordinationEngine
implementation, thus co-locating the coordination service with the server /
application itself, rather than having to make RPC calls and wait for
responses, like with ZooKeeper(s). I don't think the intent of this work is
really to compare any one coordination mechanism with another but so much as
provide a common interface for which one can implement whichever they prefer.

Introduce Coordination Engine
-

--
This message was sent by Atlassian JIRA
(v6.2#6252)

55 matches

Mail list logo