[
https://issues.apache.org/jira/browse/AURORA-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220111#comment-14220111
]
Maxim Khutornenko commented on AURORA-942:
------------------------------------------
The only concern I have about this approach is the inherent ZK limit on node
size [1] of 1MB. I can see that limit being exceeded when a TaskConfig with a
large ExecutorConfig is stored. Addressing AURORA-540 may help us work around
this limitation, but then the problem shifts to where the ExecutorConfig is
going to be stored.
[1] - https://zookeeper.apache.org/doc/r3.1.2/api/org/apache/zookeeper/ZooKeeper.html#create(java.lang.String, byte[], java.util.List, org.apache.zookeeper.CreateMode)
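
To make the size concern concrete, here is a rough sketch of one possible
workaround: checking a serialized entry against the node limit and splitting
it across several nodes when it does not fit. This is only an illustration,
not something Aurora or ZooKeeper provides. The class ZnodeChunker and its
methods are hypothetical names, and the limit constant assumes ZooKeeper's
default jute.maxbuffer setting of 0xfffff bytes. That setting bounds the whole
request, so a real implementation would need to leave headroom for the path
and other request fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits a serialized log entry into node-sized chunks. */
final class ZnodeChunker {
  // Assumption: ZooKeeper's default jute.maxbuffer (just under 1 MB).
  public static final int DEFAULT_MAX_ZNODE_BYTES = 0xfffff;

  private ZnodeChunker() {}

  /** Returns true if the payload fits in a single node of the given capacity. */
  public static boolean fitsInOneNode(byte[] payload, int maxBytes) {
    return payload.length <= maxBytes;
  }

  /** Splits the payload into consecutive chunks of at most maxBytes each. */
  public static List<byte[]> split(byte[] payload, int maxBytes) {
    if (maxBytes <= 0) {
      throw new IllegalArgumentException("maxBytes must be positive");
    }
    List<byte[]> chunks = new ArrayList<>();
    int off = 0;
    while (off < payload.length) {
      int len = Math.min(maxBytes, payload.length - off);
      chunks.add(Arrays.copyOfRange(payload, off, off + len));
      off += len;
    }
    return chunks;
  }

  /** Reassembles chunks, in order, into the original payload. */
  public static byte[] join(List<byte[]> chunks) {
    int total = 0;
    for (byte[] chunk : chunks) {
      total += chunk.length;
    }
    byte[] out = new byte[total];
    int off = 0;
    for (byte[] chunk : chunks) {
      System.arraycopy(chunk, 0, out, off, chunk.length);
      off += chunk.length;
    }
    return out;
  }
}
```

Each chunk could then be written as its own node. Note that an empty payload
yields no chunks in this sketch, and that writing one entry as several nodes
is no longer a single atomic create, so the writer would have to handle
partially written entries.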
> Explore using a replicated log on top of ZooKeeper
> --------------------------------------------------
>
> Key: AURORA-942
> URL: https://issues.apache.org/jira/browse/AURORA-942
> Project: Aurora
> Issue Type: Task
> Components: Scheduler
> Reporter: Bill Farner
> Priority: Minor
>
> The scheduler uses the replicated log implementation provided by Mesos
> (native libmesos.so). It would be interesting to compare this against a
> replacement that allows us to:
> - shed code to implement backups and recovery
> - remove one use of a dynamically-linked native library
> - use a store that allows non-leaders to read, for faster recovery and
> serving from non-active members
> - avoid the need for periodic failover (we currently have to do this to
> induce compaction in LevelDB and minimize log replay time)
> At first glance, it seems like it would be relatively straightforward to come
> up with a Log implementation [1] that persists transactions as nodes in
> ZooKeeper. This would enable all the above results.
> [1]
> https://github.com/apache/incubator-aurora/blob/10da38a3a0ad6ebbee055c26adc3ed3437ec3930/src/main/java/org/apache/aurora/scheduler/log/Log.java#L26