[ 
https://issues.apache.org/jira/browse/ATLAS-511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184343#comment-15184343
 ] 

Hemanth Yamijala commented on ATLAS-511:
----------------------------------------

[~cassiodossantos], thank you for looking through the document and sharing your 
comments.

For your first point, I did initially have the Solr 5 index step listed in 
bold. During internal review, however, [~shwethags] clarified that it is OK to 
run this initialization on all instances. Looking through the code, all that 
happens is that we tell Titan that the Solr 5 index implementation class 
({{Solr5Index}}) is mapped to the name "solr5". This seems safe. I might rename 
this method to make it clearer that it does not modify the index itself.

Regarding your second point, I agree that as more types are added to the 
typesystem, the load time will increase. Beyond that, there could be other 
issues such as unbounded memory growth. Of all the initialization steps, this 
is currently probably the only one that demands attention. A simple approach I 
can think of is to make this an opportunistic cache that reaches back to the 
backend stores when a type is not found in memory. This would let us load 
lazily (or maybe asynchronously). It could also help us later to bound the 
cache size and so limit memory usage. I think this is what you meant by 
"on-demand type cache initialization".
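
To make the idea concrete, here is a minimal sketch of such an opportunistic cache. It is only an illustration under stated assumptions, not the Atlas implementation: the class name {{LazyTypeCache}}, the {{backendLookup}} function and the LRU eviction policy are all hypothetical, and the real type system would need its own loading and invalidation logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/**
 * Hypothetical sketch of a bounded, read-through cache for type definitions.
 * Entries are loaded from the backend store only on a cache miss, and the
 * least recently used entry is evicted once maxEntries is exceeded.
 */
class LazyTypeCache<K, V> {
    // Assumed stand-in for a lookup against the backend stores.
    private final Function<K, Optional<V>> backendLookup;
    private final Map<K, V> cache;

    LazyTypeCache(final int maxEntries, Function<K, Optional<V>> backendLookup) {
        this.backendLookup = backendLookup;
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, falling back to the backend on a miss. */
    synchronized Optional<V> get(K name) {
        V cached = cache.get(name);
        if (cached != null) {
            return Optional.of(cached);
        }
        // Miss: reach back to the backend store and remember the result.
        // For simplicity the lock is held during the lookup.
        Optional<V> loaded = backendLookup.apply(name);
        loaded.ifPresent(value -> cache.put(name, value));
        return loaded;
    }

    /** Number of entries currently held in memory. */
    synchronized int size() {
        return cache.size();
    }
}
```

With something along these lines, server startup would not need to load every type up front, and the {{maxEntries}} bound would keep memory usage from growing with the number of types.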

In the interest of keeping changes (and reviews) manageable, I would prefer to 
implement the second change in a separate JIRA from this one.

> Ability to run multiple instances of Atlas Server with automatic failover to 
> one active server
> ----------------------------------------------------------------------------------------------
>
>                 Key: ATLAS-511
>                 URL: https://issues.apache.org/jira/browse/ATLAS-511
>             Project: Atlas
>          Issue Type: Sub-task
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>         Attachments: HADesign.pdf
>
>
> One of the most important components that only supports active-standby mode 
> currently is the Atlas server which hosts the API / UI for Atlas. As 
> described in the [HA 
> Documentation|http://atlas.incubator.apache.org/0.6.0-incubating/HighAvailability.html],
>  we currently are limited to running only one instance of the Atlas server 
> behind a proxy service. If the running instance goes down, a manual process 
> is required to bring up another instance.
> In this JIRA, we propose to have an ability to run multiple Atlas server 
> instances. However, as a first step, only one of them will be actively 
> processing requests. To have a consistent terminology, let us call that 
> server the *master*. Any requests sent to the other servers will be 
> redirected to the master.
> When the master suffers a partition, one of the other servers must 
> automatically become the master and start processing requests. What this mode 
> brings us over the current system is the ability to automatically failover 
> the Atlas server instance without any manual intervention. Note that this 
> can be arguably called an [active/active 
> setup|https://en.wikipedia.org/wiki/High-availability_cluster].
> ATLAS-488 was raised to support multiple active Atlas server instances. While 
> that would be ideal, we have to learn more about the underlying system 
> behavior before we can get there, and hopefully we can take smaller steps to 
> improve the system systematically. The method proposed here is similar to 
> what is adopted in many other Hadoop components including HDFS NameNode, 
> HBase HMaster etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
