[
https://issues.apache.org/jira/browse/ATLAS-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744923#comment-14744923
]
Shwetha G S edited comment on ATLAS-58 at 9/15/15 6:56 AM:
-----------------------------------------------------------
The Hive hook sends notification messages, each carrying a list of entities.
The notification consumer on the server side consumes these messages and
registers the entities. The server de-duplicates entities based on each
entity's unique attribute.
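
The flow described above could be sketched roughly as follows. This is only
an illustration, not the code in the patch: an in-memory queue stands in for
the messaging layer, and the names hook_send, consume_all and the entity
fields are hypothetical.

```python
import json
import queue

# Hypothetical stand-in for the messaging layer (not the real transport).
notifications = queue.Queue()


def hook_send(entities):
    """Hook side: publish one message that carries a list of entities."""
    notifications.put(json.dumps(entities))


def consume_all(register):
    """Server side: drain pending messages and register every entity."""
    count = 0
    while not notifications.empty():
        for entity in json.loads(notifications.get()):
            register(entity)
            count += 1
    return count


registered = []
hook_send([
    {"type": "hive_db", "name": "sales"},
    {"type": "hive_table", "name": "sales.orders"},
])
processed = consume_all(registered.append)
```

Here a single message carries both entities, and the consumer registers them
one by one in the order they were sent.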
Big changes:
1. Concept of services that are started and stopped when Atlas starts and stops
2. De-duping of entities on the server based on any unique attribute of the
entity. If an entity doesn't have any unique attribute, de-duping is not done
and a new entity is created
3. Changed entity submit API to take list of entities instead of just 1 entity
(required for hive hook) - backward incompatible
4. Moved submit and list from EntityResource to EntitiesResource - backward
incompatible
5. Moved security tests from integration tests to unit tests - as they were
creating issues with server start as jetty already starts another server for
integration tests
6. Removed some duplicate tests from repository module (the same tests exist in
typesystem module as well)
7. In webapp ITs, re-used the types defined
8. Hive hook now sends notifications instead of registering entities directly.
Notifications are sent synchronously, which adds to hive command execution
delay but also makes the hook reliable
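
As an illustration of items 2 and 3, here is a minimal sketch of a list-based
submit with de-duping on a unique attribute. It is an assumption-laden toy,
not the Atlas implementation: EntityStore, the "name" attribute and the id
format are all hypothetical.

```python
class EntityStore:
    """Toy store: de-dupes on a per-type unique attribute, if one exists."""

    def __init__(self, unique_attrs):
        # Hypothetical mapping: type name -> name of its unique attribute.
        self.unique_attrs = unique_attrs
        self.by_key = {}    # (type, attribute, value) -> entity id
        self.entities = {}  # entity id -> entity
        self._next = 0

    def submit(self, entities):
        """Take a list of entities and return one id per submitted entity."""
        return [self._create_or_get(e) for e in entities]

    def _create_or_get(self, entity):
        attr = self.unique_attrs.get(entity["type"])
        key = None
        if attr is not None and attr in entity:
            key = (entity["type"], attr, entity[attr])
            if key in self.by_key:
                # Same unique value seen before: reuse the existing entity.
                return self.by_key[key]
        # No unique attribute, or first occurrence: create a new entity.
        self._next += 1
        entity_id = f"id-{self._next}"
        self.entities[entity_id] = entity
        if key is not None:
            self.by_key[key] = entity_id
        return entity_id


store = EntityStore({"hive_table": "name"})
ids = store.submit([
    {"type": "hive_table", "name": "sales.orders"},
    {"type": "hive_table", "name": "sales.orders"},  # duplicate by name
    {"type": "note", "text": "x"},                   # type has no unique attribute
    {"type": "note", "text": "x"},                   # so this creates a new entity
])
```

The two hive_table entries resolve to the same id, while the two note entries
get separate ids because their type declares no unique attribute.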
Pending:
1. Entity updates like alter table commands are not handled. Will create
another jira for this
2. Webapp jetty plugin doesn't shut down embedded kafka at the end of
integration tests. So, hive bridge ITs fail. Hive bridge ITs pass if run on
their own. Still checking on this
> Make hive hook reliable
> -----------------------
>
> Key: ATLAS-58
> URL: https://issues.apache.org/jira/browse/ATLAS-58
> Project: Atlas
> Issue Type: Sub-task
> Reporter: Shwetha G S
> Assignee: Shwetha G S
> Labels: incompatible
> Fix For: trunk
>
> Attachments: ATLAS-58-v2.patch, ATLAS-58.patch
>
>
> Currently, the hive hook executes in a background thread pool and is a
> best-effort approach to registering entities. But this needs to be reliable
> for data governance to be effective.
> One way is: in the hive hook, add the entities to some messaging framework,
> and the atlas server can read the entities from the messages and register
> them in atlas. Since posting a message is faster, we can do it synchronously
> and hence get reliable entity registration.
> We can start with kafka for messaging, but any other messaging framework
> should be pluggable.
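
One way to picture the pluggable, synchronous design proposed in the
description is the sketch below. It is a hypothetical illustration only:
NotificationSender, InMemorySender, FailingSender and run_hook are made-up
names, and no real messaging client is used.

```python
from abc import ABC, abstractmethod


class NotificationSender(ABC):
    """Hypothetical plug-in point: one implementation per messaging framework."""

    @abstractmethod
    def send(self, entities):
        """Deliver the entities; raise ConnectionError if delivery fails."""


class InMemorySender(NotificationSender):
    """Test double that keeps delivered entities in a list."""

    def __init__(self):
        self.delivered = []

    def send(self, entities):
        self.delivered.extend(entities)


class FailingSender(NotificationSender):
    """Test double that simulates an unreachable broker."""

    def send(self, entities):
        raise ConnectionError("broker unavailable")


def run_hook(sender, entities):
    """Send synchronously, so the caller learns at once whether it worked."""
    try:
        sender.send(entities)
        return True
    except ConnectionError:
        return False


good = InMemorySender()
ok = run_hook(good, [{"type": "hive_table", "name": "sales.orders"}])
failed = run_hook(FailingSender(), [{"type": "hive_table", "name": "t"}])
```

Because the send happens in the calling thread, a failure surfaces to the
hook immediately instead of being lost in a background pool; swapping the
messaging framework means supplying another NotificationSender.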