Re: Review Request 65790: Data Migration Utility: Export

2018-03-04 Thread Ashutosh Mestry

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65790/
---

(Updated March 5, 2018, 5:39 a.m.)


Review request for atlas, Apoorv Naik, Madhan Neethiraj, and Ruchi Solani.


Changes
---

Updates include: 
- Addressed review comments. 
- Attached latest migration utility.


Bugs: ATLAS-2461
https://issues.apache.org/jira/browse/ATLAS-2461


Repository: atlas


Description
---

**Background**
The data migration utility allows for exporting data from an Atlas instance 
without the server running. This is important as it will prevent Atlas from 
processing any requests.

**Approach**
The migration is a new utility that will perform export of data using a 
pre-defined export request file.

The approach used by this application:
- Create an application context (_migrationContext.xml_). This context 
prevents instantiation of a number of classes, most notably _webapp_, 
_notifications_, and _listeners_.
- Create _ImportService_ using Spring's framework classes.

Here are the pieces:
- _MigrationApp_: Contains _main_. It is the entry point for the app.
- _Exporter_: Contains plumbing needed to use the new _migrationContext.xml_ 
and create _ApplicationContext_ for use.
- Degenerate classes _EmptyNotification_, _EmptyNotificationChangeListener_. 
These are necessary to launch the application in a way that no notifications 
are sent out.
- _atlas_migration.py_: Creates the environment for the app to execute and 
launches the application.
- _migration-export-request.json_: This can be modified for environments that 
do not have the latest improvements to _Export_.
- _README_: Instructions on usage.
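
The role of the degenerate notification classes can be illustrated with a minimal sketch of the null-object pattern. This is written in Python purely for illustration; the utility's classes are Java, and every name below (`Notification`, `NoOpNotification`, `export_entities`) is a hypothetical stand-in rather than the code under review.

```python
class Notification:
    """Hypothetical interface: a real implementation would publish messages."""

    def send(self, messages):
        raise NotImplementedError


class NoOpNotification(Notification):
    """Degenerate implementation: accepts messages and discards them."""

    def __init__(self):
        self.dropped = 0  # counts discarded messages, for demonstration only

    def send(self, messages):
        # Nothing is published; we only record how many messages were dropped.
        self.dropped += len(messages)


def export_entities(entities, notifier):
    """Pretend export loop that notifies once per exported entity."""
    exported = []
    for entity in entities:
        exported.append(entity)
        notifier.send(["exported " + entity])
    return exported


notifier = NoOpNotification()
result = export_entities(["table_a", "table_b"], notifier)
```

The export code path stays unchanged: it still calls `send`, but the wired-in implementation guarantees that nothing leaves the process.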

**Build**
The project uses Maven's assembly plugin. The build creates a ZIP file 
with the necessary files.

_mvn install package_ generates a ZIP file in _target_ directory.

**Usage (from README)**
- Use Ambari to shut down Atlas.
- Unzip contents of the ZIP to a directory on the server using: 'unzip 
atlas-migration-kit.zip'
- cd atlas-migration-kit
- Run 'python atlas_migration.py'. Use 'python atlas_migration.py 
' to export to a specific file.
- To watch the progress: 'tail -f /var/log/atlas/application.log'.
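
As a rough sketch of what a launcher along the lines of _atlas_migration.py_ might do, the snippet below assembles a Java command line with an optional output-file argument. The directory layout (`conf`, `lib`), the main class name `org.example.MigrationMain`, and the function name are all assumptions made for illustration; the attached script may work differently.

```python
import os


def build_export_command(kit_dir, main_class, output_file=None, java_bin="java"):
    """Assemble a Java command line for a hypothetical export launcher."""
    # Assumed layout: configuration under conf/, jars under lib/.
    classpath = os.pathsep.join([
        os.path.join(kit_dir, "conf"),
        os.path.join(kit_dir, "lib", "*"),
    ])
    command = [java_bin, "-cp", classpath, main_class]
    if output_file is not None:
        # Pass an absolute path so the result does not depend on the JVM's
        # working directory.
        command.append(os.path.abspath(output_file))
    return command


cmd = build_export_command(
    "atlas-migration-kit", "org.example.MigrationMain", "export.zip"
)
```

The resulting list could be handed to `subprocess.call(cmd)`; when no output file is given, the application would fall back to whatever default it defines.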


Diffs (updated)
-

  pom.xml 7db1be78
  tools/atlas-migration-utility/pom.xml PRE-CREATION
  tools/atlas-migration-utility/src/assembly/bin.xml PRE-CREATION
  tools/atlas-migration-utility/src/main/java/org/apache/atlas/migration/Exporter.java PRE-CREATION
  tools/atlas-migration-utility/src/main/java/org/apache/atlas/migration/NoOpNotification.java PRE-CREATION
  tools/atlas-migration-utility/src/main/java/org/apache/atlas/migration/NoOpNotificationChangeListener.java PRE-CREATION
  tools/atlas-migration-utility/src/main/resources/README PRE-CREATION
  tools/atlas-migration-utility/src/main/resources/atlas_migration.py PRE-CREATION
  tools/atlas-migration-utility/src/main/resources/migration-export-request.json PRE-CREATION
  tools/atlas-migration-utility/src/main/resources/migrationContext.xml PRE-CREATION


Diff: https://reviews.apache.org/r/65790/diff/5/

Changes: https://reviews.apache.org/r/65790/diff/4-5/


Testing
---

**Unit tests**
None.

**Functional tests**
Exports from existing clusters.


File Attachments (updated)


Migration Utility
  
https://reviews.apache.org/media/uploaded/files/2018/03/05/4d42eda3-cba8-441d-8505-12271cffc569__atlas-migration-kit-0.8.3-SNAPSHOT-bin.zip


Thanks,

Ashutosh Mestry



Re: Review Request 65885: ATLAS-2470 - Add JanusGraph Cassandra support to Atlas

2018-03-04 Thread Pierre Padovani


> On March 4, 2018, 11:07 a.m., David Radley wrote:
> > My review comments are from my initial look at the code; I hope to try 
> > running this patch to verify it works for me
> 
> Pierre Padovani wrote:
> I'll update the patch with the above changes today or tomorrow as I have 
> time.

I've updated ATLAS-2470 with a new patch that should fix all of these 
issues.


- Pierre


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65885/#review198593
---


On March 2, 2018, 4:51 p.m., Pierre Padovani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65885/
> ---
> 
> (Updated March 2, 2018, 4:51 p.m.)
> 
> 
> Review request for atlas and David Radley.
> 
> 
> Bugs: ATLAS-2470
> https://issues.apache.org/jira/browse/ATLAS-2470
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> This updates the POMs to add the needed Cassandra jars, and adds a dist 
> profile embedded-cassandra-solr.
> 
> 
> Diffs
> -
> 
>   distro/pom.xml 0103bef6 
>   distro/src/bin/atlas_config.py e6415cf4 
>   distro/src/bin/atlas_start.py 39be6b7c 
>   distro/src/bin/atlas_stop.py 66edd904 
>   distro/src/conf/cassandra.yml PRE-CREATION 
>   distro/src/conf/zookeeper/log4j.properties PRE-CREATION 
>   distro/src/conf/zookeeper/zoo.cfg PRE-CREATION 
>   distro/src/main/assemblies/standalone-package.xml 1881082e 
>   docs/src/site/twiki/InstallationSteps.twiki 6b9f0313 
>   graphdb/janus/pom.xml 143b775f 
> 
> 
> Diff: https://reviews.apache.org/r/65885/diff/1/
> 
> 
> Testing
> ---
> 
> Full build with the new embedded-cassandra-solr, and testing to make sure 
> Atlas comes up and is functional.
> 
> Be aware that we have been running Cassandra-backed JanusGraph for months 
> with no issues.
> 
> 
> Thanks,
> 
> Pierre Padovani
> 
>



[jira] [Commented] (ATLAS-2470) Basic support for Cassandra

2018-03-04 Thread Pierre Padovani (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385407#comment-16385407
 ] 

Pierre Padovani commented on ATLAS-2470:


Added a new patch version with changes from the code review.

> Basic support for Cassandra
>
> Key: ATLAS-2470
> URL: https://issues.apache.org/jira/browse/ATLAS-2470
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 1.0.0
> Reporter: Pierre Padovani
> Assignee: Pierre Padovani
> Priority: Major
> Fix For: 1.0.0
>
> Attachments: ATLAS-2470-1.patch, ATLAS-2470.patch
>
>
> Add the basic build support, and ability to run embedded Cassandra and solr 
> using a profile build:
> -Pdist,embedded-cassandra-solr



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ATLAS-2470) Basic support for Cassandra

2018-03-04 Thread Pierre Padovani (JIRA)

 [ 
https://issues.apache.org/jira/browse/ATLAS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Padovani updated ATLAS-2470:
---
Attachment: ATLAS-2470-1.patch



Re: Review Request 65885: ATLAS-2470 - Add JanusGraph Cassandra support to Atlas

2018-03-04 Thread Pierre Padovani


> On March 4, 2018, 11:04 a.m., David Radley wrote:
> > distro/pom.xml
> > Lines 258 (patched)
> > 
> >
> > I see that you are downloading Zookeeper. Many components require 
> > zookeeper - so it may already exist on the system. I think for production 
> > you would want to run with an external Zookeeper. 
> > 
> > I suggest we allow this as a build option. I realise this jira is for 
> > the embedded Cassandra build only - but I could envisage you needing 
> > non-embedded cassandra builds as well; I thought I would suggest this here 
> > - if this is a requirement for you, you may decide you want to address 
> > this in a separate Jira.

Yes, this was meant only for the embedded side of things, which would typically 
only happen for a development installation or a Docker image. My intention is 
to address setup and config for cassandra in a production environment in the 
documentation. I have another task specific to updating the documentation 
around this. My hope was to get this merged to master first, then I can get the 
Dockerfile built out that supports this configuration in another task.


> On March 4, 2018, 11:04 a.m., David Radley wrote:
> > distro/pom.xml
> > Lines 302 (patched)
> > 
> >
> > for consistency should we have  as well?

You cannot just have cassandra without a solr/elasticsearch install unless you 
are using DSE (DataStax Enterprise, which integrates with solr). Based on a 
cursory reading over on JanusGraph, they do not recommend production 
deployments of JanusGraph with Cassandra embedded, as the performance profiles 
are not entirely predictable. Again, I would address this in the main 
documentation task.


> On March 4, 2018, 11:04 a.m., David Radley wrote:
> > distro/src/bin/atlas_start.py
> > Lines 121 (patched)
> > 
> >
> > For an embedded hbase build, it will use the hbase zk and the embedded 
> > solr. I assume this line (and the matching stop) should not be called for 
> > the embedded hbase build (which is not using the downloaded zk).

I could swear I tested this, and it required I spin up a zookeeper to get it to 
work. I just reset to master and retested (after fixing the atlas_config.py) 
and it worked without a local ZK start. So I'll remove this from the profile.
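
The behaviour discussed here can be sketched as a small decision function: start the downloaded ZooKeeper only for the embedded Cassandra profile, and leave the embedded HBase profile on HBase's own ZooKeeper. The function name, the backend strings, and the service names are assumptions for illustration and do not reflect the contents of atlas_start.py.

```python
def services_to_start(storage_backend, embedded):
    """Return the helper services a start script might launch, in order."""
    if not embedded:
        # External services are assumed to be managed outside the script.
        return []
    services = []
    if storage_backend == "hbase":
        # Embedded HBase supplies its own ZooKeeper, so none is started here.
        services.append("hbase")
    elif storage_backend == "cassandra":
        # The Cassandra profile is assumed to use the downloaded ZooKeeper.
        services.extend(["zookeeper", "cassandra"])
    else:
        raise ValueError("unknown storage backend: " + storage_backend)
    services.append("solr")
    return services


hbase_services = services_to_start("hbase", embedded=True)
cassandra_services = services_to_start("cassandra", embedded=True)
```

A matching stop routine would walk the same list in reverse, so the two scripts cannot drift apart.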


> On March 4, 2018, 11:04 a.m., David Radley wrote:
> > docs/src/site/twiki/InstallationSteps.twiki
> > Lines 38 (patched)
> > 
> >
> > We did not have to do this before this change for embedded hbase solr. 
> > I think these notes should not apply to the embedded hbase solr profile.

Yep, as stated before I'll remove it, and find a way to only do it for a 
cassandra deploy.


- Pierre


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65885/#review198584
---





Re: Review Request 65885: ATLAS-2470 - Add JanusGraph Cassandra support to Atlas

2018-03-04 Thread Pierre Padovani


> On March 4, 2018, 11:07 a.m., David Radley wrote:
> > My review comments are from my initial look at the code; I hope to try 
> > running this patch to verify it works for me

I'll update the patch with the above changes today or tomorrow as I have time.


- Pierre


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65885/#review198593
---





Re: Review Request 65790: Data Migration Utility: Export

2018-03-04 Thread Apoorv Naik

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65790/#review198595
---




tools/atlas-migration-utility/pom.xml
Lines 41 (patched)


Remove duplicate dependencies. Since we're inheriting all versions from 
the parent, the version override is not necessary here.



tools/atlas-migration-utility/pom.xml
Lines 65 (patched)


Typesystem is removed from 1.0.0-SNAPSHOT so this seems incorrect.


- Apoorv Naik


On March 4, 2018, 5:25 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65790/
> ---
> 
> (Updated March 4, 2018, 5:25 p.m.)
> 
> 
> Review request for atlas, Apoorv Naik, Madhan Neethiraj, and Ruchi Solani.
> 
> 
> Bugs: ATLAS-2461
> https://issues.apache.org/jira/browse/ATLAS-2461
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> The data migration utility allows for exporting data from an Atlas instance 
> without the server running. This is important as it will prevent Atlas from 
> processing any requests.
> 
> **Approach**
> The migration is a new utility that will perform export of data using a 
> pre-defined export request file.
> 
> The approach used by this application:
> - Create an application context (_migrationContext.xml_). This context 
> prevents instantiation of a number of classes, most notably _webapp_, 
> _notifications_, and _listeners_.
> - Create _ImportService_ using Spring's framework classes.
> 
> Here are the pieces:
> - _MigrationApp_: Contains _main_. It is the entry point for the app.
> - _Exporter_: Contains plumbing needed to use the new _migrationContext.xml_ 
> and create _ApplicationContext_ for use.
> - Degenerate classes _EmptyNotification_, _EmptyNotificationChangeListener_. 
> These are necessary to launch the application in a way that no notifications 
> are sent out.
> - _atlas_migration.py_: Creates the environment for the app to execute and 
> launches the application.
> - _migration-export-request.json_: This can be modified for environments that 
> do not have the latest improvements to _Export_.
> - _README_: Instructions on usage.
> 
> **Build**
> The project uses Maven's assembly plugin. The build creates a ZIP 
> file with the necessary files.
> 
> _mvn install package_ generates a ZIP file in _target_ directory.
> 
> **Usage (from README)**
> - Use Ambari to shut down Atlas.
> - Unzip contents of the ZIP to a directory on the server using: 'unzip 
> atlas-migration-kit.zip'
> - cd atlas-migration-kit
> - Run 'python atlas_migration.py'. Use 'python atlas_migration.py 
> ' to export to a specific file.
> - To watch the progress: 'tail -f /var/log/atlas/application.log'.
> 
> 
> Diffs
> -
> 
>   pom.xml 7db1be78
>   tools/atlas-migration-utility/pom.xml PRE-CREATION
>   tools/atlas-migration-utility/src/assembly/bin.xml PRE-CREATION
>   tools/atlas-migration-utility/src/main/java/org/apache/atlas/migration/Exporter.java PRE-CREATION
>   tools/atlas-migration-utility/src/main/java/org/apache/atlas/migration/NoOpNotification.java PRE-CREATION
>   tools/atlas-migration-utility/src/main/java/org/apache/atlas/migration/NoOpNotificationChangeListener.java PRE-CREATION
>   tools/atlas-migration-utility/src/main/resources/README PRE-CREATION
>   tools/atlas-migration-utility/src/main/resources/atlas_migration.py PRE-CREATION
>   tools/atlas-migration-utility/src/main/resources/migration-export-request.json PRE-CREATION
>   tools/atlas-migration-utility/src/main/resources/migrationContext.xml PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/65790/diff/4/
> 
> 
> Testing
> ---
> 
> **Unit tests**
> None.
> 
> **Functional tests**
> Exports from existing clusters.
> 
> 
> File Attachments
> 
> 
> Migration Utility
>   
> https://reviews.apache.org/media/uploaded/files/2018/02/24/e8090ed0-13b6-4253-a59c-d3f2098943af__atlas-migration-kit-0.8.3-SNAPSHOT-bin.zip
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 65885: ATLAS-2470 - Add JanusGraph Cassandra support to Atlas

2018-03-04 Thread David Radley

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65885/#review198593
---



My review comments are from my initial look at the code; I hope to try running 
this patch to verify it works for me

- David Radley





Re: Review Request 65885: ATLAS-2470 - Add JanusGraph Cassandra support to Atlas

2018-03-04 Thread David Radley

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65885/#review198584
---




distro/pom.xml
Lines 258 (patched)


I see that you are downloading Zookeeper. Many components require zookeeper 
- so it may already exist on the system. I think for production you would want 
to run with an external Zookeeper. 

I suggest we allow this as a build option. I realise this jira is for the 
embedded Cassandra build only - but I could envisage you needing non-embedded 
cassandra builds as well; I thought I would suggest this here - if this is a 
requirement for you, you may decide you want to address this in a separate 
Jira.



distro/pom.xml
Lines 302 (patched)


for consistency should we have  as well?



distro/src/bin/atlas_start.py
Lines 121 (patched)


For an embedded hbase build, it will use the hbase zk and the embedded solr. 
I assume this line (and the matching stop) should not be called for the 
embedded hbase build (which is not using the downloaded zk).



distro/src/conf/cassandra.yml
Lines 22 (patched)


typo



distro/src/conf/cassandra.yml
Lines 24 (patched)


I suggest the install could text-replace these with the correct values. Is it 
possible for it to refer to an environment variable with the absolute path, 
and then have the script pick up that environment variable?
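
One way to realise this suggestion is a template substitution step at install or start time. The sketch below uses Python's standard `string.Template`; the variable name `ATLAS_HOME`, the directory layout, and the function name are assumptions for illustration and are not taken from the patch.

```python
import os
from string import Template

# Hypothetical fragment of a Cassandra config template with a placeholder
# for the absolute installation path.
CONFIG_TEMPLATE = """\
data_file_directories:
    - ${ATLAS_HOME}/data/cassandra/data
commitlog_directory: ${ATLAS_HOME}/data/cassandra/commitlog
"""


def render_config(template_text, env=None):
    """Replace ${NAME} placeholders with values from a mapping.

    Defaults to the process environment; raises KeyError if a placeholder
    has no value, so a misconfigured install fails early.
    """
    mapping = os.environ if env is None else env
    return Template(template_text).substitute(mapping)


rendered = render_config(CONFIG_TEMPLATE, {"ATLAS_HOME": "/opt/atlas"})
```

Using `substitute` rather than `safe_substitute` is deliberate here: a missing variable stops the script instead of leaving a literal placeholder in the generated file.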



distro/src/conf/zookeeper/zoo.cfg
Lines 20 (patched)


bad end of line character here and a few places below



docs/src/site/twiki/InstallationSteps.twiki
Lines 38 (patched)


We did not have to do this before this change for embedded hbase solr. I 
think these notes should not apply to the embedded hbase solr profile.


- David Radley

