Repository: incubator-atlas Updated Branches: refs/heads/0.8-incubating dc5ad76f6 -> 24746dd97
ATLAS-1717-IX-Documentation

Signed-off-by: apoorvnaik <an...@hortonworks.com>

Project: http://git-wip-us.apache.org/repos/asf/incubator-atlas/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-atlas/commit/24746dd9
Tree: http://git-wip-us.apache.org/repos/asf/incubator-atlas/tree/24746dd9
Diff: http://git-wip-us.apache.org/repos/asf/incubator-atlas/diff/24746dd9

Branch: refs/heads/0.8-incubating
Commit: 24746dd97407644c2a874a551e2378939b9d4cac
Parents: dc5ad76
Author: ashutoshm <ames...@hortonworks.com>
Authored: Thu Apr 6 11:36:11 2017 -0700
Committer: apoorvnaik <an...@hortonworks.com>
Committed: Wed Apr 12 11:59:24 2017 -0700

----------------------------------------------------------------------
 docs/src/site/twiki/Export-API.twiki        | 150 +++++++++++++++++++++++
 docs/src/site/twiki/Export-HDFS-API.twiki   |  79 ++++++++++++
 docs/src/site/twiki/Import-API.twiki        | 109 ++++++++++++++++
 docs/src/site/twiki/Import-Export-API.twiki |  33 +++++
 docs/src/site/twiki/index.twiki             |   1 +
 5 files changed, 372 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-atlas/blob/24746dd9/docs/src/site/twiki/Export-API.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Export-API.twiki b/docs/src/site/twiki/Export-API.twiki
new file mode 100644
index 0000000..4027f3d
--- /dev/null
+++ b/docs/src/site/twiki/Export-API.twiki
@@ -0,0 +1,150 @@
---+ Export API
The general approach is:
   * The consumer specifies the scope of data to be exported (details below).
   * If successful, the API returns a stream in the format specified.
   * An error is returned if the call fails.

See [[Export-HDFS-API][here]] for details on exporting *hdfs_path* entities.

|*Title*|*Export API*|
| _Example_ | See the Examples section below. |
| _URL_ | _api/atlas/admin/export_ |
| _Method_ | _POST_ |
| _URL Parameters_ | _None_ |
| _Data Parameters_ | The class _!AtlasExportRequest_ is used to specify the items to export. The list of _!AtlasObjectId_(s) allows multiple items to be specified for export in a single session. An _!AtlasObjectId_ is a tuple of entity type, name of the unique attribute, and value of the unique attribute. Several items can be specified. See the examples below. |
| _Success Response_ | File stream as _application/zip_. |
| _Error Response_ | Errors that are handled within the system will be returned as _!AtlasBaseException_. |
| _Notes_ | The consumer can process the output of the API programmatically, e.g. via _java.io.ByteArrayOutputStream_, or save the contents of the stream to a file on disk. |

__Method Signature__
<verbatim>
@POST
@Path("/export")
@Consumes("application/json;charset=UTF-8")
</verbatim>

---+++ Additional Options
It is possible to specify additional parameters for the _Export_ operation.

The current implementation has 2 options; both are optional:
   * _matchType_: This option configures the approach used for fetching the starting entity. It has the following values:
      * _startsWith_: Search for an entity whose unique attribute value is prefixed with the specified criteria.
      * _endsWith_: Search for an entity whose unique attribute value is suffixed with the specified criteria.
      * _contains_: Search for an entity whose unique attribute value has the specified criteria as a substring.
      * _matches_: Search for an entity whose unique attribute value is a regular-expression match for the specified criteria.
   * _fetchType_: This option configures the approach used for fetching entities. It has the following values:
      * _FULL_: Fetches all entities that are connected directly and indirectly to the starting entity. E.g. if the starting entity is a table, this option fetches the table, its database, and all other tables within that database.
      * _CONNECTED_: Fetches only the entities that are connected directly to the starting entity. E.g. if the starting entity is a table, this option fetches the table and its database entity only.

If no _matchType_ is specified, an exact match is used, i.e. the entire string is used as the search criteria.

Searching using _matchType_ applies to all types of entities. It is particularly useful for matching entities of type _hdfs_path_ (see [[Export-HDFS-API][here]]).

The _fetchType_ option defaults to _FULL_.

For a complete example, see the Examples section below.
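Meanwhile, a minimal sketch of a request that uses _matchType_ _contains_ might look like the following (the qualified name fragment _account_ is assumed for illustration):
<verbatim>
{
    "itemsToExport": [
        { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "account" } }
    ],
    "options": {
        "fetchType": "CONNECTED",
        "matchType": "contains"
    }
}
</verbatim>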
---+++ Contents of Exported ZIP File

The exported ZIP file has the following entries within it:
   * _atlas-export-result.json_:
      * Input filters: The scope of export.
      * File format: The format chosen for the export operation.
      * Metrics: The number of entity definitions, classifications and entities exported.
   * _atlas-typesdef.json_: Type definitions for the entities exported.
   * _atlas-export-order.json_: Order in which entities should be exported.
   * _{guid}.json_: Individual entities are exported with file names that correspond to their id.
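For example, listing the archive produced by the _!QuickStart_ export in the CURL Calls section below might show entries along these lines (listing abridged to file names; the GUIDs are illustrative):
<verbatim>
$ unzip -l quickStartDB.zip
    atlas-export-result.json
    atlas-typesdef.json
    atlas-export-order.json
    2c4aa713-030b-4fb3-98b1-1cab23d9ac81.json
    e4aa71ed-70fd-4fa7-9dfb-8250a573e293.json
    ...
</verbatim>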
---+++ Examples
The _!AtlasExportRequest_ below shows filters that attempt to export 2 databases in cluster cl1:
<verbatim>
{
    "itemsToExport": [
        { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@cl1" } },
        { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "hr@cl1" } }
    ]
}
</verbatim>

The _!AtlasExportRequest_ below specifies the _fetchType_ as _FULL_. With _matchType_ _startsWith_, the qualified name prefix _accounts@_ will fetch _accounts@cl1_:
<verbatim>
{
    "itemsToExport": [
        { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@" } }
    ],
    "options": {
        "fetchType": "FULL",
        "matchType": "startsWith"
    }
}
</verbatim>

The _!AtlasExportRequest_ below specifies the _fetchType_ as _CONNECTED_. The _matchType_ option will fetch databases whose names start with _accounts_, such as _accountsReceivable_ and _accountsPayable_:
<verbatim>
{
    "itemsToExport": [
        { "typeName": "hive_db", "uniqueAttributes": { "name": "accounts" } }
    ],
    "options": {
        "fetchType": "CONNECTED",
        "matchType": "startsWith"
    }
}
</verbatim>

Below is the _!AtlasExportResult_ JSON for the export of the _Sales_ DB present in _!QuickStart_.

The _metrics_ section contains the number of types and entities exported as part of the operation.

<verbatim>
{
    "clientIpAddress": "10.0.2.15",
    "hostName": "10.0.2.2",
    "metrics": {
        "duration": 1415,
        "entitiesWithExtInfo": 12,
        "entity:DB_v1": 2,
        "entity:LoadProcess_v1": 2,
        "entity:Table_v1": 6,
        "entity:View_v1": 2,
        "typedef:Column_v1": 1,
        "typedef:DB_v1": 1,
        "typedef:LoadProcess_v1": 1,
        "typedef:StorageDesc_v1": 1,
        "typedef:Table_v1": 1,
        "typedef:View_v1": 1,
        "typedef:classification": 6
    },
    "operationStatus": "SUCCESS",
    "request": {
        "itemsToExport": [
            {
                "typeName": "DB_v1",
                "uniqueAttributes": {
                    "name": "Sales"
                }
            }
        ],
        "options": {
            "fetchType": "full"
        }
    },
    "userName": "admin"
}
</verbatim>

---+++ CURL Calls
Below is a sample CURL call that demonstrates the export of the _!QuickStart_ databases.

<verbatim>
curl -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
        { "typeName": "DB", "uniqueAttributes": { "name": "Sales" } },
        { "typeName": "DB", "uniqueAttributes": { "name": "Reporting" } },
        { "typeName": "DB", "uniqueAttributes": { "name": "Logging" } }
    ],
    "options": {
        "fetchType": "full"
    }
}' "http://localhost:21000/api/atlas/admin/export" > quickStartDB.zip
</verbatim>

http://git-wip-us.apache.org/repos/asf/incubator-atlas/blob/24746dd9/docs/src/site/twiki/Export-HDFS-API.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Export-HDFS-API.twiki b/docs/src/site/twiki/Export-HDFS-API.twiki
new file mode 100644
index 0000000..64ba123
--- /dev/null
+++ b/docs/src/site/twiki/Export-HDFS-API.twiki
@@ -0,0 +1,79 @@
---+ Export & Import APIs for HDFS Path

---+++ Introduction

The general approach for using the Import-Export APIs for HDFS paths remains the same. There are minor variations caused by how HDFS paths are handled within Atlas.

Unlike Hive entities, HDFS entities within Atlas are created manually, using the _Create Entity_ link within the Atlas Web UI.

Also, HDFS paths tend to be hierarchical, in the sense that users tend to model the same HDFS storage structure within Atlas.

__Sample HDFS Setup__

<table border="1" cellpadding="pixels" cellspacing="pixels">
    <tr>
        <th><strong>HDFS Path</strong></th> <th><strong>Atlas Entity</strong></th>
    </tr>
    <tr>
        <td style="padding:0 15px 0 15px;">
            <em>/apps/warehouse/finance</em>
        </td>
        <td style="padding:0 15px 0 15px;">
            <strong>Entity type: </strong><em>hdfs_path</em> <br/>
            <strong>Name: </strong><em>Finance</em> <br/>
            <strong>QualifiedName: </strong><em>FinanceAll</em>
        </td>
    </tr>
    <tr>
        <td style="padding:0 15px 0 15px;">
            <em>/apps/warehouse/finance/accounts-receivable</em>
        </td>
        <td style="padding:0 15px 0 15px;">
            <strong>Entity type: </strong><em>hdfs_path</em> <br/>
            <strong>Name: </strong><em>FinanceReceivable</em> <br/>
            <strong>QualifiedName: </strong><em>FinanceReceivable</em> <br/>
            <strong>Path: </strong><em>/apps/warehouse/finance</em>
        </td>
    </tr>
    <tr>
        <td style="padding:0 15px 0 15px;">
            <em>/apps/warehouse/finance/accounts-payable</em>
        </td>
        <td style="padding:0 15px 0 15px;">
            <strong>Entity type: </strong><em>hdfs_path</em> <br/>
            <strong>Name: </strong><em>Finance-Payable</em> <br/>
            <strong>QualifiedName: </strong><em>FinancePayable</em> <br/>
            <strong>Path: </strong><em>/apps/warehouse/finance/accounts-payable</em>
        </td>
    </tr>
    <tr>
        <td style="padding:0 15px 0 15px;">
            <em>/apps/warehouse/finance/billing</em>
        </td>
        <td style="padding:0 15px 0 15px;">
            <strong>Entity type: </strong><em>hdfs_path</em> <br/>
            <strong>Name: </strong><em>FinanceBilling</em> <br/>
            <strong>QualifiedName: </strong><em>FinanceBilling</em> <br/>
            <strong>Path: </strong><em>/apps/warehouse/finance/billing</em>
        </td>
    </tr>
</table>

---+++ Export API Using matchType
To export entities that represent an HDFS path, use the Export API with the _matchType_ option. Details can be found [[Export-API][here]].

---+++ Example Using CURL Calls
Below is a sample CURL call that performs an export operation on the _Sample HDFS Setup_ shown above.

<verbatim>
curl -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
        { "typeName": "hdfs_path", "uniqueAttributes": { "name": "FinanceAll" } }
    ],
    "options": {
        "fetchType": "full",
        "matchType": "startsWith"
    }
}' "http://localhost:21000/api/atlas/admin/export" > financeAll.zip
</verbatim>
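Because the entities in the sample setup share the _Finance_ name prefix, a single _startsWith_ request could cover all of them at once. The call below is an illustrative sketch based on the sample setup above:
<verbatim>
curl -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
        { "typeName": "hdfs_path", "uniqueAttributes": { "name": "Finance" } }
    ],
    "options": {
        "fetchType": "CONNECTED",
        "matchType": "startsWith"
    }
}' "http://localhost:21000/api/atlas/admin/export" > finance.zip
</verbatim>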
http://git-wip-us.apache.org/repos/asf/incubator-atlas/blob/24746dd9/docs/src/site/twiki/Import-API.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Import-API.twiki b/docs/src/site/twiki/Import-API.twiki
new file mode 100644
index 0000000..7eedf19
--- /dev/null
+++ b/docs/src/site/twiki/Import-API.twiki
@@ -0,0 +1,109 @@
---+ Import API

The general approach is:
   * The consumer makes a ZIP file available for the import operation. See details below for the 2 flavors of the API.
   * If successful, the API returns the results of the operation.
   * An error is returned if the call fails.

---+++ Import ZIP File Using POST

|*Title*|*Import API*|
| _Example_ | See the Examples section below. |
| _Description_ | Provide the contents of the file to be imported in the request body. |
| _URL_ | _api/atlas/admin/import_ |
| _Method_ | _POST_ |
| _URL Parameters_ | _None_ |
| _Data Parameters_ | _None_ |
| _Success Response_ | _!AtlasImportResult_ is returned as JSON. See details below. |
| _Error Response_ | Errors that are handled within the system will be returned as _!AtlasBaseException_. |

---+++ Import ZIP File Available on Server

|*Title*|*Import API*|
| _Example_ | See the Examples section below. |
| _Description_ | Provide the path of the file to be imported. |
| _URL_ | _api/atlas/admin/importfile_ |
| _Method_ | _POST_ |
| _URL Parameters_ | _?FILENAME=<path of file>_ Specify the options as name-value pairs. Use _FILENAME_ to specify the file path. |
| _Data Parameters_ | _None_ |
| _Success Response_ | _!AtlasImportResult_ is returned as JSON. See details below. |
| _Error Response_ | Errors that are handled within the system will be returned as _!AtlasBaseException_. |
| _Notes_ | The file to be imported needs to be present on the server at the location specified by the _FILENAME_ parameter. |

__Method Signature for Import__
<verbatim>
@POST
@Path("/import")
@Produces("application/json; charset=UTF-8")
@Consumes("application/octet-stream")
</verbatim>

__Method Signature for Import File__
<verbatim>
@POST
@Path("/importfile")
@Produces("application/json; charset=UTF-8")
@Consumes("application/json")
</verbatim>

__!AtlasImportResult Response__
The API returns the results of the import operation in the format defined by _!AtlasImportResult_:
   * _!AtlasImportParameters_: A collection of name-value pairs of the options applied during the import operation.
   * _Metrics_: Operation metrics. These include details on the number of types imported, number of entities imported, etc.
   * _Processed Entities_: The list of GUIDs of the entities that were processed.
   * _Operation Status_: Overall status of the operation. Values are _SUCCESS_, _PARTIAL_SUCCESS_ and _FAIL_.
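Once the result JSON has been saved (see the CURL examples below), a quick way to inspect the outcome might be the following sketch; it assumes _jq_ is installed and uses the output file name from those examples:
<verbatim>
jq '.operationStatus, .metrics' quickStartDB-import-result.json
</verbatim>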
---+++ Examples Using CURL Calls
The call below performs an import of the _!QuickStart_ database using POST:
<verbatim>
curl -X POST -u admin:admin -H "Content-Type: application/octet-stream" -H "Cache-Control: no-cache" \
    --data-binary @quickStartDB.zip \
    "http://localhost:21000/api/atlas/admin/import" > quickStartDB-import-result.json
</verbatim>

The call below performs an import of the _!QuickStart_ database using a ZIP file available on the server:
<verbatim>
curl -X POST -u admin:admin -H "Cache-Control: no-cache" \
    "http://localhost:21000/api/atlas/admin/importfile?FILENAME=/root/quickStartDB.zip" > quickStartDB-import-result.json
</verbatim>

Below is the _!AtlasImportResult_ JSON for an import that contains _hive_db_.

The _processedEntities_ field contains the _guids_ of all the entities imported.

The _metrics_ contain a breakdown of the types and entities imported, along with the operation performed on them, viz. _created_ or _updated_.

<verbatim>
{
    "request": {
        "options": {}
    },
    "userName": "admin",
    "clientIpAddress": "10.0.2.2",
    "hostName": "10.0.2.15",
    "timeStamp": 1491285622823,
    "metrics": {
        "duration": 9143,
        "typedef:enum": 0,
        "typedef:struct": 0,
        "entity:hive_column:created": 461,
        "entity:hive_storagedesc:created": 20,
        "entity:hive_process:created": 12,
        "entity:hive_db:created": 5,
        "entity:hive_table:created": 20,
        "entity:hdfs_path:created": 2,
        "typedef:entitydef": 0,
        "typedef:classification": 3
    },
    "processedEntities": [
        "2c4aa713-030b-4fb3-98b1-1cab23d9ac81",
        "e4aa71ed-70fd-4fa7-9dfb-8250a573e293",

        ...

        "ea0f9bdb-1dfc-4e48-9848-a006129929f9",
        "b5e2cb41-3e7d-4468-84e1-d87c320e75f9"
    ],
    "operationStatus": "SUCCESS"
}
</verbatim>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-atlas/blob/24746dd9/docs/src/site/twiki/Import-Export-API.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Import-Export-API.twiki b/docs/src/site/twiki/Import-Export-API.twiki
new file mode 100644
index 0000000..519f208
--- /dev/null
+++ b/docs/src/site/twiki/Import-Export-API.twiki
@@ -0,0 +1,33 @@
---+ Export & Import REST APIs

---+++ Background
The Import-Export APIs for Atlas facilitate the transfer of data to and from a cluster that has Atlas provisioned.

When integrated with a backup and/or disaster recovery process, these APIs ensure participation of Atlas.

---+++ Introduction
There are 2 broad categories, viz. Export & Import. The details of the APIs are discussed below.

The APIs are available only to the _admin_ user.

Only a single import or export operation can be performed at a given time. These operations can potentially generate a large amount of data and can also put pressure on resources; this restriction tries to alleviate that problem.

The Import-Export APIs relating to HDFS paths can be found [[Export-HDFS-API][here]].

For additional information please refer to the following:
   * [[https://issues.apache.org/jira/browse/ATLAS-1503][ATLAS-1503]] Original Import-Export API requirements.
   * [[https://issues.apache.org/jira/browse/ATLAS-1618][ATLAS-1618]] Export API Scope Specification.

---+++ Errors
If an import or export operation is initiated while another is in progress, the consumer will receive this error:
<verbatim>
"ATLAS5005E": "Another import or export is in progress. Please try again."
</verbatim>

Unhandled errors will be returned as Internal error code 500.
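For reference, a handled error such as the one above might be returned in the _!AtlasBaseException_ JSON shape, along these lines (an illustrative sketch; the exact field layout may differ):
<verbatim>
{
    "errorCode": "ATLAS5005E",
    "errorMessage": "Another import or export is in progress. Please try again."
}
</verbatim>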
---++ REST API Reference
   * __[[Export-API][Export API]]__
   * __[[Export-HDFS-API][Export HDFS API]]__
   * __[[Import-API][Import API]]__

http://git-wip-us.apache.org/repos/asf/incubator-atlas/blob/24746dd9/docs/src/site/twiki/index.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/index.twiki b/docs/src/site/twiki/index.twiki
index 8a8626f..82d27c0 100755
--- a/docs/src/site/twiki/index.twiki
+++ b/docs/src/site/twiki/index.twiki
@@ -56,6 +56,7 @@ allows integration with the whole enterprise data ecosystem.
 ---++ API Documentation
    * <a href="api/v2/index.html">REST API Documentation</a>
+   * [[Import-Export-API][Export & Import REST API Documentation]]
    * <a href="api/rest.html">Legacy API Documentation</a>
 ---++ Developer Setup Documentation