This is an automated email from the ASF dual-hosted git repository.

pinal pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/atlas.git
The following commit(s) were added to refs/heads/branch-2.0 by this push:
     new 88d631090  ATLAS-4633: Multiple typos in official Apache Atlas Docs
88d631090 is described below

commit 88d63109049a2c2104fc5ec5cb659a78741c2f1e
Author: ranger_qe_sr21089 <ranger...@cloudera.com>
AuthorDate: Wed Jul 13 12:04:48 2022 +0530

    ATLAS-4633: Multiple typos in official Apache Atlas Docs

    Signed-off-by: Pinal Shah <pinal.s...@freestoneinfotech.com>
    (cherry picked from commit bc41d132556dbb6dd7597ce92a130a01a503f2ee)
---
 docs/src/documents/Asf/asfinfo.md                  |  2 +-
 docs/src/documents/BusinessMetadata.md             |  2 +-
 docs/src/documents/Downloads/Downloads.md          |  8 +++----
 docs/src/documents/Glossary.md                     | 16 ++++++-------
 docs/src/documents/HighAvailability.md             |  2 +-
 docs/src/documents/Hook/HookHBase.md               |  6 ++---
 docs/src/documents/Hook/HookHive.md                |  4 ++--
 docs/src/documents/Hook/HookSqoop.md               |  6 ++---
 docs/src/documents/Hook/HookStorm.md               |  2 +-
 docs/src/documents/Import-Export/ExportAPI.md      |  6 ++---
 .../documents/Import-Export/ImportAPIOptions.md    |  4 ++--
 .../Import-Export/ImportEntityTransforms.md        |  4 ++--
 .../src/documents/Import-Export/ImportExportAPI.md |  2 +-
 .../documents/Migration/Migration-0.8-to-1.0.md    |  2 +-
 docs/src/documents/Misc/AtlasServer.md             |  4 ++--
 docs/src/documents/Misc/SoftReference.md           |  4 ++--
 docs/src/documents/Notifications.md                |  2 +-
 docs/src/documents/Overview.md                     |  2 +-
 docs/src/documents/Project-Info/MailingLists.md    |  2 +-
 docs/src/documents/Project-Info/TeamList.md        |  2 +-
 docs/src/documents/Search/SearchAdvanced.md        | 18 +++++++-------
 .../documents/Security/AtlasRangerAuthorizer.md    |  2 +-
 .../documents/Security/AtlasSimpleAuthorizer.md    |  2 +-
 docs/src/documents/Security/Authentication.md      |  8 +++----
 docs/src/documents/Security/Security.md            | 10 ++++----
 docs/src/documents/Setup/BuildInstruction.md       |  4 ++--
 docs/src/documents/Setup/Configuration.md          |  4 ++--
 docs/src/documents/Setup/EclipseSetup.md           | 10 ++++----
 .../src/documents/Setup/InstallationInstruction.md | 26 ++++++++++----------
 docs/src/documents/Tools/AtlasRepairIndex.md       |  2 +-
 docs/src/documents/TypeSystem.md                   | 28 +++++++++++-----------
 docs/src/documents/Whats-New/WhatsNew-2.0.md       |  2 +-
 docs/src/documents/Whats-New/WhatsNew-2.1.md       |  4 ++--
 33 files changed, 101 insertions(+), 101 deletions(-)

diff --git a/docs/src/documents/Asf/asfinfo.md b/docs/src/documents/Asf/asfinfo.md
index 99f7db675..1f78b84c1 100644
--- a/docs/src/documents/Asf/asfinfo.md
+++ b/docs/src/documents/Asf/asfinfo.md
@@ -6,7 +6,7 @@ menu: ASF
 import {CustomLink} from "theme/components/shared/common/CustomLink";
 
-# ASF Infomation
+# ASF Information
 
 1. <CustomLink href="http://www.apache.org/foundation/how-it-works.html">How Apache Works</CustomLink>

diff --git a/docs/src/documents/BusinessMetadata.md b/docs/src/documents/BusinessMetadata.md
index 65755fb35..8af32b52c 100644
--- a/docs/src/documents/BusinessMetadata.md
+++ b/docs/src/documents/BusinessMetadata.md
@@ -13,7 +13,7 @@ import SyntaxHighlighter from 'react-syntax-highlighter';
 # Business Metadata
 ## Overview
 Atlas typesystem allows users to define a model and create entities for the metadata objects they want to manage.
-Typically the model captures technical attributes - like name, description, create time, number of replicas, etc; and
+Typically, the model captures technical attributes - like name, description, create time, number of replicas, etc.; and
 metadata objects are created and updated by processes that monitor the real objects. It is often necessary to augment
 technical attributes with additional attributes to capture business details that can help organize, search and manage
 metadata entities.
For example, a steward from marketing department can define set of attributes for a campaign,

diff --git a/docs/src/documents/Downloads/Downloads.md b/docs/src/documents/Downloads/Downloads.md
index b8bf09250..e486a1d0b 100644
--- a/docs/src/documents/Downloads/Downloads.md
+++ b/docs/src/documents/Downloads/Downloads.md
@@ -104,9 +104,9 @@ pgp downloaded_file.asc`}
 * Entity Purge: added REST APIs to purge deleted entities
 * Search: ability to find entities by more than one classification
 * Performance: improvements in lineage retrieval and classification-propagation
- * Notification: ability to process notificaitons from multiple Kafka topics
+ * Notification: ability to process notifications from multiple Kafka topics
 * Hive Hook: tracks process-executions via hive_process_execution entities
- * Hive Hook: catures DDL operations via hive_db_ddl and hive_table_ddl entities
+ * Hive Hook: captures DDL operations via hive_db_ddl and hive_table_ddl entities
 * Notification: introduced shell entities to record references to non-existing entities in notifications
 * Spark: added model to capture Spark entities, processes and relationships
 * AWS S3: introduced updated model to capture AWS S3 entities and relationships
@@ -133,7 +133,7 @@ pgp downloaded_file.asc`}
 * Notification processing to support batch-commits
 * New option in notification processing to ignore potentially incorrect hive_column_lineage
 * Updated Hive hook to avoid duplicate column-lineage entities; also updated Atlas server to skip duplicate column-lineage entities
- * Improved batch processing in notificaiton handler to avoid processing of an entity multiple times
+ * Improved batch processing in notification handler to avoid processing of an entity multiple times
 * Add option to ignore/prune metadata for temporary/staging hive tables
 * Avoid unnecessary lookup when creating new relationships
 * UI Improvements:
@@ -159,7 +159,7 @@ pgp downloaded_file.asc`}
 * Support for JanusGraph graph database
 * New DSL implementation, using ANTLR instead of Scala
 * Removal of older type system implementation in atlas-typesystem library
- * Metadata security - fine grained authorization
+ * Metadata security - fine-grained authorization
 * Notification enhancements to support V2 style data structures
 * Jackson library update from 1.9.13 to 2.9.2
 * Classification propagation via entity relationships

diff --git a/docs/src/documents/Glossary.md b/docs/src/documents/Glossary.md
index e6e1439d9..2041e4218 100644
--- a/docs/src/documents/Glossary.md
+++ b/docs/src/documents/Glossary.md
@@ -12,7 +12,7 @@ import Img from 'theme/components/shared/Img'
 
 # Glossary
 
-A Glossary provides appropriate vocabularies for business users and it allows the terms (words) to be related to each
+A Glossary provides appropriate vocabularies for business users, and it allows the terms (words) to be related to each
 other and categorized so that they can be understood in different contexts. These terms can be then mapped to assets
 like a Database, tables, columns etc. This helps abstract the technical jargon associated with the repositories and
 allows the user to discover/work with data in the vocabulary that is more familiar to them.
@@ -29,13 +29,13 @@ allows the user to discover/work with data in the vocabulary that is more famili
 ### What is a Glossary term ?
 
 A term is a useful word for an enterprise. For the term(s) to be useful and meaningful, they need to grouped around their
-use and context. A term in Apache Atlas must have a unique qualifiedName, there can be term(s) with same name but they
+use and context. A term in Apache Atlas must have a unique qualifiedName, there can be term(s) with same name, but they
 cannot belong to the same glossary. Term(s) with same name can exist only across different glossaries. A term name can
 contain spaces, underscores and dashes (as natural ways of referring to words) but no "." or "@", as the qualifiedName takes
 the following form `term name`@`glossary qualified name`. The fully qualified name makes it easier to work with a specific term.
 
-A term can only belong to single glossary and it's lifecycle is bound to the same i.e. if the Glossary is deleted then
+A term can only belong to single glossary, and it's lifecycle is bound to the same i.e. if the Glossary is deleted then
 the term gets deleted as well. A term can belong to zero or more categories, which allows scoping them into narrower or
 wider contexts. A term can be assigned/linked to zero or more entities in Apache Atlas. A term can be classified using
 classifications (tags) and the same classification gets applied to the entities that the term is assigned to.
@@ -43,7 +43,7 @@ classifications (tags) and the same classification gets applied to the entities
 ### What is a Glossary category ?
 
 A category is a way of organizing the term(s) so that the term's context can be enriched. A category may or may not have
-contained hierarchies i.e. child category hierarchy. A category's qualifiedName is derived using it's hierarchical location
+contained hierarchies i.e. child category hierarchy. A category's qualifiedName is derived using its hierarchical location
 within the glossary e.g. `Category name`.`parent category qualifiedName`. This qualified name gets updated when any
 hierarchical change happens, e.g. addition of a parent category, removal of parent category or change of parent category.
@@ -52,19 +52,19 @@ hierarchical change happens, e.g. addition of a parent category, removal of pare
 
 Apache Atlas UI has been updated to provide user-friendly interface to work with various aspects of glossary, including:
 
 * create glossaries, terms and categories
-* create various relationships between terms - like synonymns, antonymns, seeAlso
+* create various relationships between terms - like synonyms, antonyms, seeAlso
 * organize categories in hierarchies
 * assign terms to entities
 * search for entities using associated terms
 
-Most of glossary related UI can be found under a new tab named GLOSSARY, which is present right next to existing
+Most glossary related UI can be found under a new tab named GLOSSARY, which is present right next to existing
 familiar tabs SEARCH and CLASSIFICATION.
 
 #### **Glossary tab**
 
 Apache Atlas UI provides two ways to work with a glossary - term view and category view.
 
-Term view allows an user to perform the following operations:
+Term view allows a user to perform the following operations:
 
 * create, update and delete terms
 * add, remove and update classifications associated with a term
@@ -72,7 +72,7 @@ Term view allows an user to perform the following operations:
 * create various relationships between terms
 * view entities associated with a term
 
-Category view allows an user to perform the following operations:
+Category view allows a user to perform the following operations:
 
 * create, update and delete categories and sub-categories
 * associate terms to categories

diff --git a/docs/src/documents/HighAvailability.md b/docs/src/documents/HighAvailability.md
index ed0fe8b9f..6a60c56e6 100644
--- a/docs/src/documents/HighAvailability.md
+++ b/docs/src/documents/HighAvailability.md
@@ -37,7 +37,7 @@ becomes unavailable either because it is deliberately stopped, or due to unexpec
 instances will automatically be elected as an 'active' instance and start to service user requests. An 'active'
 instance is the only instance that can respond to user requests correctly. It can create, delete, modify
-or respond to queries on metadata objects. A 'passive' instance will accept user requests, but will redirect them
+or respond to the queries on metadata objects. A 'passive' instance will accept user requests, but will redirect them
 using HTTP redirect to the currently known 'active' instance. Specifically, a passive instance will not itself respond
 to any queries on metadata objects. However, all instances (both active and passive), will respond to admin requests
 that return information about that instance.

diff --git a/docs/src/documents/Hook/HookHBase.md b/docs/src/documents/Hook/HookHBase.md
index cbb8d81f2..c8dfbaf38 100644
--- a/docs/src/documents/Hook/HookHBase.md
+++ b/docs/src/documents/Hook/HookHBase.md
@@ -54,7 +54,7 @@ Follow the instructions below to setup Atlas hook in HBase:
 The following properties in atlas-application.properties control the thread pool and notification details:
 
 <SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>
-{`atlas.hook.hbase.synchronous=false # whether to run the hook synchronously. false recommended to avoid delays in HBase operations. Default: false
+{`atlas.hook.hbase.synchronous=false # whether to run the hook synchronously. false is recommended to avoid delays in HBase operations. Default: false
 atlas.hook.hbase.numRetries=3 # number of retries for notification failure. Default: 3
 atlas.hook.hbase.queueSize=10000 # queue size for the threadpool. Default: 10000
 atlas.cluster.name=primary # clusterName to use in qualifiedName of entities. Default: primary
@@ -68,12 +68,12 @@ Other configurations for Kafka notification producer can be specified by prefixi
 For list of configuration supported by Kafka producer, please refer to [Kafka Producer Configs](http://kafka.apache.org/documentation/#producerconfigs)
 
 ## NOTES
- * Only the namespace, table and column-family create/update/ delete operations are captured by Atlas HBase hook. Changes to columns are be captured.
+ * Only the namespace, table and column-family create/update/delete operations are captured by Atlas HBase hook. Changes to columns are be captured.
 
 ## Importing HBase Metadata
 Apache Atlas provides a command-line utility, import-hbase.sh, to import metadata of Apache HBase namespaces and tables into Apache Atlas.
-This utility can be used to initialize Apache Atlas with namespaces/tables present in a Apache HBase cluster.
+This utility can be used to initialize Apache Atlas with namespaces/tables present in an Apache HBase cluster.
 This utility supports importing metadata of a specific table, tables in a specific namespace or all tables.
 
 <SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>

diff --git a/docs/src/documents/Hook/HookHive.md b/docs/src/documents/Hook/HookHive.md
index 0fba4f5fa..a2df17cfd 100644
--- a/docs/src/documents/Hook/HookHive.md
+++ b/docs/src/documents/Hook/HookHive.md
@@ -80,7 +80,7 @@ Follow the instructions below to setup Atlas hook in Hive:
 The following properties in atlas-application.properties control the thread pool and notification details:
 
 <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
-{`atlas.hook.hive.synchronous=false # whether to run the hook synchronously. false recommended to avoid delays in Hive query completion. Default: false
+{`atlas.hook.hive.synchronous=false # whether to run the hook synchronously. false is recommended to avoid delays in Hive query completion. Default: false
 atlas.hook.hive.numRetries=3 # number of retries for notification failure. Default: 3
 atlas.hook.hive.queueSize=10000 # queue size for the threadpool. Default: 10000
 atlas.cluster.name=primary # clusterName to use in qualifiedName of entities. Default: primary
@@ -128,7 +128,7 @@ The lineage is captured as
 ## NOTES
 * Column level lineage works with Hive version 1.2.1 after the patch for <a href="https://issues.apache.org/jira/browse/HIVE-13112">HIVE-13112</a> is applied to Hive source
- * Since database name, table name and column names are case insensitive in hive, the corresponding names in entities are lowercase. So, any search APIs should use lowercase while querying on the entity names
+ * Since database name, table name and column names are case-insensitive in hive, the corresponding names in entities are lowercase. So, any search APIs should use lowercase while querying on the entity names
 * The following hive operations are captured by hive hook currently
 * create database
 * create table/view, create table as select

diff --git a/docs/src/documents/Hook/HookSqoop.md b/docs/src/documents/Hook/HookSqoop.md
index 3f600bb05..cd2b9f123 100644
--- a/docs/src/documents/Hook/HookSqoop.md
+++ b/docs/src/documents/Hook/HookSqoop.md
@@ -39,7 +39,7 @@ This is used to add entities in Atlas using the model detailed above.
 Follow the instructions below to setup Atlas hook in Hive:
 
-Add the following properties to to enable Atlas hook in Sqoop:
+Add the following properties to enable Atlas hook in Sqoop:
 
 * Set-up Atlas hook in `<sqoop-conf>`/sqoop-site.xml by adding the following:
 
 <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
@@ -53,7 +53,7 @@ Add the following properties to to enable Atlas hook in Sqoop:
 * untar apache-atlas-${project.version}-sqoop-hook.tar.gz
 * cd apache-atlas-sqoop-hook-${project.version}
 * Copy entire contents of folder apache-atlas-sqoop-hook-${project.version}/hook/sqoop to `<atlas package>`/hook/sqoop
- * Copy `<atlas-conf>`/atlas-application.properties to to the sqoop conf directory `<sqoop-conf>`/
+ * Copy `<atlas-conf>`/atlas-application.properties to the sqoop conf directory `<sqoop-conf>`/
 * Link `<atlas package>`/hook/sqoop/*.jar in sqoop lib
@@ -61,7 +61,7 @@ Add the following properties to to enable Atlas hook in Sqoop:
 The following properties in atlas-application.properties control the thread pool and notification details:
 
 <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
-{`atlas.hook.sqoop.synchronous=false # whether to run the hook synchronously. false recommended to avoid delays in Sqoop operation completion. Default: false
+{`atlas.hook.sqoop.synchronous=false # whether to run the hook synchronously. false is recommended to avoid delays in Sqoop operation completion. Default: false
 atlas.hook.sqoop.numRetries=3 # number of retries for notification failure. Default: 3
 atlas.hook.sqoop.queueSize=10000 # queue size for the threadpool. Default: 10000
 atlas.cluster.name=primary # clusterName to use in qualifiedName of entities. Default: primary

diff --git a/docs/src/documents/Hook/HookStorm.md b/docs/src/documents/Hook/HookStorm.md
index 37b23e542..21011e3c3 100644
--- a/docs/src/documents/Hook/HookStorm.md
+++ b/docs/src/documents/Hook/HookStorm.md
@@ -117,7 +117,7 @@ STORM_JAR_JVM_OPTS:"-Datlas.conf=$ATLAS_HOME/conf/"
 
 where ATLAS_HOME is pointing to where ATLAS is installed.
 
-You could also set this up programatically in Storm Config as:
+You could also set this up programmatically in Storm Config as:
 
 <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
 {`Config stormConf = new Config();

diff --git a/docs/src/documents/Import-Export/ExportAPI.md b/docs/src/documents/Import-Export/ExportAPI.md
index 849d452ab..995cee539 100644
--- a/docs/src/documents/Import-Export/ExportAPI.md
+++ b/docs/src/documents/Import-Export/ExportAPI.md
@@ -54,7 +54,7 @@ The current implementation has 2 options. Both are optional:
 * _fetchType_ This option configures the approach used for fetching entities. It has the following values:
 * _FULL_: This fetches all the entities that are connected directly and indirectly to the starting entity. E.g. If a starting entity specified is a table, then this option will fetch the table, database and all the other tables within the database.
- * _CONNECTED_: This fetches all the etnties that are connected directly to the starting entity. E.g. If a starting entity specified is a table, then this option will fetch the table and the database entity only.
+ * _CONNECTED_: This fetches all the entities that are connected directly to the starting entity. E.g. If a starting entity specified is a table, then this option will fetch the table and the database entity only.
 * _INCREMENTAL_: See [here](#/IncrementalExport) for details.
@@ -104,7 +104,7 @@ The _AtlasExportRequest_ below specifies the _fetchType_ as _FULL_. The _matchTy
 }`}
 </SyntaxHighlighter>
 
-The _AtlasExportRequest_ below specifies the _guid_ instead of _uniqueAttribues_ to fetch _accounts@cl1_.
+The _AtlasExportRequest_ below specifies the _guid_ instead of _uniqueAttributes_ to fetch _accounts@cl1_.
 
 <SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
 {`{
@@ -117,7 +117,7 @@ The _AtlasExportRequest_ below specifies the _guid_ instead of _uniqueAttribues_
 }`}
 </SyntaxHighlighter>
 
-The _AtlasExportRequest_ below specifies the _fetchType_ as _connected_. The _matchType_ option will fetch _accountsReceivable_, _accountsPayable_, etc present in the database.
+The _AtlasExportRequest_ below specifies the _fetchType_ as _connected_. The _matchType_ option will fetch _accountsReceivable_, _accountsPayable_, etc. present in the database.
 
 <SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
 {`{

diff --git a/docs/src/documents/Import-Export/ImportAPIOptions.md b/docs/src/documents/Import-Export/ImportAPIOptions.md
index 1f6c8e39d..d05a680bb 100644
--- a/docs/src/documents/Import-Export/ImportAPIOptions.md
+++ b/docs/src/documents/Import-Export/ImportAPIOptions.md
@@ -103,7 +103,7 @@ Steps to use the behavior:
 
 The output of Export has _atlas-typedef.json_ that contains the type definitions for the entities exported.
 
-By default (that is if no options are specified), the type definitions are imported and applied to the system being imported to. The entity import is performed after this.
+By default, (that is if no options are specified), the type definitions are imported and applied to the system being imported to. The entity import is performed after this.
 
 In some cases, you would not want to modify the type definitions. The import may be better off failing than the types be modified.
@@ -152,7 +152,7 @@ _CURL_
 
 #### Handling Large Imports
 
-By default, the Import Service stores all of the data in memory. This may be limiting for ZIPs containing a large amount of data.
+By default, the Import Service stores all the data in memory. This may be limiting for ZIPs containing a large amount of data.
To configure the temporary directory use the application property _atlas.import.temp.directory_. If this property is left blank, the default in-memory implementation is used.

diff --git a/docs/src/documents/Import-Export/ImportEntityTransforms.md b/docs/src/documents/Import-Export/ImportEntityTransforms.md
index f1676b7c0..4b973e884 100644
--- a/docs/src/documents/Import-Export/ImportEntityTransforms.md
+++ b/docs/src/documents/Import-Export/ImportEntityTransforms.md
@@ -28,8 +28,8 @@ The existing transformation frameworks allowed this to happen.
 
 #### Reason for New Transformation Framework
 
-While the existing framework provided the basic benefits of the transformation framework, it did not have support for some of the commonly used Atlas types. Which meant that users of this framework would have to meticulously define transformations for every type they are working with. This can be tedious and potentially error-prone.
-The new framework addresses this problem by providing built-in transformations for some of the commonly used types. It can also be extended to accommodate new types.
+While the existing framework provided the basic benefits of the transformation framework, it did not have support for some commonly used Atlas types. Which meant that users of this framework would have to meticulously define transformations for every type they are working with. This can be tedious and potentially error-prone.
+The new framework addresses this problem by providing built-in transformations for some commonly used types. It can also be extended to accommodate new types.
#### Approach

diff --git a/docs/src/documents/Import-Export/ImportExportAPI.md b/docs/src/documents/Import-Export/ImportExportAPI.md
index c903c17a3..6a68ab1b8 100644
--- a/docs/src/documents/Import-Export/ImportExportAPI.md
+++ b/docs/src/documents/Import-Export/ImportExportAPI.md
@@ -27,7 +27,7 @@ The Import-Export APIs for Atlas facilitate the transfer of data to and from a c
 The APIs when integrated with backup and/or disaster recovery process will ensure participation of Atlas.
 
 ### Introduction
-There are 2 broad categories viz. Export & Import. The details of the APIs are as discussed below.
+There are 2 broad categories' viz. Export & Import. The details of the APIs are as discussed below.
 
 The APIs are available only to _admin_ user.

diff --git a/docs/src/documents/Migration/Migration-0.8-to-1.0.md b/docs/src/documents/Migration/Migration-0.8-to-1.0.md
index 7efd5752b..cbc2313b5 100644
--- a/docs/src/documents/Migration/Migration-0.8-to-1.0.md
+++ b/docs/src/documents/Migration/Migration-0.8-to-1.0.md
@@ -159,5 +159,5 @@ Apache Atlas 1.0 introduces number of new features. For data that is migrated, t
 
 #### Handling of Entity Definitions that use Classifications as Types
 
-This features is no longer supported. Classifications that are used as types in _attribute definitions_ (_AttributeDefs_) are converted in to new types whose name has _legacy_ prefix. These are then handled like any other type.
+This feature is no longer supported. Classifications that are used as types in _attribute definitions_ (_AttributeDefs_) are converted in to new types whose name has _legacy_ prefix. These are then handled like any other type.
 
 Creation of such types was prevented in an earlier release, hence only type definitions have potential to exist. Care has been taken to handle entities of this type as well.
diff --git a/docs/src/documents/Misc/AtlasServer.md b/docs/src/documents/Misc/AtlasServer.md
index ee33c3cce..aa4ba54a3 100644
--- a/docs/src/documents/Misc/AtlasServer.md
+++ b/docs/src/documents/Misc/AtlasServer.md
@@ -37,7 +37,7 @@ The _additionalInfo_ attribute property is discussed in detail below.
 
 #### Export/Import Audits
 
-The table has following columns:
+The table has the following columns:
 
 * _Operation_: EXPORT or IMPORT that denotes the operation performed on instance.
 * _Source Server_: For an export operation performed on this instance, the value in this column will always be the cluster name of the current Atlas instance. This is the value specified in _atlas-application.properties_ by the key _atlas.cluster.name_. If not value is specified 'default' is used.
@@ -67,7 +67,7 @@ The following export request will end up creating _AtlasServer_ entity with _clM
 
 Often times it is necessary to disambiguate the name of the cluster by specifying the location or the data center within which the Atlas instance resides.
 
-The name of the cluster can be specified by separating the location name and cluster name by '$'. For example, a clsuter name specified as 'SFO$cl1' can be a cluster in San Fancisco (SFO) data center with the name 'cl1'.
+The name of the cluster can be specified by separating the location name and cluster name by '$'. For example, a cluster name specified as 'SFO$cl1' can be a cluster in San Francisco (SFO) data center with the name 'cl1'.
 
 The _AtlasServer_ will handle this and set its name as 'cl1' and _fullName_ as 'SFO@cl1'.

diff --git a/docs/src/documents/Misc/SoftReference.md b/docs/src/documents/Misc/SoftReference.md
index 31ad20321..63c2b1dcc 100644
--- a/docs/src/documents/Misc/SoftReference.md
+++ b/docs/src/documents/Misc/SoftReference.md
@@ -13,11 +13,11 @@ import SyntaxHighlighter from 'react-syntax-highlighter';
 
 #### Background
 
-Entity attributes are specified using attribute definitions. An attributes persistence strategy is determined by based on their type.
+Entity attributes are specified using attribute definitions. An attributes' persistence strategy is determined by based on their type.
 
 Primitive types are persisted as properties within the vertex of their parent.
 
-Non-primitive attributes get a vertex of their own and and edge is created between the parent the child to establish ownership.
+Non-primitive attributes get a vertex of their own and edge is created between the parent the child to establish ownership.
 
 Attribute with _isSoftReference_ option set to _true_, is non-primitive attribute that gets treatment of a primitive attribute.

diff --git a/docs/src/documents/Notifications.md b/docs/src/documents/Notifications.md
index 34d5c1a4b..18734b67a 100644
--- a/docs/src/documents/Notifications.md
+++ b/docs/src/documents/Notifications.md
@@ -60,7 +60,7 @@ Notification includes the following data.
 </SyntaxHighlighter>
 
 Apache Atlas 1.0 can be configured to send notifications in older version format, instead of the latest version format.
-This can be helpful in deployments that are not yet ready to process notifications in latest version format.
+This can be helpful in deployments that are not yet ready to process notifications in the latest version format.
 To configure Apache Atlas 1.0 to send notifications in earlier version format, please set following configuration in atlas-application.properties:
 
 <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>

diff --git a/docs/src/documents/Overview.md b/docs/src/documents/Overview.md
index 3a82656ad..8614106ab 100644
--- a/docs/src/documents/Overview.md
+++ b/docs/src/documents/Overview.md
@@ -40,7 +40,7 @@ capabilities around these data assets for data scientists, analysts and the data
 * SQL like query language to search entities - Domain Specific Language (DSL)
 
 ### Security & Data Masking
- * Fine grained security for metadata access, enabling controls on access to entity instances and operations like add/update/remove classifications
+ * Fine-grained security for metadata access, enabling controls on access to entity instances and operations like add/update/remove classifications
 * Integration with Apache Ranger enables authorization/data-masking on data access based on classifications associated with entities in Apache Atlas. For example:
 * who can access data classified as PII, SENSITIVE
 * customer-service users can only see last 4 digits of columns classified as NATIONAL_ID

diff --git a/docs/src/documents/Project-Info/MailingLists.md b/docs/src/documents/Project-Info/MailingLists.md
index 884330363..a16fba339 100644
--- a/docs/src/documents/Project-Info/MailingLists.md
+++ b/docs/src/documents/Project-Info/MailingLists.md
@@ -8,7 +8,7 @@ submenu: Mailing Lists
 
 # Project Mailing Lists
 
-* These are the mailing lists that have been established for this project. For each list, there is a subscribe, unsubscribe, and an archive link.
+* These are the mailing lists that have been established for this project. For each list, there is a - subscribe, unsubscribe, and an archive link.
| **Name** | **Subscribe** | **Unsubscribe** | **Post** | **Archive** | diff --git a/docs/src/documents/Project-Info/TeamList.md b/docs/src/documents/Project-Info/TeamList.md index d260fa353..f8c662bd4 100644 --- a/docs/src/documents/Project-Info/TeamList.md +++ b/docs/src/documents/Project-Info/TeamList.md @@ -10,7 +10,7 @@ submenu: Team List import TeamList from 'theme/components/shared/TeamList' #### A successful project requires many people to play many roles. Some members write code or documentation, while others are valuable as testers, submitting patches and suggestions. -#### The team is comprised of Members and Contributors. Members have direct access to the source of a project and actively evolve the code-base. Contributors improve the project through submission of patches and suggestions to the Members. The number of Contributors to the project is unbounded. Get involved today. All contributions to the project are greatly appreciated. +#### The team comprises Members and Contributors. Members have direct access to the source of a project and actively evolve the code-base. Contributors improve the project through submission of patches and suggestions to the Members. The number of Contributors to the project is unbounded. Get involved today. All contributions to the project are greatly appreciated. ## Members diff --git a/docs/src/documents/Search/SearchAdvanced.md b/docs/src/documents/Search/SearchAdvanced.md index c22981e1b..9da14efd9 100644 --- a/docs/src/documents/Search/SearchAdvanced.md +++ b/docs/src/documents/Search/SearchAdvanced.md @@ -23,9 +23,9 @@ Benefits of DSL: * Use of classifications is accounted for in the syntax. * Provides way to group and aggregate results. -We will be using the quick start dataset in the examples that follow. This dataset is comprehensive enough to be used to to demonstrate the various features of the language. +We will be using the quick start dataset in the examples that follow. 
This dataset is comprehensive enough to be used to demonstrate the various features of the language. -For details on the grammar, please refer to Atlas DSL Grammer on [Github](https://github.com/apache/atlas/blob/master/repository/src/main/java/org/apache/atlas/query/antlr4/AtlasDSLParser.g4) (Antlr G4 format). +For details on the grammar, please refer to Atlas DSL Grammar on [GitHub](https://github.com/apache/atlas/blob/master/repository/src/main/java/org/apache/atlas/query/antlr4/AtlasDSLParser.g4) (Antlr G4 format). ## Using Advanced Search @@ -56,7 +56,7 @@ In the absence of _where_ for filtering on the source, the dataset fetched by th The _where_ clause allows for filtering over the dataset. This is achieved by using conditions within the where clause. -A conditions is identifier followed by an operator followed by a literal. Literal must be enclosed in single or double quotes. Example, _name = "Sales"_. An identifier can be name of the property of the type specified in the _from_ clause or an alias. +A condition is an identifier followed by an operator followed by a literal. A literal must be enclosed in single or double quotes; for example, _name = "Sales"_. An identifier can be the name of a property of the type specified in the _from_ clause or an alias. Example: To retrieve entity of type _Table_ with a specific name say time_dim: @@ -125,7 +125,7 @@ Dates in this format follow this notation: * _yyyy-MM-ddTHH:mm:ss.SSSZ_. Which means, year-month-day followed by time in hour-minutes-seconds-milli-seconds. Date and time need to be separated by 'T'. It should end with 'Z'. * _yyyy-MM-dd_. Which means, year-month-day. -Example: Date represents December 11, 2017 at 2:35 AM. +Example: Date represents December 11, 2017, at 2:35 AM. <SyntaxHighlighter wrapLines={true} language="sql" style={theme.dark}> {`2017-12-11T02:35:0.0Z`} @@ -140,7 +140,7 @@ Example: To retrieve entity of type _Table_ created within 2017 and 2018. 
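The DSL date notation above is an ISO-8601-style format. As an illustrative sketch (not part of the Atlas docs themselves, and using a made-up literal value), the canonical _yyyy-MM-ddTHH:mm:ss.SSSZ_ form can be checked with Python's standard library:

```python
from datetime import datetime

# A DSL-style date literal in the yyyy-MM-ddTHH:mm:ss.SSSZ notation.
literal = "2017-12-11T02:35:00.000Z"

# %f accepts the fractional-seconds part; the trailing 'Z' marks UTC.
parsed = datetime.strptime(literal, "%Y-%m-%dT%H:%M:%S.%fZ")
print(parsed.year, parsed.month, parsed.day)  # 2017 12 11
```

Note the doc's own example, `2017-12-11T02:35:0.0Z`, abbreviates the seconds and milliseconds fields; the fully padded form shown here is the unambiguous spelling of the same instant.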
#### Using Boolean Literals Properties of entities of type boolean can be used within queries. -Eample: To retrieve entity of type hdfs_path whose attribute _isFile_ is set to _true_ and whose name is _Invoice_. +Example: To retrieve entity of type hdfs_path whose attribute _isFile_ is set to _true_ and whose name is _Invoice_. <SyntaxHighlighter wrapLines={true} language="sql" style={theme.dark}> {`from hdfs_path where isFile = true or name = "Invoice"`} </SyntaxHighlighter> @@ -151,7 +151,7 @@ Valid values for boolean literals are 'true' and 'false'. ### Existence of a Property The has keyword can be used with or without the where clause. It is used to check existence of a property in an entity. -Example: To retreive entity of type Table with a property locationUri. +Example: To retrieve entity of type Table with a property locationUri. <SyntaxHighlighter wrapLines={true} language="html" style={theme.dark}> {`Table has locationUri @@ -240,7 +240,7 @@ Example: To retrieve all the entities that are tagged with _Dimension_ classific {`Dimension where Dimension.priority = "high"`} </SyntaxHighlighter> -###Non Primitive attribute Filtering +### Non-Primitive attribute Filtering In the discussion so far we looked at where clauses with primitive types. This section will look at using properties that are non-primitive types. #### Relationship-based filtering @@ -432,7 +432,7 @@ Example: To know the number of entities owned by each owner. </SyntaxHighlighter> ### Using System Attributes -Each type defined within Atlas gets few attributes by default. These attributes help with internal book keeping of the entities. All the system attributes are prefixed with '__' (double underscore). This helps in identifying them from other attributes. +Each type defined within Atlas gets a few attributes by default. These attributes help with internal bookkeeping of the entities. All the system attributes are prefixed with '__' (double underscore). This helps in distinguishing them from other attributes. 
Following are the system attributes: * __guid Each entity within Atlas is assigned a globally unique identifier (GUID for short). * __modifiedBy Name of the user who last modified the entity. @@ -518,4 +518,4 @@ The following clauses are no longer supported: ## Resources * Antlr [Book](https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference). * Antlr [Quick Start](https://github.com/antlr/antlr4/blob/master/doc/getting-started.md). - * Atlas DSL Grammar on [Github](https://github.com/apache/atlas/blob/master/repository/src/main/java/org/apache/atlas/query/antlr4/AtlasDSLParser.g4) (Antlr G4 format). + * Atlas DSL Grammar on [GitHub](https://github.com/apache/atlas/blob/master/repository/src/main/java/org/apache/atlas/query/antlr4/AtlasDSLParser.g4) (Antlr G4 format). diff --git a/docs/src/documents/Security/AtlasRangerAuthorizer.md b/docs/src/documents/Security/AtlasRangerAuthorizer.md index 9b019ec12..c5f1c512d 100644 --- a/docs/src/documents/Security/AtlasRangerAuthorizer.md +++ b/docs/src/documents/Security/AtlasRangerAuthorizer.md @@ -79,7 +79,7 @@ Following authorization policy allows user 'admin' to perform export/import admi ### Apache Ranger access audit for Apache Atlas authorizations Apache Ranger authorization plugin generates audit logs with details of the access authorized by the plugin. The details -include the object accessed (eg. hive_table with ID cost_savings.claim_savings@cl1), type of access performed (eg. +include the object accessed (e.g. hive_table with ID cost_savings.claim_savings@cl1), type of access performed (e.g. entity-add-classification, entity-remove-classification), name of the user, time of access and the IP address the access request came from - as shown in the following image. 
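Because system attributes carry the '__' prefix described above, client code can separate them from user-defined attributes with a simple filter. A minimal sketch, with a made-up attribute map (the values are illustrative, not real API output):

```python
# Hypothetical entity attribute map; '__'-prefixed keys are system attributes.
attributes = {
    "__guid": "9ba387dd-fa76-429c-b791-ffc338d3c91f",
    "__modifiedBy": "admin",
    "name": "customers",
    "owner": "hive",
}

system_attrs = {k: v for k, v in attributes.items() if k.startswith("__")}
user_attrs = {k: v for k, v in attributes.items() if not k.startswith("__")}

print(sorted(system_attrs))  # ['__guid', '__modifiedBy']
print(sorted(user_attrs))    # ['name', 'owner']
```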
diff --git a/docs/src/documents/Security/AtlasSimpleAuthorizer.md b/docs/src/documents/Security/AtlasSimpleAuthorizer.md index fc470d742..d09fa8f0c 100644 --- a/docs/src/documents/Security/AtlasSimpleAuthorizer.md +++ b/docs/src/documents/Security/AtlasSimpleAuthorizer.md @@ -129,7 +129,7 @@ Roles defined above can be assigned (granted) to users as shown below: </SyntaxHighlighter> -Roles can be assigned (granted) to user-groups as shown below. An user can belong to multiple groups; roles assigned to +Roles can be assigned (granted) to user-groups as shown below. A user can belong to multiple groups; roles assigned to all groups the user belongs to will be used to authorize the access. <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}> diff --git a/docs/src/documents/Security/Authentication.md b/docs/src/documents/Security/Authentication.md index 3bdb5f7e8..9252f5f59 100644 --- a/docs/src/documents/Security/Authentication.md +++ b/docs/src/documents/Security/Authentication.md @@ -112,7 +112,7 @@ atlas.authentication.method.ldap.ad.user.searchfilter=(sAMAccountName={0}) atlas.authentication.method.ldap.ad.default.role=ROLE_USER`} </SyntaxHighlighter> -### LDAP Directroy +### LDAP Directory <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}> {`atlas.authentication.method.ldap.url=ldap://<Ldap server ip>:389 @@ -130,8 +130,8 @@ atlas.authentication.method.ldap.default.role=ROLE_USER`} ### Keycloak Method. -To enable Keycloak authentication mode in Atlas, set the property `atlas.authentication.method.keycloak` to true and also set the property `atlas.authentication.method.keycloak.file` to the localtion of your `keycloak.json` in `atlas-application.properties`. -Also set `atlas.authentication.method.keycloak.ugi-groups` to false if you want to pickup groups from Keycloak. By default the groups will be picked up from the *roles* defined in Keycloak. 
In case you want to use the groups +To enable Keycloak authentication mode in Atlas, set the property `atlas.authentication.method.keycloak` to true and also set the property `atlas.authentication.method.keycloak.file` to the location of your `keycloak.json` in `atlas-application.properties`. +Also set `atlas.authentication.method.keycloak.ugi-groups` to false if you want to pick up groups from Keycloak. By default, the groups will be picked up from the *roles* defined in Keycloak. In case you want to use the groups, you need to create a mapping in Keycloak and define `atlas.authentication.method.keycloak.groups_claim` equal to the token claim name. Make sure **not** to use the full group path and add the information to the access token. <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}> @@ -140,7 +140,7 @@ atlas.authentication.method.keycloak.file=/opt/atlas/conf/keycloak.json atlas.authentication.method.keycloak.ugi-groups=false`} </SyntaxHighlighter> -Setup you keycloak.json per instructions from Keycloak. Make sure to include `"principal-attribute": "preferred_username"` to ensure readable user names and `"autodetect-bearer-only": true`. +Set up your keycloak.json per instructions from Keycloak. Make sure to include `"principal-attribute": "preferred_username"` to ensure readable usernames and `"autodetect-bearer-only": true`. <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}> {`{ diff --git a/docs/src/documents/Security/Security.md b/docs/src/documents/Security/Security.md index af7d80e51..fcadc0468 100644 --- a/docs/src/documents/Security/Security.md +++ b/docs/src/documents/Security/Security.md @@ -27,8 +27,8 @@ Both SSL one-way (server authentication) and two-way (server and client authenti * `keystore.file` - the path to the keystore file leveraged by the server. This file contains the server certificate. * `truststore.file` - the path to the truststore file. 
This file contains the certificates of other trusted entities (e.g. the certificates for client processes if two-way SSL is enabled). In most instances this can be set to the same value as the keystore.file property (especially if one-way SSL is enabled). * `client.auth.enabled` (false|true) [default: false] - enable/disable client authentication. If enabled, the client will have to authenticate to the server during the transport session key creation process (i.e. two-way SSL is in effect). - * `cert.stores.credential.provider.path` - the path to the Credential Provider store file. The passwords for the keystore, truststore, and server certificate are maintained in this secure file. Utilize the cputil script in the 'bin' directoy (see below) to populate this file with the passwords required. - * `atlas.ssl.exclude.cipher.suites` - the excluded Cipher Suites list - *NULL.*,.*RC4.*,.*MD5.*,.*DES.*,.*DSS.* are weak and unsafe Cipher Suites that are excluded by default. If additional Ciphers need to be excluded, set this property with the default Cipher Suites such as atlas.ssl.exclude.cipher.suites=.*NULL.*, .*RC4.*, .*MD5.*, .*DES.*, .*DSS.*, and add the additional Ciper Suites to the list with a comma separator. They can be added with their full name or a regular expression [...] + * `cert.stores.credential.provider.path` - the path to the Credential Provider store file. The passwords for the keystore, truststore, and server certificate are maintained in this secure file. Utilize the cputil script in the 'bin' directory (see below) to populate this file with the passwords required. + * `atlas.ssl.exclude.cipher.suites` - the excluded Cipher Suites list - *NULL.*,.*RC4.*,.*MD5.*,.*DES.*,.*DSS.* are weak and unsafe Cipher Suites that are excluded by default. 
If additional Ciphers need to be excluded, set this property with the default Cipher Suites such as atlas.ssl.exclude.cipher.suites=.*NULL.*, .*RC4.*, .*MD5.*, .*DES.*, .*DSS.*, and add the additional Cipher Suites to the list with a comma separator. They can be added with their full name or a regular expressio [...] #### Credential Provider Utility Script @@ -58,7 +58,7 @@ The properties for configuring service authentication are: ### JAAS configuration -In a secure cluster, some of the components (such as Kafka) that Atlas interacts with, require Atlas to authenticate itself to them using JAAS. The following properties are used to set up appropriate JAAS Configuration. +In a secure cluster, some components (such as Kafka) that Atlas interacts with require Atlas to authenticate itself to them using JAAS. The following properties are used to set up an appropriate JAAS Configuration. * `atlas.jaas.client-id.loginModuleName` - the authentication method used by the component (for example, com.sun.security.auth.module.Krb5LoginModule) * `atlas.jaas.client-id.loginModuleControlFlag` (required|requisite|sufficient|optional) [default: required] @@ -126,9 +126,9 @@ MyClient { ## SPNEGO-based HTTP Authentication -HTTP access to the Atlas platform can be secured by enabling the platform's SPNEGO support. There are currently two supported authentication mechanisms: +HTTP access to the Atlas platform can be secured by enabling the platform's SPNEGO support. There are currently two supported authentication mechanisms: - * `simple` - authentication is performed via a provided user name + * `simple` - authentication is performed via a provided username * `kerberos` - the KDC authenticated identity of the client is leveraged to authenticate to the server The kerberos support requires the client accessing the server to first authenticate to the KDC (usually this is done via the 'kinit' command). 
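The entries in `atlas.ssl.exclude.cipher.suites` above are regular expressions matched against cipher-suite names. A quick sketch of how the default patterns behave (the cipher names here are standard JSSE identifiers used purely for illustration):

```python
import re

# Default exclusion patterns from atlas.ssl.exclude.cipher.suites.
excluded_patterns = [r".*NULL.*", r".*RC4.*", r".*MD5.*", r".*DES.*", r".*DSS.*"]

def is_excluded(cipher_suite: str) -> bool:
    """Return True if any exclusion pattern matches the full suite name."""
    return any(re.fullmatch(p, cipher_suite) for p in excluded_patterns)

print(is_excluded("SSL_RSA_WITH_RC4_128_MD5"))  # True  (matches .*RC4.* and .*MD5.*)
print(is_excluded("TLS_AES_256_GCM_SHA384"))    # False (no weak algorithm in the name)
```

This is why additional suites can be listed either by full name or by a regular expression fragment: both are just patterns in the same comma-separated list.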
Once authenticated, the user may access the server (the authenticated identity will be related to the server via the SPNEGO negotiation mechanism). diff --git a/docs/src/documents/Setup/BuildInstruction.md b/docs/src/documents/Setup/BuildInstruction.md index b067676cd..5d8605482 100644 --- a/docs/src/documents/Setup/BuildInstruction.md +++ b/docs/src/documents/Setup/BuildInstruction.md @@ -13,7 +13,7 @@ import SyntaxHighlighter from 'react-syntax-highlighter'; ### Building Apache Atlas Download Apache Atlas 1.0.0 release sources, apache-atlas-1.0.0-sources.tar.gz, from the [downloads](#/Downloads) page. -Then follow the instructions below to to build Apache Atlas. +Then follow the instructions below to build Apache Atlas. @@ -34,7 +34,7 @@ mvn clean -DskipTests package -Pdist * NOTES: * Remove option '-DskipTests' to run unit and integration tests - * To build a distribution without minified js,css file, build with _skipMinify_ profile. By default js and css files are minified. + * To build a distribution without minified js/css files, build with _skipMinify_ profile. By default, js and css files are minified. Above will build Apache Atlas for an environment having functional HBase and Solr instances. Apache Atlas needs to be setup with the following to run in this environment: diff --git a/docs/src/documents/Setup/Configuration.md b/docs/src/documents/Setup/Configuration.md index b72a0cc44..e7f5bc989 100644 --- a/docs/src/documents/Setup/Configuration.md +++ b/docs/src/documents/Setup/Configuration.md @@ -55,7 +55,7 @@ Elasticsearch is a prerequisite for Apache Atlas use. 
Set the following properti <SyntaxHighlighter wrapLines={true} language="bash" style={theme.dark}> {`atlas.graph.index.search.backend=elasticsearch -atlas.graph.index.search.hostname=<hostname(s) of the Elasticsearch master nodes comma separated> +atlas.graph.index.search.hostname=<hostname(s) of the Elasticsearch master nodes, comma separated> atlas.graph.index.search.elasticsearch.client-only=true`} </SyntaxHighlighter> @@ -131,7 +131,7 @@ atlas.server.ha.zookeeper.connect=zk1.company.com:2181,zk2.company.com:2181,zk3. atlas.server.ha.zookeeper.num.retries=3 # Specify how much time should the server wait before attempting connections to Zookeeper, in case of any connection issues. atlas.server.ha.zookeeper.retry.sleeptime.ms=1000 -# Specify how long a session to Zookeeper should last without inactiviy to be deemed as unreachable. +# Specify how long a session to Zookeeper should last without inactivity to be deemed as unreachable. atlas.server.ha.zookeeper.session.timeout.ms=20000 # Specify the scheme and the identity to be used for setting up ACLs on nodes created in Zookeeper for HA. # The format of these options is <scheme:identity>. diff --git a/docs/src/documents/Setup/EclipseSetup.md b/docs/src/documents/Setup/EclipseSetup.md index 1092e8319..cfb044a40 100644 --- a/docs/src/documents/Setup/EclipseSetup.md +++ b/docs/src/documents/Setup/EclipseSetup.md @@ -43,8 +43,8 @@ Atlas command line tools are written in Python. * Install the Scala IDE, TestNG, and m2eclipse-scala features/plugins as described below. **Scala IDE Eclipse feature** -Some of the Atlas source code is written in the Scala programming language. The Scala IDE feature is required to compile Scala source code in Eclipse. - * In Eclipse, choose Help - Install New Software.. +Some Atlas source code is written in the Scala programming language. The Scala IDE feature is required to compile Scala source code in Eclipse. + * In Eclipse, choose Help - Install New Software... * Click Add... 
to add an update site, and set Location to http://download.scala-ide.org/sdk/lithium/e44/scala211/stable/site * Select Scala IDE for Eclipse from the list of available features * Restart Eclipse after install @@ -52,13 +52,13 @@ Some of the Atlas source code is written in the Scala programming language. The *TestNG Eclipse plug-in* Atlas tests use the [TestNG framework](http://testng.org/doc/documentation-main.html), which is similar to JUnit. The TestNG plug-in is required to run TestNG tests from Eclipse. - * In Eclipse, choose Help - Install New Software.. + * In Eclipse, choose Help - Install New Software... * Click Add... to add an update site, and set Location to http://beust.com/eclipse-old/eclipse_6.9.9.201510270734 * Choose TestNG and continue with install * Restart Eclipse after installing the plugin * In Window - Preferences - TestNG, <b>un</b>check "Use project TestNG jar" *m2eclipse-scala Eclipse plugin* - * In Eclipse, choose Help - Install New Software.. + * In Eclipse, choose Help - Install New Software... * Click Add... to add an update site, and set Location to http://alchim31.free.fr/m2e-scala/update-site/ * Choose Maven Integration for Scala IDE, and continue with install * Restart Eclipse after install @@ -95,7 +95,7 @@ g. Restart Eclipse h. Choose Project - Clean, select Clean all projects, and click OK. -Some projects may not pick up the Scala library – if this occurs, quick fix on those projects to add in the Scala library – projects atlas-typesystem, atlas-repository, hdfs-model, storm-bridge and altas-webapp. +Some projects may not pick up the Scala library – if this occurs, quick fix on those projects to add in the Scala library – projects atlas-typesystem, atlas-repository, hdfs-model, storm-bridge and atlas-webapp. You should now have a clean workspace. 
diff --git a/docs/src/documents/Setup/InstallationInstruction.md b/docs/src/documents/Setup/InstallationInstruction.md index 100eaf51e..de432e034 100644 --- a/docs/src/documents/Setup/InstallationInstruction.md +++ b/docs/src/documents/Setup/InstallationInstruction.md @@ -68,7 +68,7 @@ To stop Apache Atlas, run following command: ### Configuring Apache Atlas -By default config directory used by Apache Atlas is _{package dir}/conf_. To override this set environment variable ATLAS_CONF to the path of the conf dir. +By default, the config directory used by Apache Atlas is _{package dir}/conf_. To override this, set the environment variable ATLAS_CONF to the path of the conf dir. Environment variables needed to run Apache Atlas can be set in _atlas-env.sh_ file in the conf directory. This file will be sourced by Apache Atlas scripts before any commands are executed. The following environment variables are available to set. @@ -82,25 +82,25 @@ Environment variables needed to run Apache Atlas can be set in _atlas-env.sh_ fi # any additional java opts that you want to set for client only #export ATLAS_CLIENT_OPTS= -# java heap size we want to set for the client. Default is 1024MB +# java heap size we want to set for the client. Default is 1024 MB #export ATLAS_CLIENT_HEAP= # any additional opts you want to set for atlas service. #export ATLAS_SERVER_OPTS= -# java heap size we want to set for the atlas server. Default is 1024MB +# java heap size we want to set for the atlas server. Default is 1024 MB #export ATLAS_SERVER_HEAP= -# What is is considered as atlas home dir. Default is the base location of the installed software +# What is considered as atlas home dir. Default is the base location of the installed software #export ATLAS_HOME_DIR= -# Where log files are stored. Defatult is logs directory under the base install location +# Where log files are stored. Default is logs directory under the base install location #export ATLAS_LOG_DIR= -# Where pid files are stored. 
Defatult is logs directory under the base install location +# Where pid files are stored. Default is logs directory under the base install location #export ATLAS_PID_DIR= -# Where do you want to expand the war file. By Default it is in /server/webapp dir under the base install dir. +# Where do you want to expand the war file. By default, it is in /server/webapp dir under the base install dir. #export ATLAS_EXPANDED_WEBAPP_DIR=`} </SyntaxHighlighter> @@ -122,8 +122,8 @@ The following values are recommended for JDK 8: export ATLAS_SERVER_HEAP="-Xms15360m -Xmx15360m -XX:MaxNewSize=5120m -XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=512m" </SyntaxHighlighter> -*NOTE for Mac OS users* -If you are using a Mac OS, you will need to configure the ATLAS_SERVER_OPTS (explained above). +*NOTE for macOS users* +If you are using macOS, you will need to configure the ATLAS_SERVER_OPTS (explained above). In _{package dir}/conf/atlas-env.sh_ uncomment the following line <SyntaxHighlighter wrapLines={true} language="powershell" style={theme.dark}> @@ -169,7 +169,7 @@ SolrCloud mode uses a ZooKeeper Service as a highly available, central location </SyntaxHighlighter> - * Run the following commands from SOLR_BIN (e.g. $SOLR_HOME/bin) directory to create collections in Apache Solr corresponding to the indexes that Apache Atlas uses. In the case that the Apache Atlas and Apache Solr instances are on 2 different hosts, first copy the required configuration files from ATLAS_HOME/conf/solr on the Apache Atlas instance host to Apache Solr instance host. SOLR_CONF in the below mentioned commands refer to the directory where Apache Solr configuration files h [...] + * Run the following commands from SOLR_BIN (e.g. $SOLR_HOME/bin) directory to create collections in Apache Solr corresponding to the indexes that Apache Atlas uses. 
In the case that the Apache Atlas and Apache Solr instances are on 2 different hosts, first copy the required configuration files from ATLAS_HOME/conf/solr on the Apache Atlas instance host to Apache Solr instance host. SOLR_CONF in the below-mentioned commands refers to the directory where Apache Solr configuration files h [...] <SyntaxHighlighter wrapLines={true} language="powershell" style={theme.dark}> {`$SOLR_BIN/solr create -c vertex_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor @@ -178,7 +178,7 @@ $SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards #numShards -replica </SyntaxHighlighter> Note: If numShards and replicationFactor are not specified, they default to 1 which suffices if you are trying out solr with ATLAS on a single node instance. - Otherwise specify numShards according to the number of hosts that are in the Solr cluster and the maxShardsPerNode configuration. + Otherwise, specify numShards according to the number of hosts that are in the Solr cluster and the maxShardsPerNode configuration. The number of shards cannot exceed the total number of Solr nodes in your SolrCloud cluster. The number of replicas (replicationFactor) can be set according to the redundancy required. @@ -200,7 +200,7 @@ For more information on JanusGraph solr configuration , please refer http://docs Pre-requisites for running Apache Solr in cloud mode * Memory - Apache Solr is both memory and CPU intensive. Make sure the server running Apache Solr has adequate memory, CPU and disk. - Apache Solr works well with 32GB RAM. Plan to provide as much memory as possible to Apache Solr process + Apache Solr works well with 32 GB RAM. Plan to provide as much memory as possible to the Apache Solr process * Disk - If the number of entities that need to be stored are large, plan to have at least 500 GB free space in the volume where Apache Solr is going to store the index data * SolrCloud has support for replication and sharding. 
It is highly recommended to use SolrCloud with at least two Apache Solr nodes running on different servers with replication enabled. If using SolrCloud, then you also need ZooKeeper installed and configured with 3 or 5 ZooKeeper nodes @@ -208,7 +208,7 @@ Pre-requisites for running Apache Solr in cloud mode * Start Apache Solr in http mode - alternative setup to Solr in cloud mode. Solr Standalone is used for a single instance, and it keeps configuration information on the file system. It does not require zookeeper and provides high performance for medium size index. - Can be consider as a good option for fast prototyping as well as valid configuration for development environments. In some cases it demonstrates a better performance than solr cloud mode in production grade setup of Atlas. + Can be considered a good option for fast prototyping as well as a valid configuration for development environments. In some cases, it demonstrates better performance than Solr cloud mode in a production-grade setup of Atlas. * Change ATLAS configuration to point to Standalone Apache Solr instance setup. Please make sure the following configurations are set to the below values in ATLAS_HOME/conf/atlas-application.properties diff --git a/docs/src/documents/Tools/AtlasRepairIndex.md b/docs/src/documents/Tools/AtlasRepairIndex.md index 0161cb1cb..801f6c59a 100644 --- a/docs/src/documents/Tools/AtlasRepairIndex.md +++ b/docs/src/documents/Tools/AtlasRepairIndex.md @@ -34,7 +34,7 @@ atlas-index-repair/repair_index.py This will result in vertex_index, edge_index and fulltext_index to be re-built completely. It is recommended that existing contents of these indexes be deleted before executing this restore. ###### Caveats -Note that the full index repair is a time consuming process. Depending on the size of data the process may take days to complete. During the restore process the Basic Search functionality will not be available. Be sure to allocate sufficient time for this activity. 
+Note that the full index repair is a time-consuming process. Depending on the size of data, the process may take days to complete. During the restore process, the Basic Search functionality will not be available. Be sure to allocate sufficient time for this activity. ##### Selective Restore diff --git a/docs/src/documents/TypeSystem.md b/docs/src/documents/TypeSystem.md index 7d6125591..7cfbc70d1 100644 --- a/docs/src/documents/TypeSystem.md +++ b/docs/src/documents/TypeSystem.md @@ -20,7 +20,7 @@ Atlas out of the box (like Hive tables, for e.g.) are modelled using types and r types of metadata in Atlas, one needs to understand the concepts of the type system component. ## Types -A Type in Atlas is a definition of how a particular type of metadata objects are stored and accessed. A type represents one or a collection of attributes that define the properties for the metadata object. Users with a development background will recognize the similarity of a type to a ‘Class’ definition of object oriented programming languages, or a ‘table schema’ of relational databases. +A Type in Atlas is a definition of how a particular type of metadata objects are stored and accessed. A type represents one or a collection of attributes that define the properties for the metadata object. Users with a development background will recognize the similarity of a type to a ‘Class’ definition of object-oriented programming languages, or a ‘table schema’ of relational databases. An example of a type that comes natively defined with Atlas is a Hive table. A Hive table is defined with these attributes: @@ -56,14 +56,14 @@ The following points can be noted from the above example: * Enum metatypes * Collection metatypes: array, map * Composite metatypes: Entity, Struct, Classification, Relationship - * Entity & Classification types can ‘extend’ from other types, called ‘supertype’ - by virtue of this, it will get to include the attributes that are defined in the supertype as well. 
This allows modellers to define common attributes across a set of related types etc. This is again similar to the concept of how Object Oriented languages define super classes for a class. It is also possible for a type in Atlas to extend from multiple super types. + * Entity & Classification types can ‘extend’ from other types, called ‘supertype’ - by virtue of this, it will get to include the attributes that are defined in the supertype as well. This allows modellers to define common attributes across a set of related types etc. This is again similar to the concept of how Object-Oriented languages define super classes for a class. It is also possible for a type in Atlas to extend from multiple super types. * In this example, every hive table extends from a pre-defined supertype called a ‘DataSet’. More details about this pre-defined types will be provided later. * Types which have a metatype of ‘Entity’, ‘Struct’, ‘Classification’ or 'Relationship' can have a collection of attributes. Each attribute has a name (e.g. ‘name’) and some other associated properties. A property can be referred to using an expression type_name.attribute_name. It is also good to note that attributes themselves are defined using Atlas metatypes. * In this example, hive_table.name is a String, hive_table.aliases is an array of Strings, hive_table.db refers to an instance of a type called hive_db and so on. * Type references in attributes, (like hive_table.db) are particularly interesting. Note that using such an attribute, we can define arbitrary relationships between two types defined in Atlas and thus build rich models. Note that one can also collect a list of references as an attribute type (e.g. hive_table.columns which represents a list of references from hive_table to hive_column type) ## Entities -An ‘entity’ in Atlas is a specific value or instance of an Entity ‘type’ and thus represents a specific metadata object in the real world. 
Referring back to our analogy of Object Oriented Programming languages, an ‘instance’ is an‘Object’ of a certain ‘Class’. +An ‘entity’ in Atlas is a specific value or instance of an Entity ‘type’ and thus represents a specific metadata object in the real world. Referring back to our analogy of Object-Oriented Programming languages, an ‘instance’ is an ‘Object’ of a certain ‘Class’. An example of an entity will be a specific Hive Table. Say Hive has a table called ‘customers’ in the ‘default’ database. This table will be an ‘entity’ in Atlas of type hive_table. By virtue of being an instance of an entity type, it will have values for every attribute that are a part of the Hive table ‘type’, such as: @@ -103,7 +103,7 @@ values: The following points can be noted from the example above: - * Every instance ofan entity type is identified by a unique identifier, a GUID. This GUID is generated by the Atlas server when the object is defined, and remains constant for the entire lifetime of the entity. At any point in time, this particular entity can be accessed using its GUID. + * Every instance of an entity type is identified by a unique identifier, a GUID. This GUID is generated by the Atlas server when the object is defined, and remains constant for the entire lifetime of the entity. At any point in time, this particular entity can be accessed using its GUID. * In this example, the ‘customers’ table in the default database is uniquely identified by the GUID "9ba387dd-fa76-429c-b791-ffc338d3c91f" * An entity is of a given type, and the name of the type is provided with the entity definition. * In this example, the ‘customers’ table is a ‘hive_table. @@ -114,7 +114,7 @@ With this idea on entities, we can now see the difference between Entity and Str ## Attributes We already saw that attributes are defined inside metatypes like Entity, Struct, Classification and Relationship. But we -implistically referred to attributes as having a name and a metatype value. 
However, attributes in Atlas have some more +implicitly referred to attributes as having a name and a metatype value. However, attributes in Atlas have some more properties that define more concepts related to the type system. An attribute has the following properties: @@ -133,13 +133,13 @@ The properties above have the following meanings: * name - the name of the attribute * dataTypeName - the metatype name of the attribute (native, collection or composite) * isComposite - - * This flag indicates an aspect of modelling. If an attribute is defined as composite, it means that it cannot have a lifecycle independent of the entity it is contained in. A good example of this concept is the set of columns that make a part of a hive table. Since the columns do not have meaning outside of the hive table, they are defined as composite attributes. + * This flag indicates an aspect of modelling. If an attribute is defined as composite, it means that it cannot have a lifecycle independent of the entity it is contained in. A good example of this concept is the set of columns that make a part of a hive table. Since the columns do not have meaning outside the hive table, they are defined as composite attributes. * A composite attribute must be created in Atlas along with the entity it is contained in. i.e. A hive column must be created along with the hive table. * isIndexable - - * This flag indicates whether this property should be indexed on, so that look ups can be performed using the attribute value as a predicate and can be performed efficiently. + * This flag indicates whether this property should be indexed on, so that look-ups can be performed using the attribute value as a predicate and can be performed efficiently. * isUnique - - * This flag is again related to indexing. If specified to be unique, it means that a special index is created for this attribute in JanusGraph that allows for equality based look ups. 
- * Any attribute with a true value for this flag is treated like a primary key to distinguish this entity from other entities. Hence care should be taken ensure that this attribute does model a unique property in real world. + * This flag is again related to indexing. If specified to be unique, it means that a special index is created for this attribute in JanusGraph that allows for equality-based look-ups. + * Any attribute with a true value for this flag is treated like a primary key to distinguish this entity from other entities. Hence, care should be taken to ensure that this attribute does model a unique property in the real world. * For example, consider the name attribute of a hive_table. In isolation, a name is not a unique attribute for a hive_table, because tables with the same name can exist in multiple databases. Even a pair of (database name, table name) is not unique if Atlas is storing metadata of hive tables amongst multiple clusters. Only the combination of cluster location, database name and table name can be deemed unique in the physical world. * multiplicity - indicates whether this attribute is required, optional, or could be multi-valued. If an entity’s definition of the attribute value does not match the multiplicity declaration in the type definition, this would be a constraint violation and the entity addition will fail. This field can therefore be used to define some constraints on the metadata information. @@ -174,7 +174,7 @@ Note the “isOptional=true” constraint - a table entity cannot be created wit always be bound to the table entity they are defined with. From this description and examples, you will be able to realize that attribute definitions can be used to influence -specific modelling behavior (constraints, indexing, etc) to be enforced by the Atlas system. +specific modelling behavior (constraints, indexing, etc.) to be enforced by the Atlas system. ## System specific types and their significance Atlas comes with a few pre-defined system types.
We saw one example (DataSet) in preceding sections. In this @@ -193,14 +193,14 @@ make convention based assumptions about what attributes they can expect of types **Infrastructure**: This type extends Asset and typically can be used as a common super type for infrastructural metadata objects like clusters, hosts etc. -**DataSet**: This type extends Referenceable. Conceptually, it can be used to represent an type that stores data. In Atlas, -hive tables, hbase_tables etc are all types that extend from DataSet. Types that extend DataSet can be expected to have +**DataSet**: This type extends Referenceable. Conceptually, it can be used to represent a type that stores data. In Atlas, +hive tables, hbase_tables etc. are all types that extend from DataSet. Types that extend DataSet can be expected to have a Schema in the sense that they would have an attribute that defines attributes of that dataset. For example, the columns -attribute in a hive_table. Also entities of types that extend DataSet participate in data transformation and this +attribute in a hive_table. Also, entities of types that extend DataSet participate in data transformation and this transformation can be captured by Atlas via lineage (or provenance) graphs. **Process**: This type extends Asset. Conceptually, it can be used to represent any data transformation operation. For example, an ETL process that transforms a hive table with raw data to another hive table that stores some aggregate can be a specific type that extends the Process type. A Process type has two specific attributes, inputs and outputs. Both -inputs and outputs are arrays of DataSet entities. Thus an instance of a Process type can use these inputs and outputs +inputs and outputs are arrays of DataSet entities. Thus, an instance of a Process type can use these inputs and outputs to capture how the lineage of a DataSet evolves.
diff --git a/docs/src/documents/Whats-New/WhatsNew-2.0.md b/docs/src/documents/Whats-New/WhatsNew-2.0.md index 724e8ab31..54c499295 100644 --- a/docs/src/documents/Whats-New/WhatsNew-2.0.md +++ b/docs/src/documents/Whats-New/WhatsNew-2.0.md @@ -25,7 +25,7 @@ submenu: Whats New * Notification processing to support batch-commits * New option in notification processing to ignore potentially incorrect hive_column_lineage * Updated Hive hook to avoid duplicate column-lineage entities; also updated Atlas server to skip duplicate column-lineage entities - * Improved batch processing in notificaiton handler to avoid processing of an entity multiple times + * Improved batch processing in notification handler to avoid processing of an entity multiple times * Add option to ignore/prune metadata for temporary/staging hive tables * Avoid unnecessary lookup when creating new relationships * UI Improvements: diff --git a/docs/src/documents/Whats-New/WhatsNew-2.1.md b/docs/src/documents/Whats-New/WhatsNew-2.1.md index 82079f9f5..17d4b8327 100644 --- a/docs/src/documents/Whats-New/WhatsNew-2.1.md +++ b/docs/src/documents/Whats-New/WhatsNew-2.1.md @@ -17,9 +17,9 @@ submenu: Whats New ## Enhancements * **Search**: ability to find entities by more than one classification * **Performance**: improvements in lineage retrieval and classification-propagation -* **Notification**: ability to process notificaitons from multiple Kafka topics +* **Notification**: ability to process notifications from multiple Kafka topics * **Hive Hook**: tracks process-executions via hive_process_execution entities -* **Hive Hook**: catures DDL operations via hive_db_ddl and hive_table_ddl entities +* **Hive Hook**: captures DDL operations via hive_db_ddl and hive_table_ddl entities * **Notification**: introduced shell entities to record references to non-existing entities in notifications * **Spark**: added model to capture Spark entities, processes and relationships * **AWS S3**: introduced updated model to 
capture AWS S3 entities and relationships
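
The uniqueness discussion in the patched Attributes section (a table name alone is not unique; only cluster location plus database name plus table name is) can be sketched as follows. This is an illustrative helper, not part of the patch; the `db.table@cluster` layout mirrors the qualifiedName convention commonly used by Atlas's Hive model, and the function name is hypothetical:

```python
# Illustrative sketch only: shows why (cluster, database, table) forms a
# unique key for a hive_table, as discussed in the isUnique section above.
# The "db.table@cluster" format follows the usual Atlas Hive-model
# qualifiedName convention; the helper itself is a hypothetical example.

def hive_table_qualified_name(cluster: str, database: str, table: str) -> str:
    """Build a globally unique identifier for a hive_table entity."""
    return f"{database}.{table}@{cluster}"

# Two tables both named 'customers' remain distinct across databases:
a = hive_table_qualified_name("prod", "default", "customers")
b = hive_table_qualified_name("prod", "sales", "customers")
assert a != b
print(a)  # default.customers@prod
```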