Re: Review Request 72156: ATLAS-3618 Entities with no guid appears in search result

2020-03-05 Thread Pinal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72156/
---

(Updated March 5, 2020, 11:35 a.m.)


Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
Subramanian.


Changes
---

addressed typeNamePredicate null check


Bugs: ATLAS-3618
https://issues.apache.org/jira/browse/ATLAS-3618


Repository: atlas


Description
---

1) Entities of struct types appear when ALL_ENTITY_TYPES is selected
2) Entities of internal types such as AtlasGlossary appear when 
ALL_ENTITY_TYPES is selected


Diffs (updated)
-

  
  repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java 6ab0afbf9 
  repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 8f531876b 
  repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java bb1e9f633 


Diff: https://reviews.apache.org/r/72156/diff/3/

Changes: https://reviews.apache.org/r/72156/diff/2-3/


Testing
---

1) typeName: ALL_ENTITY_TYPES returns all entities, with no struct types (only 
entities whose guid is not null) and no internal types (only entities whose 
supertype is not _internal)
2) typeName: ALL_ENTITY_TYPES, filter: guid isnull, returns no result
3) typeName: ALL_ENTITY_TYPES, filter: typeName begins_with Atlas, returns no 
result

Use case:
-> Added term1 to a Glossary
-> Added classification1 to term1
1) In the search panel (showing all entities) -> term1 shouldn't appear
2) In the classification panel, showing all entities associated with 
classification1 -> term1 shouldn't appear
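
The expected behaviour above can be illustrated with a minimal Java sketch. It is not the code in the patch: SearchHit and AllEntityTypesFilter are hypothetical stand-ins, and the supertype name is taken from the testing notes as written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a search hit; this is not an Atlas class.
final class SearchHit {
    final String      guid;       // null for struct instances, which have no guid
    final String      typeName;
    final Set<String> superTypes;

    SearchHit(String guid, String typeName, String... superTypes) {
        this.guid       = guid;
        this.typeName   = typeName;
        this.superTypes = new HashSet<>(Arrays.asList(superTypes));
    }
}

final class AllEntityTypesFilter {
    // Supertype name as written in the testing notes above (assumption).
    static final String INTERNAL_SUPER_TYPE = "_internal";

    // A hit is kept only if it has a guid and is not an internal type.
    static boolean isVisible(SearchHit hit) {
        return hit.guid != null && !hit.superTypes.contains(INTERNAL_SUPER_TYPE);
    }

    static List<SearchHit> filter(List<SearchHit> hits) {
        List<SearchHit> ret = new ArrayList<>();

        for (SearchHit hit : hits) {
            if (isVisible(hit)) {
                ret.add(hit);
            }
        }

        return ret;
    }
}
```

With this sketch, a glossary term such as term1 (internal supertype) and any struct instance (no guid) are dropped, while regular entities pass through.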


Thanks,

Pinal Shah



[jira] [Created] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)
Damian Warszawski created ATLAS-3654:


 Summary: Support solr in standalone (http) mode
 Key: ATLAS-3654
 URL: https://issues.apache.org/jira/browse/ATLAS-3654
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Affects Versions: 3.0.0
Reporter: Damian Warszawski


*Problem description*

Atlas does not support running Solr in standalone (http) mode.

*Goals*

 This is especially useful for testing, where setup should be as simple as 
possible and should not require Zookeeper. It also enables full integration 
with JanusGraph, which supports both modes of running Solr, `cloud` and `http` 
[https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is 
that it decouples hbase and solr in embedded mode, so that solr can be run in 
embedded mode with an external hbase.

*Proposed solution*
 * call solr V1 API  while creating/updating request handlers in standalone solr
 * update atlas start script to enable standalone embedded solr

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: ATLAS-3654.patch

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  This is especially useful for testing, where setup should be as simple as 
> possible and should not require Zookeeper. It also enables full integration 
> with JanusGraph, which supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is 
> that it decouples hbase and solr in embedded mode, so that solr can be run 
> in embedded mode with an external hbase.
> *Proposed solution*
>  * call solr V1 API  while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





Re: Review Request 72197: ATLAS-3653: renamed Namespace to EntityExtn

2020-03-05 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72197/#review219783
---


Fix it, then Ship it!





webapp/src/main/java/org/apache/atlas/web/rest/EntityREST.java
Line 860 (original), 860 (patched)


can we use 'extension' instead of 'extns' in REST path for clarity?

e.g. "/guid/{guid}/extensions"


- Sarath Subramanian


On March 4, 2020, 9:10 p.m., Madhan Neethiraj wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72197/
> ---
> 
> (Updated March 4, 2020, 9:10 p.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, keval bhatt, Sameer Shaikh, and 
> Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3653
> https://issues.apache.org/jira/browse/ATLAS-3653
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> renamed Namespace to EntityExtn
> 
> 
> Diffs
> -
> 
>   authorization/src/main/java/org/apache/atlas/authorize/AtlasEntityAccessRequest.java f2e483888 
>   authorization/src/main/java/org/apache/atlas/authorize/AtlasPrivilege.java 7d81e22f8 
>   authorization/src/main/java/org/apache/atlas/authorize/simple/AtlasSimpleAuthorizer.java 5f0c7b2b7 
>   authorization/src/main/java/org/apache/atlas/authorize/simple/AtlasSimpleAuthzPolicy.java 47b728003 
>   authorization/src/test/java/org/apache/atlas/authorize/simple/AtlasSimpleAuthorizerTest.java e585e93d2 
>   authorization/src/test/resources/atlas-simple-authz-policy.json 379d42b0c 
>   intg/src/main/java/org/apache/atlas/AtlasErrorCode.java 04eb4a08e 
>   intg/src/main/java/org/apache/atlas/model/TypeCategory.java cbcd0a3c8 
>   intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 2e2e4ee03 
>   intg/src/main/java/org/apache/atlas/model/typedef/AtlasEntityDef.java dcae71676 
>   intg/src/main/java/org/apache/atlas/model/typedef/AtlasNamespaceDef.java 713a2c26a 
>   intg/src/main/java/org/apache/atlas/model/typedef/AtlasTypesDef.java 81ea946e5 
>   intg/src/main/java/org/apache/atlas/store/AtlasTypeDefStore.java b08ace442 
>   intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 2824feb16 
>   intg/src/main/java/org/apache/atlas/type/AtlasNamespaceType.java ede84436d 
>   intg/src/main/java/org/apache/atlas/type/AtlasStructType.java 5c62a2497 
>   intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 5b7cbeef5 
>   intg/src/main/java/org/apache/atlas/type/AtlasTypeUtil.java 5b115b530 
>   intg/src/main/java/org/apache/atlas/typesystem/types/DataTypes.java d57a48443 
>   intg/src/test/java/org/apache/atlas/TestRelationshipUtilsV2.java 32ed6ee4e 
>   intg/src/test/java/org/apache/atlas/model/typedef/TestAtlasNamespaceDef.java 88677740b 
>   repository/src/main/java/org/apache/atlas/query/IdentifierHelper.java c443652f0 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java a3a570b97 
>   repository/src/main/java/org/apache/atlas/repository/impexp/ExportService.java 82a2d31d5 
>   repository/src/main/java/org/apache/atlas/repository/impexp/ExportTypeProcessor.java b5533525f 
>   repository/src/main/java/org/apache/atlas/repository/store/bootstrap/AtlasTypeDefStoreInitializer.java 2a602c871 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasTypeDefGraphStore.java e1ef84924 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasNamespaceDefStoreV2.java eaaf6bbe3 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasTypeDefGraphStoreV2.java afdfba9b8 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 7533ebc78 
>   repository/src/main/java/org/apache/atlas/repository/util/FilterUtil.java df27b0ce4 
>   repository/src/test/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityExtnDefStoreV2Test.java PRE-CREATION 
>   repository/src/test/java/org/apache/atlas/repository/store/graph/v2/AtlasNamespaceDefStoreV2Test.java e2f5c16a7 
>   webapp/src/main/java/org/apache/atlas/examples/QuickStartV2.java 72f5befee 
>   webapp/src/main/java/org/apache/atlas/web/rest/EntityREST.java fcf71891f 
>   webapp/src/main/java/org/apache/atlas/web/rest/TypesREST.java e7cf62d07 
> 
> 
> Diff: https://reviews.apache.org/r/72197/diff/1/
> 
> 
> Testing
> 

Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219785
---




repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java
Lines 34 (patched)


The methods defined here look more like helper methods than interface 
methods.


- Sarath Subramanian


On March 4, 2020, 10:09 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> ---
> 
> (Updated March 4, 2020, 10:09 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> - Use existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring number of workers and batch size within 
> _AtlasImportRequest_.
> - Existing import implementation continues to function as before. This is 
> maintained for backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip 
> format (_ZipDirect_). This drastically reduces the memory requirement during 
> import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
> _JanusGraph_, thereby achieving high ingest rates.
> 
> _AtlasImportRequest_
> ```
> {
>     "options": {
>         "numWorkers": 8,
>         "batchSize": 25
>     }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
>     "options": {
>         "numWorkers": 8,
>         "batchSize": 25,
>         "format": "zipDirect",
>         "migration": "true"
>     }
> }
> ```
> 
> 
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H 
> "Cache-Control: no-cache" -F request=@./import-options.json -F 
> data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
> 
> 
> Diffs
> -
> 
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1 
>   intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba 
>   repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae 
>   repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a 
>   repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0 
>   repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION 
>   repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c 
> 
> 
> Diff: https://reviews.apache.org/r/71025/diff/12/
> 
> 
> Testing
> ---
> 
> 

Re: Review Request 72182: ATLAS-3647 : System attribute search : isIncomplete attribute has 1, null as values

2020-03-05 Thread mayank jain

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72182/
---

(Updated March 5, 2020, 9:50 a.m.)


Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
Subramanian.


Bugs: ATLAS-3647
https://issues.apache.org/jira/browse/ATLAS-3647


Repository: atlas


Description
---

In the entity definition:
isIncomplete is null when the entity is complete
isIncomplete is 1 when the entity is incomplete (shell/ghost entities)

A search for isIncomplete = false is expected to return all complete entities 
(non-shell entities), but because isIncomplete is null for them, it doesn't 
return any entity.

In system attribute search, isIncomplete accepts the conditions =, !=, 
not null, and null, with the values true and false.


Solution:
The isIncomplete attribute is not set when normal entities are created; it 
only comes into the picture for shell entities.

So, when searching for isIncomplete = false, normal entities hold no value for 
this attribute and are missed. To include them, we can alter the graph query 
with an OR condition:

i.e. _isIncomplete = false OR _isIncomplete is null

This returns all entities that were once shell entities and were later updated 
to full entities, as well as all normal entities that were never shell entities 
and have a null isIncomplete attribute.
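
The difference between the plain equality check and the OR condition can be sketched in Java as follows. This is only an illustration, not the change made in SearchProcessor.java: IsIncompleteFilter is a hypothetical class, and it assumes the stored attribute is an Integer where 1 means incomplete, 0 corresponds to false, and null means the attribute was never set.

```java
// Hypothetical helper; models the stored isIncomplete attribute as an Integer.
// Assumption: 1 = incomplete (shell entity), 0 = false, null = never set.
final class IsIncompleteFilter {
    // Plain "isIncomplete = false": misses entities where the attribute is null.
    static boolean equalsFalse(Integer stored) {
        return stored != null && stored == 0;
    }

    // Rewritten condition: "isIncomplete = false OR isIncomplete is null".
    static boolean equalsFalseOrNull(Integer stored) {
        return stored == null || stored == 0;
    }
}
```

Under these assumptions, equalsFalse rejects a normal entity whose attribute is null, while equalsFalseOrNull accepts it and still rejects shell entities.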


Diffs (updated)
-

  repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java 356363db0 


Diff: https://reviews.apache.org/r/72182/diff/3/

Changes: https://reviews.apache.org/r/72182/diff/2-3/


Testing
---

Tested the complete behaviour of the isIncomplete attribute; it works as expected.


Thanks,

mayank jain



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219784
---




intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java
Line 114 (original), 117 (patched)


nit: casting to String is not needed.



repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java
Line 56 (original), 56 (patched)


add '@Override' annotation to methods overriding from interface.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java
Line 69 (original), 69 (patched)


add '@Override' annotation to methods overriding from interface.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java
Line 73 (original), 64 (patched)


The ternary operation here is long and not intuitive. Consider refactoring it 
into a method:

ImportStrategy importStrategy = initImportStrategy(importResult);
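
One possible shape for such a method, as a minimal sketch only. The type names below are borrowed from files in the diff, but their bodies are hypothetical stubs, and the sketch assumes the choice depends on the "migration" request option; it takes an options map rather than importResult to stay self-contained.

```java
import java.util.Map;

// Stand-ins named after files in the diff; the bodies are hypothetical.
interface ImportStrategy {
    String name();
}

final class RegularImport implements ImportStrategy {
    @Override
    public String name() { return "regular"; }
}

final class MigrationImport implements ImportStrategy {
    @Override
    public String name() { return "migration"; }
}

final class ImportStrategySelector {
    // Assumption: the strategy is chosen from the "migration" request option.
    static ImportStrategy initImportStrategy(Map<String, String> options) {
        boolean isMigration = options != null && Boolean.parseBoolean(options.get("migration"));

        if (isMigration) {
            return new MigrationImport();
        }

        return new RegularImport();
    }
}
```

Moving the decision into a named method keeps the caller to a single readable line and gives the selection logic one place to grow.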


- Sarath Subramanian



Re: Review Request 71482: ATLAS-3423:-Import Glossary Terms CSV into a Glossary

2020-03-05 Thread Sidharth Mishra

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71482/#review219786
---




common/src/main/java/org/apache/atlas/repository/Constants.java
Lines 205 (patched)


Please provide comments which will be useful to get some more context or 
remove the comment. In this case the variable name is more informative than the 
comment.



common/src/main/java/org/apache/atlas/repository/Constants.java
Lines 207 (patched)


Please consider renaming GlossaryImportSupportedFileFormats to 
GlossaryImportSupportedFileExtensions as you have used extensions at other 
places and comments



dashboardv2/public/js/views/glossary/ImportGlossaryLayoutView.js
Lines 53 (patched)


This comment is not adding any useful information as the initialize block 
is self explanatory.


- Sidharth Mishra



Re: Review Request 71482: ATLAS-3423:-Import Glossary Terms CSV into a Glossary

2020-03-05 Thread Sidharth Mishra

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71482/#review219788
---



- Sidharth Mishra


On March 4, 2020, 10:07 a.m., mayank jain wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71482/
> ---
> 
> (Updated March 4, 2020, 10:07 a.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3423
> https://issues.apache.org/jira/browse/ATLAS-3423
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> This patch implements two endpoints, one for template download and one for 
> CSV file upload with term details, along with unit test cases for both 
> endpoints.
> 
> * The 1st endpoint {glossary/template} returns a template file that shows 
> the format in which the user needs to populate the data in the file.
> 
> http://localhost:21000/api/atlas/v2/glossary/importHeaderRow
> 
> Template structure:-
> 
> GlossaryName, TermName, ShortDescription, LongDescription, Examples, 
> Abbreviation, Usage, AdditionalAttributes, TranslationTerms, ValidValuesFor, 
> Synonyms, ReplacedBy, ValidValues, ReplacementTerms, SeeAlso, 
> TranslatedTerms, IsA, Antonyms, Classifies, PreferredToTerms, PreferredTerms
> Fruits,Apple5,SD4,LD4,"EXAMPLE","ABBREVIATION","USAGE",,"Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4","Footwear:B4"
> 
> 
> * The 2nd endpoint {glossary/importGlossaryData} (file upload) parses the 
> data into AtlasObjects and then creates the AtlasGlossaryTerms inside the 
> Glossary.
> 
> curl -v -g -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H 
> "Cache-Control: no-cache" -F file=@template_6.csv 
> "http://localhost:21000/api/atlas/v2/glossary/import"
> 
> 
> Note:-
> 
> When populating the data in the CSV file, each record must be kept on a 
> single line (a line break within a record would result in the second line 
> being parsed as a new record).
> 
> The downloaded template needs to be saved explicitly with a .csv extension, 
> e.g. whateverTheFileNameIs.csv.
> 
> If the file is uploaded successfully, the AtlasGlossaryTerm is returned; 
> otherwise a list of errors is returned for the user to rectify.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 7c0fd5601 
>   dashboardv2/gruntfile.js fef4e08c3 
>   dashboardv2/package-lock.json 7f25b5752 
>   dashboardv2/package.json e90040edb 
>   dashboardv2/public/css/scss/theme.scss 0589e0920 
>   dashboardv2/public/index.html.tpl a6a999e53 
>   dashboardv2/public/js/main.js 75e16c3aa 
>   dashboardv2/public/js/templates/glossary/GlossaryLayoutView_tmpl.html 1fa1e3540 
>   dashboardv2/public/js/templates/glossary/ImportGlossaryLayoutView_tmpl.html PRE-CREATION 
>   dashboardv2/public/js/utils/UrlLinks.js 6c67e8c37 
>   dashboardv2/public/js/views/glossary/GlossaryLayoutView.js 9b386f326 
>   dashboardv2/public/js/views/glossary/ImportGlossaryLayoutView.js PRE-CREATION 
>   dashboardv3/gruntfile.js f55ff0d5e 
>   dashboardv3/package-lock.json 3918eccaa 
>   dashboardv3/package.json 5dc05104f 
>   dashboardv3/public/css/scss/leftsidebar.scss bbdc5fb26 
>   dashboardv3/public/css/scss/theme.scss 2b0c45d6b 
>   dashboardv3/public/index.html.tpl 2edbb659d 
>   dashboardv3/public/js/main.js 26fd70991 
>   dashboardv3/public/js/templates/glossary/ImportGlossaryLayoutView_tmpl.html PRE-CREATION 
>   dashboardv3/public/js/templates/search/tree/GlossaryTreeLayoutView_tmpl.html 83da9c57c 
>   dashboardv3/public/js/utils/UrlLinks.js 2bbe6796f 
>   dashboardv3/public/js/views/glossary/ImportGlossaryLayoutView.js PRE-CREATION 
>   dashboardv3/public/js/views/search/tree/GlossaryTreeLayoutView.js 28c6a9e4a 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java c5bf50dca 
>   intg/src/main/java/org/apache/atlas/AtlasErrorCode.java 04eb4a08e 
>   intg/src/main/java/org/apache/atlas/model/glossary/relations/AtlasGlossaryHeader.java 660514bc2 
>   pom.xml f76c6a05e 
>   repository/pom.xml 802d587a8 
>   repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java 9229d2d58 
>   repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java cdc3f073f 
>   repository/src/main/java/org/apache/atlas/glossary/GlossaryUtils.java 9625f9409 
>   repository/src/main/java/org/apache/atlas/util/FileUtils.java PRE-CREATION 
>   repository/src/test/java/org/apache/atlas/glossary/GlossaryServiceTest.java 759dcdf42 
>   repository/src/test/resources/csvFiles/empty.csv PRE-CREATION 

Re: Review Request 71482: ATLAS-3423:-Import Glossary Terms CSV into a Glossary

2020-03-05 Thread Sidharth Mishra

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71482/#review219787
---




repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Line 74 (original), 83 (patched)


Remove extra space after entityChangeNotifier



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 89 (patched)


Align the = with other assignments in constructors



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1094 (patched)


Remove this extra line



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1099 (patched)


Remove the extra new line here



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1100 (patched)


It's better to append the IOException message to the AtlasBaseException, to 
give more details about the exception.
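
A minimal sketch of the idea, using a hypothetical GlossaryTemplateException in place of AtlasBaseException (whose constructors are not shown in this thread); the message text is illustrative only.

```java
import java.io.IOException;

// Hypothetical exception type standing in for the real one.
final class GlossaryTemplateException extends Exception {
    GlossaryTemplateException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class TemplateErrorWrapping {
    // Keeps the original IOException as the cause and surfaces its message.
    static GlossaryTemplateException wrap(IOException e) {
        return new GlossaryTemplateException(
                "Error occurred while reading the template file: " + e.getMessage(), e);
    }
}
```

Keeping the IOException as the cause also preserves its stack trace for logging.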



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1101 (patched)


Template -> template



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1106 (patched)


Remove this extra line



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1108 (patched)


Please make this local to the try block as well.



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1109 (patched)


This can be made local to try block and final



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1125 (patched)


Please remove the extra line. The same is present in other places as well.



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1127 (patched)


An Error -> Error



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1128 (patched)


An Error -> Error



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1139 (patched)


Remove extra line



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1143 (patched)


An Error -> Error



repository/src/main/java/org/apache/atlas/glossary/GlossaryService.java
Lines 1144 (patched)


An Error -> Error



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 532 (patched)


We should always try to avoid using a function argument as an out argument. It 
is hard to understand and error-prone. Here the second argument, 
failedTermMsgList, is an out argument. Please avoid this.
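
One way to avoid the out argument is to return both values in a small result object. The sketch below is illustrative only: `TermParseResult`, `parseTerms` and the validation rule are hypothetical names and logic, not taken from GlossaryTermUtils.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical result holder; names are illustrative, not from GlossaryTermUtils.
final class TermParseResult {
    final List<String> terms;
    final List<String> failedMessages;

    TermParseResult(List<String> terms, List<String> failedMessages) {
        this.terms = Collections.unmodifiableList(terms);
        this.failedMessages = Collections.unmodifiableList(failedMessages);
    }
}

class OutArgumentSketch {
    // Returns parsed terms and failure messages together instead of
    // filling a list supplied by the caller.
    static TermParseResult parseTerms(List<String> rows) {
        List<String> terms = new ArrayList<>();
        List<String> failed = new ArrayList<>();
        for (String row : rows) {
            String name = row.trim();
            if (name.isEmpty()) {
                failed.add("Empty term name in row: '" + row + "'");
            } else {
                terms.add(name);
            }
        }
        return new TermParseResult(terms, failed);
    }

    public static void main(String[] args) {
        TermParseResult result = parseTerms(Arrays.asList("term1", "  ", "term2"));
        System.out.println(result.terms + " / " + result.failedMessages.size() + " failure(s)");
    }
}
```

The caller reads `result.failedMessages` explicitly, so it is visible at the call site that the method produces failures.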



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 555 (patched)


Please fix the indentation



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 559 (patched)


Please remove the extra blank line



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 560 (patched)


Better to change this if/else to:

if (GlossaryService.isNameInvalid(glossaryName)) {
    // Error
} else {
    // Success
}



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 567 (patched)


Please fix the indentation



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 586 (patched)


Better to use System.getProperty("line.separator") instead of '\n'. Please 
check the indentation here as well.
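
A short sketch of the separator suggestion. `joinMessages` is a hypothetical helper, not code from the patch; it uses `System.lineSeparator()`, the standard shorthand (Java 7+) for the same platform separator.

```java
import java.util.Arrays;
import java.util.List;

class LineSeparatorSketch {
    // Joins messages with the platform line separator rather than a hard-coded '\n'.
    static String joinMessages(List<String> messages) {
        return String.join(System.lineSeparator(), messages);
    }

    public static void main(String[] args) {
        System.out.println(joinMessages(Arrays.asList("first failure", "second failure")));
    }
}
```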



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 587 (patched)


It seems failedTermMsgList is not being used outside of this function. We 
can remove this argument from the function. This will also avoid the out argument.



repository/src/main/java/org/apache/atlas/glossary/GlossaryTermUtils.java
Lines 676 (patched)

[jira] [Updated] (ATLAS-3600) Some System Attribute of Entity filter doesn't work

2020-03-05 Thread Keval Bhatt (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keval Bhatt updated ATLAS-3600:
---
Fix Version/s: 3.0.0
   2.1.0

> Some System Attribute of Entity filter doesn't work
> ---
>
> Key: ATLAS-3600
> URL: https://issues.apache.org/jira/browse/ATLAS-3600
> Project: Atlas
>  Issue Type: Bug
>Reporter: Mayank Jain
>Assignee: Pinal
>Priority: Major
> Fix For: 2.1.0, 3.0.0
>
>
> The new System Attributes enhancement does not support the following 4 
> searches:
>  # classification search
>  # Propagated Classification Search
>  # User-Defined Attributes
>  # Labels



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-3600) Some System Attribute of Entity filter doesn't work

2020-03-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052132#comment-17052132
 ] 

ASF subversion and git services commented on ATLAS-3600:


Commit 66c95d12419ecd6bf4429171eecf5d9039eacdb1 in atlas's branch 
refs/heads/branch-2.0 from Pinal Shah
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=66c95d1 ]

ATLAS-3600 : Some System Attribute of Entity filter doesn't work

Signed-off-by: kevalbhatt 
(cherry picked from commit e50372820f146636ba606695ea4bd8ae91d82e55)


> Some System Attribute of Entity filter doesn't work
> ---
>
> Key: ATLAS-3600
> URL: https://issues.apache.org/jira/browse/ATLAS-3600
> Project: Atlas
>  Issue Type: Bug
>Reporter: Mayank Jain
>Assignee: Pinal
>Priority: Major
> Fix For: 2.1.0, 3.0.0
>
>
> The new System Attributes enhancement does not support the following 4 
> searches:
>  # classification search
>  # Propagated Classification Search
>  # User-Defined Attributes
>  # Labels





[jira] [Commented] (ATLAS-3600) Some System Attribute of Entity filter doesn't work

2020-03-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052131#comment-17052131
 ] 

ASF subversion and git services commented on ATLAS-3600:


Commit e50372820f146636ba606695ea4bd8ae91d82e55 in atlas's branch 
refs/heads/master from Pinal Shah
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=e503728 ]

ATLAS-3600 : Some System Attribute of Entity filter doesn't work

Signed-off-by: kevalbhatt 


> Some System Attribute of Entity filter doesn't work
> ---
>
> Key: ATLAS-3600
> URL: https://issues.apache.org/jira/browse/ATLAS-3600
> Project: Atlas
>  Issue Type: Bug
>Reporter: Mayank Jain
>Assignee: Pinal
>Priority: Major
> Fix For: 2.1.0, 3.0.0
>
>
> The new System Attributes enhancement does not support the following 4 
> searches:
>  # classification search
>  # Propagated Classification Search
>  # User-Defined Attributes
>  # Labels





[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
External issue URL: https://github.com/apache/atlas/pull/90

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  Standalone mode is especially useful for testing, since it keeps setup as 
> simple as possible, without Zookeeper. It also enables full integration with 
> JanusGraph, which supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is 
> decoupling hbase and solr in embedded mode, so that solr can be run 
> in embedded mode with an external hbase.
> *Proposed solution*
>  * call solr V1 API while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





[jira] [Commented] (ATLAS-3655) Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread Vladislav Glinskiy (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052471#comment-17052471
 ] 

Vladislav Glinskiy commented on ATLAS-3655:
---

cc [~kabhwan] [~sarath] 

> Create 'spark_application' type to avoid 'spark_process' from being updated 
> for multiple operations
> ---
>
> Key: ATLAS-3655
> URL: https://issues.apache.org/jira/browse/ATLAS-3655
> Project: Atlas
>  Issue Type: Task
>Reporter: Vladislav Glinskiy
>Priority: Major
> Fix For: 2.1.0, 3.0.0
>
> Attachments: Screenshot from 2020-03-03 16-09-39.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Create 'spark_application' type to avoid 'spark_process' from being updated 
> for multiple operations. Currently, Spark Atlas Connector uses 
> 'spark_process' as a top-level type for a Spark session, thus it's being 
> updated for multiple operations within the same session.
> The following statements:
> {code:java}
> spark.sql("create table table_1(col1 int,col2 string)");
> spark.sql("create table table_2 as select * from table_1");
> {code}
> result in the following correct lineage:
> table1 --> spark_process1 ---> table2
> but executing similar statements in the same spark session:
> {code:java}
> spark.sql("create table table_3(col1 int,col2 string)"); 
> spark.sql("create table table_4 as select * from table_3");
> {code}
> results in the same 'spark_process' being updated, and the lineage now connects 
> all 4 tables (see screenshot in the attachments).
>  
> The proposal is to create a 'spark_application' entity and associate all 
> 'spark_process' entities (created within that session) to it.





[GitHub] [atlas] vladhlinsky commented on issue #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
vladhlinsky commented on issue #91: ATLAS-3655: Create 'spark_application' type 
to avoid 'spark_process' from being updated for multiple operations
URL: https://github.com/apache/atlas/pull/91#issuecomment-595413162
 
 
   cc @HeartSaVioR @sarathsubramanian


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [atlas] sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91#discussion_r388527692
 
 

 ##
 File path: addons/models/1000-Hadoop/1100-spark_model.json
 ##
 @@ -470,6 +498,24 @@
 "cardinality": "SINGLE"
   },
   "propagateTags": "NONE"
+},
+{
+  "name": "spark_application_process",
 
 Review comment:
   "spark_application_process" => "spark_application_processes"




[GitHub] [atlas] vladhlinsky commented on a change in pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
vladhlinsky commented on a change in pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91#discussion_r388546052
 
 

 ##
 File path: addons/models/1000-Hadoop/1100-spark_model.json
 ##
 @@ -305,6 +305,34 @@
 }
   ]
 },
+{
+  "name": "spark_application",
+  "superTypes": [
+"Process"
+  ],
+  "serviceType": "spark",
+  "typeVersion": "1.0",
+  "attributeDefs": [
+{
+  "name": "currUser",
 
 Review comment:
   Thanks! Changed to "currentUser".




[GitHub] [atlas] vladhlinsky commented on a change in pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
vladhlinsky commented on a change in pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91#discussion_r388546216
 
 

 ##
 File path: addons/models/1000-Hadoop/1100-spark_model.json
 ##
 @@ -470,6 +498,24 @@
 "cardinality": "SINGLE"
   },
   "propagateTags": "NONE"
+},
+{
+  "name": "spark_application_process",
 
 Review comment:
   Renamed.




[GitHub] [atlas] vladglinsky commented on a change in pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
vladglinsky commented on a change in pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91#discussion_r388545586
 
 

 ##
 File path: addons/models/1000-Hadoop/1100-spark_model.json
 ##
 @@ -305,6 +305,34 @@
 }
   ]
 },
+{
+  "name": "spark_application",
+  "superTypes": [
+"Process"
+  ],
+  "serviceType": "spark",
+  "typeVersion": "1.0",
+  "attributeDefs": [
+{
+  "name": "currUser",
 
 Review comment:
   Thanks! Changed to  "currentUser".




[GitHub] [atlas] vladhlinsky opened a new pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
vladhlinsky opened a new pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91
 
 
   ## What changes were proposed in this pull request?
   
   Create `spark_application` type to avoid `spark_process` from being updated 
for multiple operations. Currently, Spark Atlas Connector uses `spark_process` 
as a top-level type for a Spark session, thus it's being updated for multiple 
operations within the same session.
   
   The following statements:
   ```
   spark.sql("create table table_1(col1 int,col2 string)");
   spark.sql("create table table_2 as select * from table_1");
   ```
   result in the following correct lineage:
   ```
   table1 --> spark_process1 ---> table2
   ```
   but executing similar statements in the same spark session:
   ```
   spark.sql("create table table_3(col1 int,col2 string)"); 
   spark.sql("create table table_4 as select * from table_3");
   ```
   results in the same `spark_process` being updated, and the lineage now 
connects all 4 tables.
   The proposal is to create a `spark_application` entity and associate all 
`spark_process` entities (created within that session) with it.
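
The effect described above can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not how Atlas or the Spark Atlas Connector computes lineage: `ProcessNode`, `connected` and the two builder methods are hypothetical, and lineage is reduced to "every input of a process connects to every output of that process".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LineageSketch {
    // Simplified process node: lineage links each input to each output.
    static final class ProcessNode {
        final String name;
        final Set<String> inputs = new LinkedHashSet<>();
        final Set<String> outputs = new LinkedHashSet<>();

        ProcessNode(String name) {
            this.name = name;
        }

        ProcessNode link(String input, String output) {
            inputs.add(input);
            outputs.add(output);
            return this;
        }
    }

    // True if some process has 'source' as an input and 'target' as an output.
    static boolean connected(List<ProcessNode> processes, String source, String target) {
        for (ProcessNode p : processes) {
            if (p.inputs.contains(source) && p.outputs.contains(target)) {
                return true;
            }
        }
        return false;
    }

    // One process per session: every operation updates the same node.
    static List<ProcessNode> sharedProcess() {
        List<ProcessNode> processes = new ArrayList<>();
        processes.add(new ProcessNode("spark_process1")
                .link("table_1", "table_2")
                .link("table_3", "table_4"));
        return processes;
    }

    // One process per operation; the session-level application is not modeled here.
    static List<ProcessNode> processPerOperation() {
        List<ProcessNode> processes = new ArrayList<>();
        processes.add(new ProcessNode("spark_process1").link("table_1", "table_2"));
        processes.add(new ProcessNode("spark_process2").link("table_3", "table_4"));
        return processes;
    }

    public static void main(String[] args) {
        System.out.println("shared: table_1 -> table_4? "
                + connected(sharedProcess(), "table_1", "table_4"));
        System.out.println("per operation: table_1 -> table_4? "
                + connected(processPerOperation(), "table_1", "table_4"));
    }
}
```

With a single shared node, unrelated tables become linked. With one node per operation, each lineage stays separate.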
   
   ## How was this patch tested?
   
   Manually using modified version of Spark Atlas Connector:
   - Installed and started Atlas.
   - Executed the following statements using spark-shell:
   
   ```
   spark.sql("create table table_1_17(col1 int,col2 string)");
   spark.sql("create table table_2_17 as select * from table_1_17");
   spark.sql("create table table_3_17(col1 int,col2 string)");
   spark.sql("create table table_4_17 as select * from table_3_17");
   ```
   
   - Verified that all 4 entities are connected in Atlas lineage.
   - `1100-spark_model.json` is updated with proposed changes.
   - Once again executed similar statements:
   
   ```
   spark.sql("create table table_1_37(col1 int,col2 string)");
   spark.sql("create table table_2_37 as select * from table_1_37");
   spark.sql("create table table_3_37(col1 int,col2 string)");
   spark.sql("create table table_4_37 as select * from table_3_37");
   ```
   
   - Verified that two `spark_process` entities are created,
   that have a single `spark_application` entity as `application`.
   Each of these processes has its own lineage.




[GitHub] [atlas] sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91#discussion_r388527337
 
 

 ##
 File path: addons/models/1000-Hadoop/1100-spark_model.json
 ##
 @@ -305,6 +305,34 @@
 }
   ]
 },
+{
+  "name": "spark_application",
+  "superTypes": [
+"Process"
+  ],
+  "serviceType": "spark",
+  "typeVersion": "1.0",
+  "attributeDefs": [
+{
+  "name": "currUser",
 
 Review comment:
   consider renaming "currUser" to "currentUser"




[jira] [Updated] (ATLAS-3618) Entities with no guid appears in search result

2020-03-05 Thread Sarath Subramanian (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarath Subramanian updated ATLAS-3618:
--
Affects Version/s: 2.0.0

> Entities with no guid appears in search result
> --
>
> Key: ATLAS-3618
> URL: https://issues.apache.org/jira/browse/ATLAS-3618
> Project: Atlas
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Pinal
>Assignee: Pinal
>Priority: Major
> Attachments: no_guid_entity.png
>
>
> Entities with no guid are listed in the search result when _ALL_ENTITY_TYPES is 
> searched.
> !no_guid_entity.png!
> Also, entities with internal types are listed in the search result.





[jira] [Updated] (ATLAS-3618) Entities with no guid appears in search result

2020-03-05 Thread Sarath Subramanian (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarath Subramanian updated ATLAS-3618:
--
Fix Version/s: 2.1.0

> Entities with no guid appears in search result
> --
>
> Key: ATLAS-3618
> URL: https://issues.apache.org/jira/browse/ATLAS-3618
> Project: Atlas
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Pinal
>Assignee: Pinal
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: no_guid_entity.png
>
>
> Entities with no guid are listed in the search result when _ALL_ENTITY_TYPES is 
> searched.
> !no_guid_entity.png!
> Also, entities with internal types are listed in the search result.





[jira] [Updated] (ATLAS-3618) Entities with no guid appears in search result

2020-03-05 Thread Sarath Subramanian (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarath Subramanian updated ATLAS-3618:
--
Component/s:  atlas-core

> Entities with no guid appears in search result
> --
>
> Key: ATLAS-3618
> URL: https://issues.apache.org/jira/browse/ATLAS-3618
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.0.0
>Reporter: Pinal
>Assignee: Pinal
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: no_guid_entity.png
>
>
> Entities with no guid are listed in the search result when _ALL_ENTITY_TYPES is 
> searched.
> !no_guid_entity.png!
> Also, entities with internal types are listed in the search result.





Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Ashutosh Mestry via Review Board


> On March 5, 2020, 9:30 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java
> > Lines 34 (patched)
> > 
> >
> > The methods defined here look more like helper methods than interface 
> > methods.

Since this is a drop-in replacement meant to reduce impact, it needs to have the 
same signature as the original concrete implementation. Changing this will 
involve refactoring the original code. I can take it up after this commit.
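
A sketch of the drop-in idea, under stated assumptions: the interface and class names below are hypothetical, and the method name and signature are illustrative rather than the actual IFullTextMapper API. The interface mirrors the concrete class's method so that a no-op variant can be substituted without touching callers.

```java
// Hypothetical interface extracted from an existing concrete class; the method
// name and signature are illustrative, not the actual IFullTextMapper API.
interface TextMapper {
    String indexTextFor(String guid);
}

// Stands in for the original concrete implementation.
class DefaultTextMapper implements TextMapper {
    @Override
    public String indexTextFor(String guid) {
        return "full-text for " + guid;
    }
}

// No-op variant with the same signature, usable where indexing should be skipped.
class NopTextMapper implements TextMapper {
    @Override
    public String indexTextFor(String guid) {
        return "";
    }
}

class DropInSketch {
    // The caller depends only on the interface, so either implementation can be passed in.
    static int indexedLength(TextMapper mapper, String guid) {
        return mapper.indexTextFor(guid).length();
    }

    public static void main(String[] args) {
        System.out.println(indexedLength(new DefaultTextMapper(), "g1"));
        System.out.println(indexedLength(new NopTextMapper(), "g1"));
    }
}
```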


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219785
---


On March 5, 2020, 5:43 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> ---
> 
> (Updated March 5, 2020, 5:43 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> - Use existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring number of workers and batch size within 
> _AtlasImportRequest_.
> - Existing import implementation continues to function as before. This is 
> maintained for backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip 
> format (_ZipDirect_). This drastically reduces memory requirements during import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
> _JanusGraph_, thereby achieving high ingest rates.
> 
> _AtlasImportRequest_
> ```
> {
>     "options": {
>         "numWorkers": 8,
>         "batchSize": 25
>     }
> }
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
>     "options": {
>         "numWorkers": 8,
>         "batchSize": 25,
>         "format": "zipDirect",
>         "migration": "true"
>     }
> }
> ```
> 
> 
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H 
> "Cache-Control: no-cache" -F request=@./import-options.json -F 
> data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
> 
> 
> Diffs
> -
> 
>   
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java
>  4acb371f1 
>   intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 
> 3362bf158 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> bbe0dc5ba 
>   
> repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java
>  0f2b4bfae 
>   
> repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java
>  1964ade9a 
>   
> repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java
>  cb5a7acd0 
>   
> repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java
>  f552525a4 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java
>  39ea3f82e 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java
>  d7020a702 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
>  30f5e5a7c 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java
>  54c32c5e8 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2f3aad06b 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java
>  

Re: Review Request 72197: ATLAS-3653: renamed Namespace to EntityExtn

2020-03-05 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72197/
---

(Updated March 5, 2020, 9:02 p.m.)


Review request for atlas, Ashutosh Mestry, keval bhatt, Sameer Shaikh, and 
Sarath Subramanian.


Changes
---

renamed Namespace to BusinessMetadata


Bugs: ATLAS-3653
https://issues.apache.org/jira/browse/ATLAS-3653


Repository: atlas


Description
---

renamed Namespace to EntityExtn


Diffs (updated)
-

  
authorization/src/main/java/org/apache/atlas/authorize/AtlasEntityAccessRequest.java
 f2e483888 
  authorization/src/main/java/org/apache/atlas/authorize/AtlasPrivilege.java 
7d81e22f8 
  
authorization/src/main/java/org/apache/atlas/authorize/simple/AtlasSimpleAuthorizer.java
 5f0c7b2b7 
  
authorization/src/main/java/org/apache/atlas/authorize/simple/AtlasSimpleAuthzPolicy.java
 47b728003 
  
authorization/src/test/java/org/apache/atlas/authorize/simple/AtlasSimpleAuthorizerTest.java
 e585e93d2 
  authorization/src/test/resources/atlas-simple-authz-policy.json 379d42b0c 
  intg/src/main/java/org/apache/atlas/AtlasErrorCode.java 04eb4a08e 
  intg/src/main/java/org/apache/atlas/model/TypeCategory.java cbcd0a3c8 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 2e2e4ee03 
  intg/src/main/java/org/apache/atlas/model/typedef/AtlasEntityDef.java 
dcae71676 
  intg/src/main/java/org/apache/atlas/model/typedef/AtlasNamespaceDef.java 
713a2c26a 
  intg/src/main/java/org/apache/atlas/model/typedef/AtlasTypesDef.java 
81ea946e5 
  intg/src/main/java/org/apache/atlas/store/AtlasTypeDefStore.java b08ace442 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 2824feb16 
  intg/src/main/java/org/apache/atlas/type/AtlasNamespaceType.java ede84436d 
  intg/src/main/java/org/apache/atlas/type/AtlasStructType.java 5c62a2497 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 5b7cbeef5 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeUtil.java 5b115b530 
  intg/src/main/java/org/apache/atlas/typesystem/types/DataTypes.java d57a48443 
  intg/src/test/java/org/apache/atlas/TestRelationshipUtilsV2.java 32ed6ee4e 
  intg/src/test/java/org/apache/atlas/model/typedef/TestAtlasNamespaceDef.java 
88677740b 
  repository/src/main/java/org/apache/atlas/query/IdentifierHelper.java 
c443652f0 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 a3a570b97 
  
repository/src/main/java/org/apache/atlas/repository/impexp/ExportService.java 
82a2d31d5 
  
repository/src/main/java/org/apache/atlas/repository/impexp/ExportTypeProcessor.java
 b5533525f 
  
repository/src/main/java/org/apache/atlas/repository/store/bootstrap/AtlasTypeDefStoreInitializer.java
 2a602c871 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java
 39ea3f82e 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasTypeDefGraphStore.java
 e1ef84924 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasBusinessMetadataDefStoreV2.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 30f5e5a7c 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasNamespaceDefStoreV2.java
 eaaf6bbe3 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasTypeDefGraphStoreV2.java
 afdfba9b8 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2f3aad06b 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 7533ebc78 
  repository/src/main/java/org/apache/atlas/repository/util/FilterUtil.java 
df27b0ce4 
  
repository/src/test/java/org/apache/atlas/repository/store/graph/v2/AtlasBusinessMetadataDefStoreV2Test.java
 PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/store/graph/v2/AtlasNamespaceDefStoreV2Test.java
 e2f5c16a7 
  webapp/src/main/java/org/apache/atlas/examples/QuickStartV2.java 72f5befee 
  webapp/src/main/java/org/apache/atlas/web/rest/EntityREST.java fcf71891f 
  webapp/src/main/java/org/apache/atlas/web/rest/TypesREST.java e7cf62d07 


Diff: https://reviews.apache.org/r/72197/diff/2/

Changes: https://reviews.apache.org/r/72197/diff/1-2/


Testing (updated)
---

pre-commit tests run 
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1717/


Thanks,

Madhan Neethiraj



[jira] [Updated] (ATLAS-3653) Rename type Namespace to BusinessMetadata

2020-03-05 Thread Madhan Neethiraj (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Neethiraj updated ATLAS-3653:

Description: 
A new type named {{Namespace}} was introduced in Atlas via ATLAS-3486. This 
feature enables named grouping of attributes; and these attributes can be added 
to instances of specified entity-types. In effect, this feature allows 
attributes to be added in multiple entity-types, without actually modifying 
these entity-types. This helps in multiple ways, such as:
- ability to add attributes (like dataQuality, projectName) to multiple 
entity-types
- authorize update to attributes
- find entities (of multiple types) based on value of such attributes

Since the term {{namespace}} is used in multiple applications in different 
context, it is better to avoid using this term for above feature; instead I 
suggest to use {{BusinessMetadata}} - as this feature allows extending 
entity-types by adding attributes related to business (i.e. not technical 
metadata).

  was:
A new type named {{Namespace}} was introduced in Atlas via ATLAS-3486. This 
feature enables named grouping of attributes; and these attributes can be added 
to instances of specified entity-types. In effect, this feature allows 
attributes to be added in multiple entity-types, without actually modifying 
these entity-types. This helps multiple ways, like:
- ability to add attributes (like dataQuality, projectName) to multiple 
entity-types
- authorize update to attributes
- find entities (of multiple types) based on value of such attributes

Since the term {{namespace}} is used in multiple applications in different 
context, it is better to avoid using this term for above feature; instead I 
suggest to use {{EntityExtn}} - as this feature allows extending entity-types 
by adding attributes.

Summary: Rename type Namespace to BusinessMetadata  (was: Rename type 
Namespace to EntityExtn)

> Rename type Namespace to BusinessMetadata
> -
>
> Key: ATLAS-3653
> URL: https://issues.apache.org/jira/browse/ATLAS-3653
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 2.1.0, 3.0.0
>Reporter: Madhan Neethiraj
>Assignee: Madhan Neethiraj
>Priority: Major
> Fix For: 2.1.0, 3.0.0
>
> Attachments: ATLAS-3653.patch
>
>
> A new type named {{Namespace}} was introduced in Atlas via ATLAS-3486. This 
> feature enables named grouping of attributes, and these attributes can be 
> added to instances of specified entity-types. In effect, this feature allows 
> attributes to be added to multiple entity-types without actually modifying 
> these entity-types. This helps in multiple ways, such as:
> - ability to add attributes (like dataQuality, projectName) to multiple 
> entity-types
> - authorize updates to attributes
> - find entities (of multiple types) based on the value of such attributes
> Since the term {{namespace}} is used in multiple applications in different 
> contexts, it is better to avoid using this term for the above feature; instead, I 
> suggest using {{BusinessMetadata}} - as this feature allows extending 
> entity-types by adding attributes related to business (i.e. not technical 
> metadata).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [atlas] vladhlinsky commented on issue #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
vladhlinsky commented on issue #91: ATLAS-3655: Create 'spark_application' type 
to avoid 'spark_process' from being updated for multiple operations
URL: https://github.com/apache/atlas/pull/91#issuecomment-595413075
 
 
   Attaching screenshots.
   - Installed and started Atlas.
   - Executed the following statements using spark-shell:
   
   ```
   spark.sql("create table table_1_17(col1 int,col2 string)");
   spark.sql("create table table_2_17 as select * from table_1_17");
   spark.sql("create table table_3_17(col1 int,col2 string)");
   spark.sql("create table table_4_17 as select * from table_3_17");
   ```
   - Verified that all 4 entities are connected in Atlas lineage.
   ![Screenshot from 2020-02-27 19-31-09](https://user-images.githubusercontent.com/61428392/76019361-42ab2900-5f2a-11ea-960b-192cb0d00638.png)
   
   - `1100-spark_model.json` is updated with proposed changes.
   - Once again executed similar statements:
   
   ```
   spark.sql("create table table_1_37(col1 int,col2 string)");
   spark.sql("create table table_2_37 as select * from table_1_37");
   spark.sql("create table table_3_37(col1 int,col2 string)");
   spark.sql("create table table_4_37 as select * from table_3_37");
   ```
   
   - Verified that two `spark_process` entities are created,
   that have a single `spark_application` entity as `application`.
   Each of these processes has its own lineage.
   
   ![Screenshot from 2020-03-04 23-16-44](https://user-images.githubusercontent.com/61428392/76019494-78e8a880-5f2a-11ea-856d-7bb7b8415412.png)
   ![Screenshot from 2020-03-04 23-17-02](https://user-images.githubusercontent.com/61428392/76019557-93228680-5f2a-11ea-9ce4-6d89a87ce94e.png)
   ![Screenshot from 2020-03-04 23-17-10](https://user-images.githubusercontent.com/61428392/76019509-8140e380-5f2a-11ea-9610-53b54dbffd8f.png)
   ![Screenshot from 2020-03-04 23-17-34](https://user-images.githubusercontent.com/61428392/76019579-9cabee80-5f2a-11ea-9684-16a38029d611.png)
   ![Screenshot from 2020-03-04 23-19-56](https://user-images.githubusercontent.com/61428392/76019598-a46b9300-5f2a-11ea-9995-24732d740c08.png)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (ATLAS-3655) Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread Vladislav Glinskiy (Jira)
Vladislav Glinskiy created ATLAS-3655:
-

 Summary: Create 'spark_application' type to avoid 'spark_process' 
from being updated for multiple operations
 Key: ATLAS-3655
 URL: https://issues.apache.org/jira/browse/ATLAS-3655
 Project: Atlas
  Issue Type: Task
Reporter: Vladislav Glinskiy
 Fix For: 2.1.0, 3.0.0
 Attachments: Screenshot from 2020-03-03 16-09-39.png

Create a 'spark_application' type to prevent 'spark_process' from being updated for 
multiple operations. Currently, the Spark Atlas Connector uses 'spark_process' as the 
top-level type for a Spark session, so it is updated for multiple 
operations within the same session.

The following statements:
{code:java}
spark.sql("create table table_1(col1 int,col2 string)");
spark.sql("create table table_2 as select * from table_1");
{code}
result in the following correct lineage:

table_1 --> spark_process1 --> table_2

but executing similar statements in the same Spark session:
{code:java}
spark.sql("create table table_3(col1 int,col2 string)"); 
spark.sql("create table table_4 as select * from table_3");
{code}
results in the same 'spark_process' being updated, and the lineage now connects 
all 4 tables (see the screenshot in the attachments).

 

The proposal is to create a 'spark_application' entity and associate all 
'spark_process' entities (created within that session) with it.
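
The difference between the current behavior and the proposal can be illustrated with a minimal, self-contained Java sketch. This is not Spark Atlas Connector or Atlas code: the class `LineageSketch`, its methods, and the entity names are hypothetical, and lineage is modeled as plain in-memory maps. It only shows why a single session-level process merges the lineage of unrelated operations, while one process per operation, grouped under a shared application, keeps them separate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; not Spark Atlas Connector or Atlas code.
class LineageSketch {
    // process name -> tables connected through that process
    private final Map<String, Set<String>> tablesByProcess = new LinkedHashMap<>();
    // application name -> processes created within that session
    private final Map<String, List<String>> processesByApplication = new LinkedHashMap<>();

    // Records one operation; an existing process is updated, not replaced.
    void record(String process, String input, String output) {
        Set<String> tables = tablesByProcess.computeIfAbsent(process, k -> new LinkedHashSet<>());
        tables.add(input);
        tables.add(output);
    }

    // Associates a process with the application (session) that created it.
    void attach(String application, String process) {
        processesByApplication.computeIfAbsent(application, k -> new ArrayList<>()).add(process);
    }

    Set<String> tablesOf(String process) {
        return tablesByProcess.getOrDefault(process, new LinkedHashSet<>());
    }

    List<String> processesOf(String application) {
        return processesByApplication.getOrDefault(application, new ArrayList<>());
    }

    // Current behavior: both operations update the single session-level process.
    static LineageSketch sessionLevelProcess() {
        LineageSketch model = new LineageSketch();
        model.record("spark_process_1", "table_1", "table_2");
        model.record("spark_process_1", "table_3", "table_4");
        return model;
    }

    // Proposed behavior: one process per operation, grouped under one application.
    static LineageSketch processPerOperation() {
        LineageSketch model = new LineageSketch();
        model.record("spark_process_1", "table_1", "table_2");
        model.attach("spark_application_1", "spark_process_1");
        model.record("spark_process_2", "table_3", "table_4");
        model.attach("spark_application_1", "spark_process_2");
        return model;
    }

    public static void main(String[] args) {
        System.out.println("session-level: " + sessionLevelProcess().tablesOf("spark_process_1"));
        LineageSketch proposed = processPerOperation();
        for (String process : proposed.processesOf("spark_application_1")) {
            System.out.println(process + ": " + proposed.tablesOf(process));
        }
    }
}
```

In the session-level variant the single process ends up connected to all four tables, which corresponds to the merged lineage in the attached screenshot. In the proposed variant each process touches only its own two tables.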





Re: Review Request 72182: ATLAS-3647 : System attribute search : isIncomplete attribute has 1, null as values

2020-03-05 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72182/#review219813
---


Ship it!




Ship It!

- Madhan Neethiraj


On March 5, 2020, 9:50 a.m., mayank jain wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72182/
> ---
> 
> (Updated March 5, 2020, 9:50 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
> Subramanian.
> 
> 
> Bugs: ATLAS-3647
> https://issues.apache.org/jira/browse/ATLAS-3647
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> In the entity definition,
> isIncomplete is null when the entity is complete
> isIncomplete is 1 when the entity is incomplete (shell/ghost entities)
> 
> A search with isIncomplete = false is expected to return all complete entities 
> (non-shell entities), but since isIncomplete is null, it doesn't return any 
> entity.
> 
> In system attributes search,
> isIncomplete takes the conditions =, !=, not null, null with the values true, 
> false.
> 
> 
> Solution:
> When normal entities are created, the isIncomplete attribute is not set at all; 
> it only comes into the picture for shell entities.
> 
> So, when searching for isIncomplete = false, normal entities hold no value for 
> this attribute, and we can alter the graph query with an OR condition to 
> include them.
> 
> i.e. _isIncomplete = false OR _isIncomplete is null
> 
> This will return all the entities which were once shell entities and then 
> got updated to full entities, and all the normal entities which never were 
> shell entities and have the isIncomplete attribute as null.
> 
> 
> Diffs
> -
> 
>   repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java 
> 356363db0 
> 
> 
> Diff: https://reviews.apache.org/r/72182/diff/3/
> 
> 
> Testing
> ---
> 
> Tested the complete working of the isIncomplete attribute and it works fine.
> 
> 
> Thanks,
> 
> mayank jain
> 
>



Re: Review Request 72156: ATLAS-3618 Entities with no guid appears in search result

2020-03-05 Thread Pinal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72156/
---

(Updated March 6, 2020, 6:26 a.m.)


Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
Subramanian.


Changes
---

Changed typeNamePredicate to EntityPredicate


Bugs: ATLAS-3618
https://issues.apache.org/jira/browse/ATLAS-3618


Repository: atlas


Description
---

1) Entities of struct types appear when ALL_ENTITY_TYPES is selected
2) Entities of internal types like AtlasGlossary etc. appear when 
ALL_ENTITY_TYPES is selected


Diffs (updated)
-

  
repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java
 6ab0afbf9 
  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
ebd5992cd 
  repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java 
b5ede0b82 


Diff: https://reviews.apache.org/r/72156/diff/5/

Changes: https://reviews.apache.org/r/72156/diff/4-5/


Testing
---

1) typeName: ALL_ENTITY_TYPES returns all entities except those of struct types 
(i.e. only entities whose guid is not null) and internal types (i.e. only 
entities whose supertype is not _internal)
2) typeName: ALL_ENTITY_TYPES, filter: guid isnull, returns no result
3) typeName: ALL_ENTITY_TYPES, filter: typeName begins_with Atlas, returns no 
result

Use case:
-> Added term1 in Glossary
-> Added classification1 to term1
1) In the search panel (showing all entities) -> term1 shouldn't appear
2) In the classification panel (showing all entities associated with 
classification1) -> term1 shouldn't appear
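
The filtering rule exercised by the tests above can be sketched as a composed predicate. This is a minimal illustration, not the patch itself: the class `AllEntityTypesFilterSketch`, the `Vertex` type, and the sample data are hypothetical, and the marker supertype name `__internal` is an assumption (the message above writes it as `_internal`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch of the ALL_ENTITY_TYPES filter; not the Atlas patch.
class AllEntityTypesFilterSketch {
    // Assumed name of the marker supertype carried by internal types.
    static final String INTERNAL_SUPER_TYPE = "__internal";

    static final class Vertex {
        final String typeName;
        final String guid; // null for struct instances
        final Set<String> superTypes;

        Vertex(String typeName, String guid, Set<String> superTypes) {
            this.typeName = typeName;
            this.guid = guid;
            this.superTypes = superTypes;
        }
    }

    // Struct instances carry no guid, so requiring one filters them out.
    static final Predicate<Vertex> HAS_GUID = v -> v.guid != null;
    // Internal types (e.g. glossary terms) are recognized by their supertype.
    static final Predicate<Vertex> NOT_INTERNAL = v -> !v.superTypes.contains(INTERNAL_SUPER_TYPE);
    static final Predicate<Vertex> IS_ENTITY = HAS_GUID.and(NOT_INTERNAL);

    // Returns the type names of the vertices that pass the entity filter.
    static List<String> search(List<Vertex> vertices) {
        List<String> result = new ArrayList<>();
        for (Vertex v : vertices) {
            if (IS_ENTITY.test(v)) {
                result.add(v.typeName);
            }
        }
        return result;
    }

    static List<Vertex> sample() {
        return Arrays.asList(
                new Vertex("hive_table", "guid-1", new HashSet<>(Arrays.asList("DataSet"))),
                new Vertex("hive_serde", null, Collections.<String>emptySet()),
                new Vertex("AtlasGlossaryTerm", "guid-2",
                        new HashSet<>(Arrays.asList(INTERNAL_SUPER_TYPE))));
    }

    public static void main(String[] args) {
        System.out.println(search(sample()));
    }
}
```

In this sample only the `hive_table` vertex passes: the struct instance has no guid, and the glossary term is excluded through its supertype, which matches the expectation that term1 does not appear in either panel.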


Thanks,

Pinal Shah



Re: Review Request 72156: ATLAS-3618 Entities with no guid appears in search result

2020-03-05 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72156/#review219807
---




repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java
Lines 73 (patched)


typeNamePredicate => isEntityPredicate


- Madhan Neethiraj


On March 6, 2020, 5:37 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72156/
> ---
> 
> (Updated March 6, 2020, 5:37 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
> Subramanian.
> 
> 
> Bugs: ATLAS-3618
> https://issues.apache.org/jira/browse/ATLAS-3618
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> 1) Entities of struct types appear when ALL_ENTITY_TYPES is selected
> 2) Entities of internal types like AtlasGlossary etc. appear when 
> ALL_ENTITY_TYPES is selected
> 
> 
> Diffs
> -
> 
>   
> repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java
>  6ab0afbf9 
>   
> repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
>  ebd5992cd 
>   repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java 
> b5ede0b82 
> 
> 
> Diff: https://reviews.apache.org/r/72156/diff/4/
> 
> 
> Testing
> ---
> 
> 1) typeName: ALL_ENTITY_TYPES returns all entities except those of struct types 
> (i.e. only entities whose guid is not null) and internal types (i.e. only 
> entities whose supertype is not _internal)
> 2) typeName: ALL_ENTITY_TYPES, filter: guid isnull, returns no result
> 3) typeName: ALL_ENTITY_TYPES, filter: typeName begins_with Atlas, returns no 
> result
> 
> Use case:
> -> Added term1 in Glossary
> -> Added classification1 to term1
> 1) In the search panel (showing all entities) -> term1 shouldn't appear
> 2) In the classification panel (showing all entities associated with 
> classification1) -> term1 shouldn't appear
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>



Re: Review Request 72156: ATLAS-3618 Entities with no guid appears in search result

2020-03-05 Thread Pinal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72156/
---

(Updated March 6, 2020, 5:37 a.m.)


Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
Subramanian.


Changes
---

Rebase patch


Bugs: ATLAS-3618
https://issues.apache.org/jira/browse/ATLAS-3618


Repository: atlas


Description
---

1) Entities of struct types appear when ALL_ENTITY_TYPES is selected
2) Entities of internal types like AtlasGlossary etc. appear when 
ALL_ENTITY_TYPES is selected


Diffs (updated)
-

  
repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java
 6ab0afbf9 
  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
ebd5992cd 
  repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java 
b5ede0b82 


Diff: https://reviews.apache.org/r/72156/diff/4/

Changes: https://reviews.apache.org/r/72156/diff/3-4/


Testing
---

1) typeName: ALL_ENTITY_TYPES returns all entities except those of struct types 
(i.e. only entities whose guid is not null) and internal types (i.e. only 
entities whose supertype is not _internal)
2) typeName: ALL_ENTITY_TYPES, filter: guid isnull, returns no result
3) typeName: ALL_ENTITY_TYPES, filter: typeName begins_with Atlas, returns no 
result

Use case:
-> Added term1 in Glossary
-> Added classification1 to term1
1) In the search panel (showing all entities) -> term1 shouldn't appear
2) In the classification panel (showing all entities associated with 
classification1) -> term1 shouldn't appear


Thanks,

Pinal Shah



Re: Review Request 72188: ATLAS-3650 : Basic Search: query of typeName doesn't apply when it has many subTypes(like Asset) in combination with attribute filter

2020-03-05 Thread Pinal Shah


> On March 4, 2020, 5:10 a.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
> > Lines 126 (patched)
> > 
> >
> > typeNamePredicate could be null - see #100.
> > 
> > Consider alternate fix of moving #114 - #116 outside the 'if' block at 
> > #110.

Thanks Madhan,
I think once ATLAS-3618 (https://reviews.apache.org/r/72156) is committed,
I can incorporate these changes, because this patch depends on ATLAS-3618.


- Pinal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72188/#review219747
---


On March 4, 2020, 5:05 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72188/
> ---
> 
> (Updated March 4, 2020, 5:05 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
> Subramanian.
> 
> 
> Bugs: ATLAS-3650
> https://issues.apache.org/jira/browse/ATLAS-3650
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> When the Asset typeName, which has many subtypes, is searched with an 
> attribute filter, 
> the query formed for typeName is not included in the final query
> 
> 
> Diffs
> -
> 
>   
> repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
>  8f531876b 
> 
> 
> Diff: https://reviews.apache.org/r/72188/diff/1/
> 
> 
> Testing
> ---
> 
> 1) Select Asset
> 2) Select Guid notNull
> --> Before, all entities with guid notnull, including internal 
> types (AtlasGlossary), were returned in the result
> --> Now, entities with typeName Asset and all its subtypes with guid notnull are 
> returned in the result
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>



[GitHub] [atlas] sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91#discussion_r388528136
 
 

 ##
 File path: addons/models/1000-Hadoop/1100-spark_model.json
 ##
 @@ -305,6 +305,34 @@
 }
   ]
 },
+{
+  "name": "spark_application",
+  "superTypes": [
+"Process"
+  ],
+  "serviceType": "spark",
+  "typeVersion": "1.0",
+  "attributeDefs": [
+{
+  "name": "currUser",
+  "typeName": "string",
+  "isOptional": true,
+  "cardinality": "SINGLE",
+  "isUnique": false,
+  "isIndexable": false,
 
 Review comment:
   isIndexable => true




[GitHub] [atlas] sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 'spark_application' type to avoid 'spark_process' from being updated for multiple operations

2020-03-05 Thread GitBox
sarathsubramanian commented on a change in pull request #91: ATLAS-3655: Create 
'spark_application' type to avoid 'spark_process' from being updated for 
multiple operations
URL: https://github.com/apache/atlas/pull/91#discussion_r388528238
 
 

 ##
 File path: addons/models/1000-Hadoop/1100-spark_model.json
 ##
 @@ -305,6 +305,34 @@
 }
   ]
 },
+{
+  "name": "spark_application",
+  "superTypes": [
+"Process"
+  ],
+  "serviceType": "spark",
+  "typeVersion": "1.0",
+  "attributeDefs": [
+{
+  "name": "currUser",
+  "typeName": "string",
+  "isOptional": true,
+  "cardinality": "SINGLE",
+  "isUnique": false,
+  "isIndexable": false,
+  "searchWeight": 10
+},
+{
+  "name": "remoteUser",
+  "typeName": "string",
+  "isOptional": true,
+  "cardinality": "SINGLE",
+  "isUnique": false,
+  "isIndexable": false,
 
 Review comment:
   isIndexable => true




Re: Review Request 72156: ATLAS-3618 Entities with no guid appears in search result

2020-03-05 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72156/#review219817
---


Ship it!




Ship It!

- Madhan Neethiraj


On March 6, 2020, 6:26 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72156/
> ---
> 
> (Updated March 6, 2020, 6:26 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
> Subramanian.
> 
> 
> Bugs: ATLAS-3618
> https://issues.apache.org/jira/browse/ATLAS-3618
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> 1) Entities of struct types appear when ALL_ENTITY_TYPES is selected
> 2) Entities of internal types like AtlasGlossary etc. appear when 
> ALL_ENTITY_TYPES is selected
> 
> 
> Diffs
> -
> 
>   
> repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java
>  6ab0afbf9 
>   
> repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
>  ebd5992cd 
>   repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java 
> b5ede0b82 
> 
> 
> Diff: https://reviews.apache.org/r/72156/diff/5/
> 
> 
> Testing
> ---
> 
> 1) typeName: ALL_ENTITY_TYPES returns all entities except those of struct types 
> (i.e. only entities whose guid is not null) and internal types (i.e. only 
> entities whose supertype is not _internal)
> 2) typeName: ALL_ENTITY_TYPES, filter: guid isnull, returns no result
> 3) typeName: ALL_ENTITY_TYPES, filter: typeName begins_with Atlas, returns no 
> result
> 
> Use case:
> -> Added term1 in Glossary
> -> Added classification1 to term1
> 1) In the search panel (showing all entities) -> term1 shouldn't appear
> 2) In the classification panel (showing all entities associated with 
> classification1) -> term1 shouldn't appear
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>



[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: (was: ATLAS-3654.patch)

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  Standalone mode is especially useful for testing, where the setup should be as 
> simple as possible, without ZooKeeper. It also enables full integration with 
> JanusGraph, which supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is 
> decoupling HBase and Solr in embedded mode, so that Solr can run in embedded 
> mode with an external HBase.
> *Proposed solution*
>  * call the Solr V1 API when creating/updating request handlers in standalone 
> Solr
>  * update the Atlas start script to enable standalone embedded Solr
>  





[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: ATLAS-3654.patch

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  Standalone mode is especially useful for testing, where the setup should be as 
> simple as possible, without ZooKeeper. It also enables full integration with 
> JanusGraph, which supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is 
> decoupling HBase and Solr in embedded mode, so that Solr can run in embedded 
> mode with an external HBase.
> *Proposed solution*
>  * call the Solr V1 API when creating/updating request handlers in standalone 
> Solr
>  * update the Atlas start script to enable standalone embedded Solr
>  





[GitHub] [atlas] dwarszawski opened a new pull request #90: [ATLAS-3654] enable support for Solr in standalone/http mode

2020-03-05 Thread GitBox
dwarszawski opened a new pull request #90: [ATLAS-3654] enable support for Solr 
in standalone/http mode
URL: https://github.com/apache/atlas/pull/90
 
 
   




Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated March 5, 2020, 5:43 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring the number of workers and batch size within 
_AtlasImportRequest_.
- The existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip format 
(_ZipDirect_). This drastically reduces the memory requirement during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25,
        "format": "zipDirect",
        "migration": "true"
    }
}
```


**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```
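
How `numWorkers` and `batchSize` interact in a producer-consumer setup can be sketched with the standard `java.util.concurrent` classes. This is a simplified illustration under stated assumptions, not the patch: it does not use the Atlas PC framework, `BulkImporterImpl`, or `WorkItemConsumer`, and the class `ConcurrentIngestSketch` and its methods are hypothetical. The producer splits the input into batches, and a fixed pool of workers consumes them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the real import uses Atlas' own producer-consumer framework.
class ConcurrentIngestSketch {
    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Hands each batch to a pool of numWorkers threads; returns the number of items processed.
    static int ingest(List<String> entities, int numWorkers, int batchSize) {
        ExecutorService workers = Executors.newFixedThreadPool(numWorkers);
        AtomicInteger processed = new AtomicInteger();
        for (List<String> batch : toBatches(entities, batchSize)) {
            // A real consumer would create the entities and commit once per batch.
            workers.execute(() -> processed.addAndGet(batch.size()));
        }
        workers.shutdown();
        try {
            if (!workers.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<String> entities = new ArrayList<>();
        for (int i = 0; i < 110; i++) {
            entities.add("entity-" + i);
        }
        System.out.println(toBatches(entities, 25).size() + " batches, "
                + ingest(entities, 8, 25) + " entities processed");
    }
}
```

A larger batch size means fewer commits per worker, while more workers increase parallelism, so the two options are usually tuned together against the available memory and the graph backend.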


Diffs (updated)
-

  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java
 4acb371f1 
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 
3362bf158 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
bbe0dc5ba 
  
repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java
 0f2b4bfae 
  
repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 
1964ade9a 
  
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java
 cb5a7acd0 
  
repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java
 f552525a4 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java
 39ea3f82e 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java
 d7020a702 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 30f5e5a7c 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java
 54c32c5e8 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2f3aad06b 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java
 PRE-CREATION 
  repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c 


Diff: https://reviews.apache.org/r/71025/diff/13/

Changes: https://reviews.apache.org/r/71025/diff/12-13/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/

**Volume tests**
- Measure performance with large data.

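+----------+--------+---------+------------------------+
| File     | Before | After   | Configuration          |
+----------+--------+---------+------------------------+
| smalldb  | 6 min  | 2 min   | Shards: 4, Threads: 8  |
| (2.2 MB) |        |         |                        |
+----------+--------+---------+------------------------+
| largedb  | 3 hrs  | 10 mins | Shards: 4, Threads: 16 |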
| (40 MB)