[jira] [Updated] (ATLAS-4619) Refactor Atlas webapp module to remove Kafka core dependency

2022-07-14 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4619:
---
Component/s:  atlas-core
 (was: atlas-webui)

> Refactor Atlas webapp module to remove Kafka core dependency
> 
>
> Key: ATLAS-4619
> URL: https://issues.apache.org/jira/browse/ATLAS-4619
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Patrik Márton
>Priority: Major
>
> The goal is to break the strong coupling between Atlas components and Kafka. 
> These dependencies include use of Kafka's server-side libraries (which couples 
> the Scala version and other non-public interfaces of Kafka). Any code using 
> server-side libraries of Kafka should be refactored.
> Since the Atlas webapp module uses ShutdownableThread from Kafka core, it 
> should be refactored in a way that eliminates this dependency.
> https://github.com/apache/atlas/blob/master/webapp/src/main/java/org/apache/atlas/notification/NotificationHookConsumer.java#L526



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ATLAS-3057) Index Repair Tool: Add JanusGraph-Specific Index Repair Tool

2022-02-18 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17494758#comment-17494758
 ] 

Ashutosh Mestry commented on ATLAS-3057:


We have tested this only with HBase + Solr.

That said, we have only used JanusGraph APIs, which means that if the 
appropriate configurations are added, HBase + ES should also work.

Hope that helps.

> Index Repair Tool: Add JanusGraph-Specific Index Repair Tool
> 
>
> Key: ATLAS-3057
> URL: https://issues.apache.org/jira/browse/ATLAS-3057
> Project: Atlas
>  Issue Type: New Feature
>Reporter: Ashutosh Mestry
>Assignee: Nikhil P Bonte
>Priority: Major
> Fix For: 2.0.0, trunk
>
> Attachments: ATLAS-3057-Atlas-Index-Repair-tool-for-JanusGraph.patch, 
> ATLAS-3057-Atlas-Index-Repair-tool-for-JanusGraph.patch
>
>
> *Background*
> For Atlas versions that use _HBase_ and _Solr_, occasionally if Solr is down 
> and entities get created, there is no record of the created entities within 
> Solr. Basic search does not indicate the presence of such entities.
> *Solution*
> Index Repair Tool (which was present in branch-0.8) should be implemented for 
> JanusGraph.
> *Implementation Guidance*
>  * Create a Java-based implementation rather than a Groovy script that needs 
> a graphdb-specific shell to be installed. 
>  * Use JanusGraph APIs for restore instead of custom per-vertex logic.
>  * Investigate possibility of using MapReduce for higher throughput.
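The repair described above can be modeled in a few lines. This is a toy sketch, not the tool itself: a Map stands in for the HBase-backed graph store (the source of truth) and a Set for the Solr/ES index; a real implementation would drive JanusGraph's index-management API instead.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class Main {
    // Walk the store and re-add any entity whose guid is missing from the
    // search index; returns how many entries were repaired.
    static int repair(Map<String, String> store, Set<String> index) {
        int repaired = 0;
        for (String guid : store.keySet()) {
            if (index.add(guid)) {   // add() returns false when already indexed
                repaired++;
            }
        }
        return repaired;
    }

    public static void main(String[] args) {
        Map<String, String> store = Map.of("g1", "hive_table", "g2", "hive_db");
        Set<String> index = new HashSet<>(Set.of("g1"));  // "g2" was lost while Solr was down
        System.out.println("repaired: " + repair(store, index));  // repaired: 1
    }
}
```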



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (ATLAS-4501) Table 'Allow list' with a default deny to load only a subset of tables

2021-12-06 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4501:
--

Assignee: Radhika Kundam

> Table 'Allow list' with a default deny to load only a subset of tables
> --
>
> Key: ATLAS-4501
> URL: https://issues.apache.org/jira/browse/ATLAS-4501
> Project: Atlas
>  Issue Type: New Feature
>  Components:  atlas-core, atlas-intg, hive-integration
>Reporter: Adriano
>Assignee: Radhika Kundam
>Priority: Major
>
> There are some huge environments where the warehouse has thousands of 
> databases and hundreds of thousands of tables with many columns, and most of 
> them are dropped, created, or updated at a fast pace. In these environments, 
> Atlas processing can slow down and the backlog can grow, as Atlas starts 
> moving slower than the changes in the warehouse, and the {{prune.pattern}} 
> and/or {{ignore.pattern}} options are not suitable.
> It would be nice to have a default-deny behaviour for all tables, and then to 
> 'allow' the import of a subset of tables specified by a regex parameter (in 
> order to process only some important tables): basically the opposite of 
> {{prune.pattern}} and {{ignore.pattern}}.
> As far as I know, there is a similar feature for S3 and ADLS but not for Hive.
> If this is the case, it would be nice to get the feature onboarded in your 
> backlog.
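A default-deny allow-list of the kind requested could be as simple as the following sketch. The regex, the property semantics, and the shouldImport helper are hypothetical illustrations, not an existing Atlas or Hive-hook option:

```java
import java.util.regex.Pattern;

public class Main {
    // Hypothetical allow-list regex: only fully matching db.table names are
    // imported; everything else is denied by default (the opposite of
    // prune.pattern / ignore.pattern).
    static final Pattern ALLOW = Pattern.compile("finance\\.(orders|ledger_.*)");

    static boolean shouldImport(String qualifiedTable) {
        return ALLOW.matcher(qualifiedTable).matches();  // no match => deny
    }

    public static void main(String[] args) {
        System.out.println(shouldImport("finance.orders"));      // true
        System.out.println(shouldImport("finance.ledger_2021")); // true
        System.out.println(shouldImport("scratch.tmp_table"));   // false
    }
}
```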



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ATLAS-4479) ApplicationProperties cause NPE when using Elasticsearch

2021-11-15 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444001#comment-17444001
 ] 

Ashutosh Mestry commented on ATLAS-4479:


I have approved the PR.

> ApplicationProperties cause NPE when using Elasticsearch
> 
>
> Key: ATLAS-4479
> URL: https://issues.apache.org/jira/browse/ATLAS-4479
> Project: Atlas
>  Issue Type: Bug
>Reporter: wuzhiguo
>Priority: Major
>
> The code at line 367, `LOG.info("Setting " + SOLR_WAIT_SEARCHER_CONF + " = " + 
> getBoolean(SOLR_WAIT_SEARCHER_CONF));`, causes an NPE when using 
> Elasticsearch. It should not, because this configuration belongs to Solr.
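A minimal sketch of a fix, assuming the intent is to touch the Solr-only key only when the backend is Solr, and to supply a default so a missing key cannot throw. A plain Map stands in for the Configuration object; the property names mirror the ticket, but the helper itself is hypothetical:

```java
import java.util.Map;

public class Main {
    static final String BACKEND       = "atlas.graph.index.search.backend";
    static final String WAIT_SEARCHER = "atlas.graph.index.search.solr.wait-searcher";

    // Returns the effective wait-searcher setting, or null when the backend
    // is not Solr (the property simply does not apply to Elasticsearch).
    static Boolean resolveWaitSearcher(Map<String, String> conf) {
        if (!"solr".equals(conf.get(BACKEND))) {
            return null;                                   // ES: skip the Solr-only key
        }
        return Boolean.parseBoolean(conf.getOrDefault(WAIT_SEARCHER, "true"));
    }

    public static void main(String[] args) {
        Map<String, String> es = Map.of(BACKEND, "elasticsearch");
        System.out.println(resolveWaitSearcher(es));       // null, no NPE
        Map<String, String> solr = Map.of(BACKEND, "solr");
        System.out.println(resolveWaitSearcher(solr));     // true (default applied)
    }
}
```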



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ATLAS-4474) Cannot append log in correct file when using Docker

2021-11-10 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442109#comment-17442109
 ] 

Ashutosh Mestry commented on ATLAS-4474:


Can you please tell me what you have done so far?

I was able to get the Docker image to work by following the steps 
mentioned in the README file.

> Cannot append log in correct file when using Docker
> ---
>
> Key: ATLAS-4474
> URL: https://issues.apache.org/jira/browse/ATLAS-4474
> Project: Atlas
>  Issue Type: Bug
>Reporter: wuzhiguo
>Priority: Major
> Attachments: Dockerfile, image-2021-11-10-18-05-53-971.png, 
> image-2021-11-10-18-07-52-717.png, image-2021-11-10-18-08-34-805.png
>
>
> When I use Docker to build Atlas, the log cannot resolve the VM options 
> defined in the Python script.
> Please see the images and my Dockerfile.
> The bin file is based on release-2.2.0-rc1.
> I'd like to upload my bin file too, but it exceeds the maximum size limit 
> (approximately 440M).
> !image-2021-11-10-18-05-53-971.png!!image-2021-11-10-18-07-52-717.png!
> !image-2021-11-10-18-08-34-805.png!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ATLAS-4246) Make Kafka Interface aware of Kafka Schema Registry

2021-11-09 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441268#comment-17441268
 ] 

Ashutosh Mestry commented on ATLAS-4246:


Here's the new CI build: 
[https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/954/]

Will proceed with commit if this succeeds.

> Make Kafka Interface aware of Kafka Schema Registry
> ---
>
> Key: ATLAS-4246
> URL: https://issues.apache.org/jira/browse/ATLAS-4246
> Project: Atlas
>  Issue Type: Improvement
>  Components: kafka-integration
>Affects Versions: 2.1.0, 3.0.0
>Reporter: Aileen Toleikis
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: Kafka
> Fix For: 3.0.0, 2.3.0
>
>
> The Kafka community is using Schema Registry more and more heavily, but Atlas 
> is currently unaware of this; this extension helps Atlas make use of the 
> schemas.
>  
> We have tested this extension and we have production environments where Atlas 
> will not be allowed without schema registry access. We have received feedback 
> that this extension would be sufficient to allow production use.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (ATLAS-4464) Ingest: Improve Rate of Ingest

2021-11-02 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4464:
---
Attachment: (was: 
ATLAS-4464-concurrent-notification-processing-obj-sync-v3.patch)

> Ingest: Improve Rate of Ingest
> --
>
> Key: ATLAS-4464
> URL: https://issues.apache.org/jira/browse/ATLAS-4464
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
>
> *Background*
> Existing implementation of _NotificationHookConsumer_ has linear complexity 
> for ingestion. This has several impacts:
>  * Authorization policies will take longer to get enforced.
>  * Unpredictable wait times for metadata showing up in Atlas.
> *Solution*
> Implement a mechanism for processing messages such that:
>  * Determine dependencies within incoming messages.
>  * Dependent messages should get processed serially.
>  * Messages without dependencies are processed concurrently.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4469) Support HBase secured by Kerberos

2021-11-01 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437134#comment-17437134
 ] 

Ashutosh Mestry commented on ATLAS-4469:


We already support HBase authenticated by Kerberos. Can you please add some 
details to this ticket? 

> Support HBase secured by Kerberos
> -
>
> Key: ATLAS-4469
> URL: https://issues.apache.org/jira/browse/ATLAS-4469
> Project: Atlas
>  Issue Type: Improvement
>Reporter: seafish
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4464) Ingest: Improve Rate of Ingest

2021-10-28 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4464:
---
Attachment: ATLAS-4464-concurrent-notification-processing-obj-sync-v3.patch

> Ingest: Improve Rate of Ingest
> --
>
> Key: ATLAS-4464
> URL: https://issues.apache.org/jira/browse/ATLAS-4464
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: 
> ATLAS-4464-concurrent-notification-processing-obj-sync-v3.patch
>
>
> *Background*
> Existing implementation of _NotificationHookConsumer_ has linear complexity 
> for ingestion. This has several impacts:
>  * Authorization policies will take longer to get enforced.
>  * Unpredictable wait times for metadata showing up in Atlas.
> *Solution*
> Implement a mechanism for processing messages such that:
>  * Determine dependencies within incoming messages.
>  * Dependent messages should get processed serially.
>  * Messages without dependencies are processed concurrently.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-4464) Ingest: Improve Rate of Ingest

2021-10-28 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4464:
--

 Summary: Ingest: Improve Rate of Ingest
 Key: ATLAS-4464
 URL: https://issues.apache.org/jira/browse/ATLAS-4464
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Affects Versions: trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
 Fix For: trunk


*Background*

Existing implementation of _NotificationHookConsumer_ has linear complexity for 
ingestion. This has several impacts:
 * Authorization policies will take longer to get enforced.
 * Unpredictable wait times for metadata showing up in Atlas.

*Solution*

Implement a mechanism for processing messages such that:
 * Determine dependencies within incoming messages.
 * Dependent messages should get processed serially.
 * Messages without dependencies are processed concurrently.
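The serial-for-dependent / concurrent-for-independent scheme above can be sketched with key-hashed executor lanes. This is a simplified illustration of the idea, not the attached patch (the lane count, key choice, and names are invented):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Main {
    // Messages that share a key (i.e. depend on the same entity) hash to the
    // same single-thread lane and therefore run serially, in arrival order;
    // messages with different keys may land in different lanes and run
    // concurrently.
    static Future<?> submit(ExecutorService[] lanes, String entityKey, Runnable work) {
        int lane = Math.floorMod(entityKey.hashCode(), lanes.length);
        return lanes[lane].submit(work);
    }

    static String demo() {
        try {
            ExecutorService[] lanes = new ExecutorService[4];
            for (int i = 0; i < lanes.length; i++) {
                lanes[i] = Executors.newSingleThreadExecutor();
            }
            StringBuilder log = new StringBuilder();   // touched by only one lane here
            submit(lanes, "hive_table:orders", () -> log.append("create;"));
            submit(lanes, "hive_table:orders", () -> log.append("update;")).get();
            for (ExecutorService lane : lanes) {
                lane.shutdown();
            }
            return log.toString();                     // same key => FIFO order preserved
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());   // create;update;
    }
}
```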

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4246) Make Kafka Interface aware of Kafka Schema Registry

2021-10-28 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435501#comment-17435501
 ] 

Ashutosh Mestry commented on ATLAS-4246:


[~aileeen] Please give us a couple of days; we are in the process of fixing our 
CI build. Once that is GREEN, we will proceed with committing your patch and a 
few others.

> Make Kafka Interface aware of Kafka Schema Registry
> ---
>
> Key: ATLAS-4246
> URL: https://issues.apache.org/jira/browse/ATLAS-4246
> Project: Atlas
>  Issue Type: Improvement
>  Components: kafka-integration
>Affects Versions: 2.1.0, 3.0.0
>Reporter: Aileen Toleikis
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: Kafka
> Fix For: 3.0.0, 2.3.0
>
>
> The Kafka community is using Schema Registry more and more heavily, but Atlas 
> is currently unaware of this; this extension helps Atlas make use of the 
> schemas.
>  
> We have tested this extension and we have production environments where Atlas 
> will not be allowed without schema registry access. We have received feedback 
> that this extension would be sufficient to allow production use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4246) Make Kafka Interface aware of Kafka Schema Registry

2021-10-21 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432571#comment-17432571
 ] 

Ashutosh Mestry commented on ATLAS-4246:


Your changes in the new patch look good. I will wait for Sarath's 'Ship it' and 
then we will get it committed.

I am running a CI build: 
[https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/919/]

Thanks for all the work!

> Make Kafka Interface aware of Kafka Schema Registry
> ---
>
> Key: ATLAS-4246
> URL: https://issues.apache.org/jira/browse/ATLAS-4246
> Project: Atlas
>  Issue Type: Improvement
>  Components: kafka-integration
>Affects Versions: 2.1.0, 3.0.0
>Reporter: Aileen Toleikis
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: Kafka
> Fix For: 3.0.0, 2.3.0
>
>
> The Kafka community is using Schema Registry more and more heavily, but Atlas 
> is currently unaware of this; this extension helps Atlas make use of the 
> schemas.
>  
> We have tested this extension and we have production environments where Atlas 
> will not be allowed without schema registry access. We have received feedback 
> that this extension would be sufficient to allow production use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4458) Commons-Logging Exclusion Causes Startup Problems

2021-10-20 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4458:
---
Attachment: ATLAS-4458-Commons-logging-reference-fix.patch

> Commons-Logging Exclusion Causes Startup Problems
> -
>
> Key: ATLAS-4458
> URL: https://issues.apache.org/jira/browse/ATLAS-4458
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: ATLAS-4458-Commons-logging-reference-fix.patch
>
>
> An earlier commit for ATLAS-4351 caused _commons-logging*.jar_ to be 
> excluded. This causes a startup problem due to the missing dependency.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-4458) Commons-Logging Exclusion Causes Startup Problems

2021-10-20 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4458:
--

 Summary: Commons-Logging Exclusion Causes Startup Problems
 Key: ATLAS-4458
 URL: https://issues.apache.org/jira/browse/ATLAS-4458
 Project: Atlas
  Issue Type: Bug
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


An earlier commit for ATLAS-4351 caused _commons-logging*.jar_ to be excluded. 
This causes a startup problem due to the missing dependency.
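For context, a Maven exclusion of the shape below is the kind of change that can drop commons-logging from the packaged webapp. This fragment is purely illustrative (the host dependency shown is hypothetical, not the actual one touched by ATLAS-4351); the fix is to remove such an exclusion or otherwise restore the jar:

```xml
<dependency>
  <groupId>some.group</groupId>          <!-- hypothetical host dependency -->
  <artifactId>some-artifact</artifactId>
  <exclusions>
    <exclusion>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```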



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-4456) Atlas fails to start if Solr wait-searcher property is not set

2021-10-19 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4456:
--

Assignee: Ashutosh Mestry

> Atlas fails to start if Solr wait-searcher property is not set
> --
>
> Key: ATLAS-4456
> URL: https://issues.apache.org/jira/browse/ATLAS-4456
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.2.0
>Reporter: Robert Yokota
>Assignee: Ashutosh Mestry
>Priority: Major
>
> Atlas fails to start if Solr wait-searcher property is not set.  This is due 
> to the following line 
> [https://github.com/apache/atlas/blob/release-2.2.0-rc1/intg/src/main/java/org/apache/atlas/ApplicationProperties.java#L365]
> which will throw the following exception:
> {code:java}
> Exception in thread "main" org.apache.atlas.AtlasException: Failed to load 
> application properties
> at 
> org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:150)
> at 
> org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:103)
> at org.apache.atlas.Atlas.main(Atlas.java:111)
> Caused by: java.util.NoSuchElementException: 
> 'atlas.graph.index.search.solr.wait-searcher' doesn't map to an existing 
> object
> at 
> org.apache.commons.configuration.AbstractConfiguration.getBoolean(AbstractConfiguration.java:644)
> at 
> org.apache.atlas.ApplicationProperties.setDefaults(ApplicationProperties.java:365)
> at 
> org.apache.atlas.ApplicationProperties.get(ApplicationProperties.java:141)
> ... 2 more
>  {code}
>  
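The exception comes from the one-argument getBoolean, which throws NoSuchElementException for an absent key; commons-configuration also offers a two-argument overload that returns a default instead. The behavior is modeled below with a plain Map standing in for the Configuration object (a sketch of the idiom, not the Atlas fix itself):

```java
import java.util.Map;
import java.util.NoSuchElementException;

public class Main {
    // One-argument form: mirrors AbstractConfiguration.getBoolean(String),
    // which throws when the key is absent.
    static boolean getBoolean(Map<String, String> conf, String key) {
        String v = conf.get(key);
        if (v == null) {
            throw new NoSuchElementException("'" + key + "' doesn't map to an existing object");
        }
        return Boolean.parseBoolean(v);
    }

    // Two-argument form: mirrors getBoolean(String, boolean), which falls
    // back to the supplied default instead of throwing.
    static boolean getBoolean(Map<String, String> conf, String key, boolean dflt) {
        String v = conf.get(key);
        return v == null ? dflt : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of();   // wait-searcher not set
        System.out.println(getBoolean(conf, "atlas.graph.index.search.solr.wait-searcher", true)); // true
    }
}
```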



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4449) Atlas 2.0 build fails with org.restlet dependency error

2021-10-11 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427229#comment-17427229
 ] 

Ashutosh Mestry commented on ATLAS-4449:


I used this: 
{code:java}
mvn clean install package -Pdist -DskipTests -am{code}
My build goes through without problems. It uses the public Maven repo.

Can you attempt your build with the public repo?

> Atlas 2.0 build fails with org.restlet dependency error
> 
>
> Key: ATLAS-4449
> URL: https://issues.apache.org/jira/browse/ATLAS-4449
> Project: Atlas
>  Issue Type: Bug
>  Components: atlas-intg
> Environment: centos7
>Reporter: cjjxfli
>Priority: Major
>
> [ERROR] Failed to execute goal on project atlas-testtools: Could not resolve 
> dependencies for project org.apache.atlas:atlas-testtools:jar:2.3.0-SNAPSHOT: 
> The following artifacts could not be resolved: 
> org.restlet.jee:org.restlet:jar:2.4.3, 
> org.restlet.jee:org.restlet.ext.servlet:jar:2.4.3: Could not find artifact 
> org.restlet.jee:org.restlet:jar:2.4.3 in alimaven 
> (http://maven.aliyun.com/nexus/content/groups/public/) -> [Help 1]
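The org.restlet.jee artifacts are not published to Maven Central, so a Central mirror such as aliyun cannot resolve them; they are served from the separate Restlet repository. A settings.xml/pom.xml fragment along these lines is the usual workaround (the URL below is the commonly cited Restlet repository; verify it against current Restlet documentation before use):

```xml
<repositories>
  <repository>
    <id>restlet</id>
    <name>Restlet repository</name>
    <url>https://maven.restlet.talend.com</url>
  </repository>
</repositories>
```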



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4246) Make Kafka Interface aware of Kafka Schema Registry

2021-10-06 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425068#comment-17425068
 ] 

Ashutosh Mestry commented on ATLAS-4246:


[~aileeen] We will proceed with merging this. I will update RB, if we need any 
updates.

[~viktorsomogyi] Thanks for taking care of this.

> Make Kafka Interface aware of Kafka Schema Registry
> ---
>
> Key: ATLAS-4246
> URL: https://issues.apache.org/jira/browse/ATLAS-4246
> Project: Atlas
>  Issue Type: Improvement
>  Components: kafka-integration
>Affects Versions: 2.1.0, 3.0.0
>Reporter: Aileen Toleikis
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: Kafka
> Fix For: 3.0.0, 2.3.0
>
>
> The Kafka community is using Schema Registry more and more heavily, but Atlas 
> is currently unaware of this; this extension helps Atlas make use of the 
> schemas.
>  
> We have tested this extension and we have production environments where Atlas 
> will not be allowed without schema registry access. We have received feedback 
> that this extension would be sufficient to allow production use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4440) Upgrade Atlas's Kafka dependency to 2.8

2021-10-04 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423997#comment-17423997
 ] 

Ashutosh Mestry commented on ATLAS-4440:


+1 for the patch.

[~chaitali] / [~dishatalreja]: Can you please review.

[~sarath]: Please sign-off.

> Upgrade Atlas's Kafka dependency to 2.8
> ---
>
> Key: ATLAS-4440
> URL: https://issues.apache.org/jira/browse/ATLAS-4440
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Reporter: Viktor Somogyi-Vass
>Assignee: Viktor Somogyi-Vass
>Priority: Major
> Attachments: 0001-ATLAS-4440-Update-Kafka-dependency-to-2.8.1.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4246) Make Kafka Interface aware of Kafka Schema Registry

2021-10-04 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423993#comment-17423993
 ] 

Ashutosh Mestry commented on ATLAS-4246:


[~aileeen] I have assigned this ticket to Viktor. I will follow up with him.

> Make Kafka Interface aware of Kafka Schema Registry
> ---
>
> Key: ATLAS-4246
> URL: https://issues.apache.org/jira/browse/ATLAS-4246
> Project: Atlas
>  Issue Type: Improvement
>  Components: kafka-integration
>Affects Versions: 2.1.0, 3.0.0
>Reporter: Aileen Toleikis
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: Kafka
> Fix For: 3.0.0, 2.3.0
>
>
> The Kafka community is using Schema Registry more and more heavily, but Atlas 
> is currently unaware of this; this extension helps Atlas make use of the 
> schemas.
>  
> We have tested this extension and we have production environments where Atlas 
> will not be allowed without schema registry access. We have received feedback 
> that this extension would be sufficient to allow production use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-4246) Make Kafka Interface aware of Kafka Schema Registry

2021-10-04 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4246:
--

Assignee: Viktor Somogyi-Vass

> Make Kafka Interface aware of Kafka Schema Registry
> ---
>
> Key: ATLAS-4246
> URL: https://issues.apache.org/jira/browse/ATLAS-4246
> Project: Atlas
>  Issue Type: Improvement
>  Components: kafka-integration
>Affects Versions: 2.1.0, 3.0.0
>Reporter: Aileen Toleikis
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: Kafka
> Fix For: 3.0.0, 2.3.0
>
>
> The Kafka community is using Schema Registry more and more heavily, but Atlas 
> is currently unaware of this; this extension helps Atlas make use of the 
> schemas.
>  
> We have tested this extension and we have production environments where Atlas 
> will not be allowed without schema registry access. We have received feedback 
> that this extension would be sufficient to allow production use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4440) Upgrade Atlas's Kafka dependency to 2.8

2021-10-01 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423340#comment-17423340
 ] 

Ashutosh Mestry commented on ATLAS-4440:


PC Build: 
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/871/

> Upgrade Atlas's Kafka dependency to 2.8
> ---
>
> Key: ATLAS-4440
> URL: https://issues.apache.org/jira/browse/ATLAS-4440
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Reporter: Viktor Somogyi-Vass
>Assignee: Viktor Somogyi-Vass
>Priority: Major
> Attachments: 0001-ATLAS-4440-Update-Kafka-dependency-to-2.8.1.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ATLAS-4358) Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, etc) does not exist in Elasticsearch

2021-09-27 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-4358.

Resolution: Fixed

> Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, 
> etc) does not exist in Elasticsearch
> --
>
> Key: ATLAS-4358
> URL: https://issues.apache.org/jira/browse/ATLAS-4358
> Project: Atlas
>  Issue Type: Bug
>Reporter: Anshul Mehta
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: patch-manager-fix.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> *Impact of the problem -*
>  * The Atlas pod takes much longer to become active, and this time keeps 
> increasing as the assets increase. This basically means a downtime on every 
> Atlas release.
>  * Unable to filter via basic search on attributes like {{__timestamp}}, 
> {{__modificationTimestamp}}, {{createdBy}} and {{modifiedBy}}.
> *Issue -*
> Just before creating the mapping in the mixed index (ES index), Atlas creates 
> something called a {{propertyKey}}, and this propertyKey is used to create 
> the mapping. The code checks whether the propertyKey for the current property 
> is null. If it is null, it creates the propertyKey and then adds it to the 
> mixed index. If it is not null, it assumes that the property has already been 
> added to the index and skips adding it.
> In our case, when Atlas checked the propertyKey it was not null (which should 
> not have been the case), so Atlas skipped adding it to the mixed index, and 
> these properties never got added to the mixed index. This basically meant the 
> propertyKey for these properties was getting created somewhere else. We 
> looked through the entire codebase but could not find a use of the 
> makePropertyKey method (which is used to create a propertyKey) or any other 
> similar method.
> Then I saw certain Java patch vertices getting created even before these 
> internal attributes are added to the various indices, though these patches 
> were applied later, once all internal attributes were added to all the 
> indices.
> Now, these patch vertices have 9 attributes, and we realized these 9 
> attributes are the only attributes missing from ES. So when the patch 
> vertices got created, and these vertices with their attributes got added to 
> Cassandra via JanusGraph, JanusGraph automatically created a propertyKey for 
> all these attributes (JanusGraph's makePropertyKey method is not called 
> during this process anywhere in the Atlas code). And because the internal 
> attributes were being added to the indices in another thread at the same 
> time, when the code checked for the propertyKey, it was not null, and so it 
> did not add the property to the mixed index.
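The failure described above is a classic check-then-act race: the existence of the propertyKey is used as a proxy for "already added to the mixed index", which breaks once another thread creates the key first. One standard remedy is to track index membership explicitly with an atomic operation. The sketch below illustrates only the idiom; the names are not Atlas internals:

```java
import java.util.concurrent.ConcurrentHashMap;

public class Main {
    // Explicit record of which properties have been added to the mixed index,
    // independent of whether a propertyKey object happens to exist already.
    static final ConcurrentHashMap<String, Boolean> indexedProperties = new ConcurrentHashMap<>();

    // putIfAbsent is atomic: exactly one caller "wins" and should perform the
    // actual add-to-index, no matter which thread created the propertyKey.
    static boolean addToMixedIndexIfAbsent(String property) {
        return indexedProperties.putIfAbsent(property, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        System.out.println(addToMixedIndexIfAbsent("__timestamp"));  // true: this caller adds it
        System.out.println(addToMixedIndexIfAbsent("__timestamp"));  // false: already indexed
    }
}
```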



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-4358) Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, etc) does not exist in Elasticsearch

2021-09-27 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4358:
--

Assignee: Ashutosh Mestry

> Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, 
> etc) does not exist in Elasticsearch
> --
>
> Key: ATLAS-4358
> URL: https://issues.apache.org/jira/browse/ATLAS-4358
> Project: Atlas
>  Issue Type: Bug
>Reporter: Anshul Mehta
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: patch-manager-fix.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> *Impact of the problem -*
>  * The Atlas pod takes much longer to become active, and this time keeps 
> increasing as the assets increase. This basically means a downtime on every 
> Atlas release.
>  * Unable to filter via basic search on attributes like {{__timestamp}}, 
> {{__modificationTimestamp}}, {{createdBy}} and {{modifiedBy}}.
> *Issue -*
> Just before creating the mapping in the mixed index (ES index), Atlas creates 
> something called a {{propertyKey}}, and this propertyKey is used to create 
> the mapping. The code checks whether the propertyKey for the current property 
> is null. If it is null, it creates the propertyKey and then adds it to the 
> mixed index. If it is not null, it assumes that the property has already been 
> added to the index and skips adding it.
> In our case, when Atlas checked the propertyKey it was not null (which should 
> not have been the case), so Atlas skipped adding it to the mixed index, and 
> these properties never got added to the mixed index. This basically meant the 
> propertyKey for these properties was getting created somewhere else. We 
> looked through the entire codebase but could not find a use of the 
> makePropertyKey method (which is used to create a propertyKey) or any other 
> similar method.
> Then I saw certain Java patch vertices getting created even before these 
> internal attributes are added to the various indices, though these patches 
> were applied later, once all internal attributes were added to all the 
> indices.
> Now, these patch vertices have 9 attributes, and we realized these 9 
> attributes are the only attributes missing from ES. So when the patch 
> vertices got created, and these vertices with their attributes got added to 
> Cassandra via JanusGraph, JanusGraph automatically created a propertyKey for 
> all these attributes (JanusGraph's makePropertyKey method is not called 
> during this process anywhere in the Atlas code). And because the internal 
> attributes were being added to the indices in another thread at the same 
> time, when the code checked for the propertyKey, it was not null, and so it 
> did not add the property to the mixed index.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4435) Session Inactivity Timeout: Provide Ability to Disable Session Inactivity Timeout Implementation

2021-09-21 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4435:
---
Attachment: ATLAS-4435-session-timeout-disable.patch

> Session Inactivity Timeout: Provide Ability to Disable Session Inactivity 
> Timeout Implementation
> 
>
> Key: ATLAS-4435
> URL: https://issues.apache.org/jira/browse/ATLAS-4435
> Project: Atlas
>  Issue Type: Improvement
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: ATLAS-4435-session-timeout-disable.patch
>
>
> *Background*
> ATLAS-4379 implemented 'Session Inactivity Timeout'.
> *Requirement*
> Users should be able to disable this when needed.
> *Solution*
> The property 'atlas.session.timeout.secs' lets the user set the timeout 
> value. Setting it to -1 disables the feature.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-4435) Session Inactivity Timeout: Provide Ability to Disable Session Inactivity Timeout Implementation

2021-09-21 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4435:
--

 Summary: Session Inactivity Timeout: Provide Ability to Disable 
Session Inactivity Timeout Implementation
 Key: ATLAS-4435
 URL: https://issues.apache.org/jira/browse/ATLAS-4435
 Project: Atlas
  Issue Type: Improvement
Affects Versions: trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
 Fix For: trunk


*Background*

The ATLAS-4379 implemented 'Session Inactivity Timeout'.

*Requirement*

Users should have the ability to disable this when needed.

*Solution*

The property 'atlas.session.timeout.secs' lets the user set the timeout value. 
Setting this to -1 will disable the feature.
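As a minimal sketch, the disable case described above would look like this in atlas-application.properties (the property name is taken from the description; the placement is illustrative):

```properties
# Session inactivity timeout in seconds; -1 disables the feature
atlas.session.timeout.secs=-1
```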



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4358) Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, etc) does not exist in Elasticsearch

2021-09-21 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4358:
---
Attachment: patch-manager-fix.patch

> Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, 
> etc) does not exist in Elasticsearch
> --
>
> Key: ATLAS-4358
> URL: https://issues.apache.org/jira/browse/ATLAS-4358
> Project: Atlas
>  Issue Type: Bug
>Reporter: Anshul Mehta
>Priority: Major
> Attachments: patch-manager-fix.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> *Impact of the problem -*
>  * Atlas pod takes much longer to become active, and this time keeps 
> increasing as the assets increase. This basically means a downtime on every 
> Atlas release.
>  * Not able to filter via basic search on attributes like {{__timestamp}} , 
> {{__modificationTimestamp}} , {{createdBy}} and {{modifiedBy}} .
> *Issue -*
> So just before creating the mapping in the mixed index (ES index) Atlas 
> creates something called {{propertyKey}} and this propertyKey is used to 
> create the mapping. The code is written in a way that checks if propertyKey 
> for the current property is null or not. If it is null, it creates the 
> propertyKey and then adds it to the mixed index. If it is not null, it assumes 
> that the property has already been added to the index and so skips adding it.
> Now in our case when Atlas checked the propertyKey it was not null (which 
> should not have been the case) therefore Atlas skipped adding it to the mixed 
> index and so these properties never got added to the mixed index. This 
> basically meant propertyKey for these properties were getting created 
> somewhere else. We looked into the entire codebase but could not find the use 
> of makePropertyKey method ( which is used to create propertyKey) or any other 
> similar method.
> Then I saw certain java patch vertices getting created even before these 
> internal attributes are added to various indices. Though these patches were 
> applied later once all internal attributes were added to all the indices.
> Now, these patch vertices have 9 attributes and we realized these 9 
> attributes are the only attributes missing from ES. So basically when patch 
> vertices got created and these vertices with their attributes got added to 
> cassandra via janusgraph, janusgraph automatically created propertyKey for 
> all these attributes (the janusgraph's makePropertyKey method is not called 
> during this process anywhere in the Atlas code). And because internal 
> attributes were getting added to indices in another thread at the same time, 
> when code checked for propertyKey, it was not null and so it did not add the 
> property to the mixed index.
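The check-then-act flaw described above can be sketched as follows; all names here are illustrative stand-ins, not Atlas or JanusGraph APIs. The point is that the existence of a propertyKey is used as a proxy for "already mapped into the mixed index", which breaks as soon as another writer creates the key implicitly:

```python
# Illustrative sketch of the check-then-act bug described above.
# All names are hypothetical; they do not match Atlas/JanusGraph APIs.

property_keys = set()   # propertyKeys known to the graph
mixed_index = set()     # properties mapped into the ES mixed index

def make_property_key(name):
    property_keys.add(name)

def add_to_mixed_index_if_new(name):
    # Buggy logic: the existence of the propertyKey is used as a proxy
    # for "already present in the mixed index".
    if name not in property_keys:
        make_property_key(name)
        mixed_index.add(name)
    # else: assume it is already indexed and skip

# Normal path: the key does not exist yet, so the property gets indexed.
add_to_mixed_index_if_new("__timestamp")
assert "__timestamp" in mixed_index

# Race: another writer (patch-vertex creation) implicitly creates the
# propertyKey first, without touching the mixed index ...
make_property_key("__patch.type")
# ... so the null-check wrongly concludes the property is indexed.
add_to_mixed_index_if_new("__patch.type")
assert "__patch.type" not in mixed_index  # mapping silently missing
```

A more robust check would consult the mixed-index mapping itself rather than infer it from the propertyKey.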



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4358) Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, etc) does not exist in Elasticsearch

2021-09-21 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17418168#comment-17418168
 ] 

Ashutosh Mestry commented on ATLAS-4358:


[~mehtaanshul] Would it be possible for you to try out the attached patch? It 
integrates your fix and an additional fix. We had a similar problem in one of 
our customer environments. I am not able to reproduce it in my lab setup. The 
attached fix should address the problem.

> Mapping for some internal Atlas attributes ( like __patch.type , __timestamp, 
> etc) does not exist in Elasticsearch
> --
>
> Key: ATLAS-4358
> URL: https://issues.apache.org/jira/browse/ATLAS-4358
> Project: Atlas
>  Issue Type: Bug
>Reporter: Anshul Mehta
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> *Impact of the problem -*
>  * Atlas pod takes much longer to become active, and this time keeps 
> increasing as the assets increase. This basically means a downtime on every 
> Atlas release.
>  * Not able to filter via basic search on attributes like {{__timestamp}} , 
> {{__modificationTimestamp}} , {{createdBy}} and {{modifiedBy}} .
> *Issue -*
> So just before creating the mapping in the mixed index (ES index) Atlas 
> creates something called {{propertyKey}} and this propertyKey is used to 
> create the mapping. The code is written in a way that checks if propertyKey 
> for the current property is null or not. If it is null, it creates the 
> propertyKey and then adds it to the mixed index. If it is not null, it assumes 
> that the property has already been added to the index and so skips adding it.
> Now in our case when Atlas checked the propertyKey it was not null (which 
> should not have been the case) therefore Atlas skipped adding it to the mixed 
> index and so these properties never got added to the mixed index. This 
> basically meant propertyKey for these properties were getting created 
> somewhere else. We looked into the entire codebase but could not find the use 
> of makePropertyKey method ( which is used to create propertyKey) or any other 
> similar method.
> Then I saw certain java patch vertices getting created even before these 
> internal attributes are added to various indices. Though these patches were 
> applied later once all internal attributes were added to all the indices.
> Now, these patch vertices have 9 attributes and we realized these 9 
> attributes are the only attributes missing from ES. So basically when patch 
> vertices got created and these vertices with their attributes got added to 
> cassandra via janusgraph, janusgraph automatically created propertyKey for 
> all these attributes (the janusgraph's makePropertyKey method is not called 
> during this process anywhere in the Atlas code). And because internal 
> attributes were getting added to indices in another thread at the same time, 
> when code checked for propertyKey, it was not null and so it did not add the 
> property to the mixed index.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (ATLAS-4389) Best practice or a way to bring in large number of entities on a regular basis.

2021-09-20 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417764#comment-17417764
 ] 

Ashutosh Mestry edited comment on ATLAS-4389 at 9/20/21, 5:37 PM:
--

Sorry for the delay in replying.

Background: Existing implementation of ingest has linear complexity. This is 
done to be able to deal with the create/update/delete message types and the 
temporal nature of these operations.

Here are a few approaches that I have tried that worked as solutions for some 
of our customers:

*Approach 1*

Pre-requisite: Entity creation is in your control. 

Solution:
 * Create topologically sorted entities. Parent entities are created before the 
child entities.
 * Create lineage entities after parent participating entities are created.
 * Use REST APIs to concurrently create entities of a type. Start new type only 
after all entities of a type are exhausted.

This has the advantage of being able to create entities concurrently, as their 
dependencies are already created. This approach gives high throughput and 
continues to maintain consistency of data.

This needs some amount of book-keeping. This may not be a lot if you are 
creating Hive entities and follow a consistent pattern for coming up with names 
for _qualifiedName_ unique attribute.

In my test: I was able to run between 25 to 50 concurrent workers all creating 
entity of a type. 

About code paths: ingest via Kafka queue, entity creation via REST APIs and 
ingest via Import API all follow the same code path.

 


was (Author: ashutoshm):
Sorry for the delay in replying.

Background: Existing implementation of ingest has linear complexity. This is 
done to be able to deal with the create/update/delete message types and the 
temporal nature of these operations.

Here are a few approaches that I have tried that worked as solutions for some 
of our customers:

*Approach 1*

Pre-requisite: Entity creation is in your control. 

Solution: 
 * Create topologically sorted entities. Parent entities are created before the 
child entities. 
 * Create lineage entities after parent participating entities are created.
 * Use REST APIs to concurrently create entities of a type. Start new type only 
after all entities of a type are exhausted.

This has the advantage of being able to create entities concurrently, as their 
dependencies are already created. This approach gives high throughput and 
continues to maintain consistency of data.

This needs some amount of book-keeping. This may not be a lot if you are 
creating Hive entities and follow a consistent pattern for coming up with names 
for _qualifiedName_ unique attribute.

About code paths: ingest via Kafka queue, entity creation via REST APIs and 
ingest via Import API all follow the same code path.

 

> Best practice or a way to bring in large number of entities on a regular 
> basis.
> ---
>
> Key: ATLAS-4389
> URL: https://issues.apache.org/jira/browse/ATLAS-4389
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Saad
>Assignee: Ashutosh Mestry
>Priority: Major
>  Labels: documentation, newbie, performance
> Attachments: image-2021-08-05-11-22-29-259.png, 
> image-2021-08-05-11-23-05-440.png
>
>
> Would you be so kind to let us know if there is any best practice or a way to 
> bring in large number of entities on a regular basis.
> *Our use case:*
> We will be bringing in around 12,000  datasets, 12,000 jobs and 70,000 
> columns. We want to do this as part of our deployment pipeline for other 
> upstream projects.
> At every deploy we want to do the following:
>  - Add the jobs, datasets and columns that are not in Atlas
>  - Update the jobs, datasets and columns that are in Atlas
>  - Delete the jobs from Atlas that are deleted from the upstream systems.
> So far we have considered using the bulk API endpoint (/v2/entity/bulk). This 
> has its own issues. We found that if the payload is too big (in our case, 
> bigger than 300-500 entities) the request times out. The deeper the 
> relationships, the fewer the entities you can send through the bulk endpoint.
> Inspecting some of the code we feel that both REST and streaming data through 
> Kafka follow the same codepath and finally yield the same performance.
> Further, we found that when creating entities the type registry becomes the 
> bottleneck. We discovered this by profiling the JVM. We found that only one 
> core processes the entities and their relationships.
> *Questions:*
> 1- What is the best practice for bulk loading lots of entities in a 
> reasonable time? We are aiming to load 12k jobs, 12k datasets and 70k columns 
> in less than 10 mins.
> 2- Where should we start if we want to scale the API, is there any 

[jira] [Commented] (ATLAS-4389) Best practice or a way to bring in large number of entities on a regular basis.

2021-09-20 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417764#comment-17417764
 ] 

Ashutosh Mestry commented on ATLAS-4389:


Sorry for the delay in replying.

Background: Existing implementation of ingest has linear complexity. This is 
done to be able to deal with the create/update/delete message types and the 
temporal nature of these operations.

Here are a few approaches that I have tried that worked as solutions for some 
of our customers:

*Approach 1*

Pre-requisite: Entity creation is in your control. 

Solution: 
 * Create topologically sorted entities. Parent entities are created before the 
child entities. 
 * Create lineage entities after parent participating entities are created.
 * Use REST APIs to concurrently create entities of a type. Start new type only 
after all entities of a type are exhausted.

This has the advantage of being able to create entities concurrently, as their 
dependencies are already created. This approach gives high throughput and 
continues to maintain consistency of data.

This needs some amount of book-keeping. This may not be a lot if you are 
creating Hive entities and follow a consistent pattern for coming up with names 
for _qualifiedName_ unique attribute.

About code paths: ingest via Kafka queue, entity creation via REST APIs and 
ingest via Import API all follow the same code path.
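The approach above (topologically sorted entities, concurrent workers per type) can be sketched as follows. The entity model and the create_entity call are placeholders, not Atlas client APIs; in practice the worker would POST to the Atlas REST API:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical entity model: (name, type, names of parents it depends on).
entities = [
    ("db1", "hive_db", []),
    ("tbl1", "hive_table", ["db1"]),
    ("tbl2", "hive_table", ["db1"]),
    ("col1", "hive_column", ["tbl1"]),
]

created = []

def create_entity(name):
    # Placeholder for a REST call such as POST /api/atlas/v2/entity.
    created.append(name)

# Topological order guarantees parents come before children.
ts = TopologicalSorter({name: deps for name, _, deps in entities})
order = list(ts.static_order())

# Group by type, preserving the topological order of the types themselves.
types = {name: typ for name, typ, _ in entities}
by_type = defaultdict(list)
for name in order:
    by_type[types[name]].append(name)

# Exhaust one type with concurrent workers before starting the next type.
for typ, names in by_type.items():
    with ThreadPoolExecutor(max_workers=25) as pool:
        pool.map(create_entity, names)

# All hive_db entities finish before any hive_table entity starts.
assert created.index("db1") < created.index("tbl1")
```

The per-type pool mirrors the 25-50 concurrent workers mentioned above; the book-keeping is the dependency map, which is cheap if qualifiedName values follow a consistent pattern.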

 

> Best practice or a way to bring in large number of entities on a regular 
> basis.
> ---
>
> Key: ATLAS-4389
> URL: https://issues.apache.org/jira/browse/ATLAS-4389
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Saad
>Assignee: Ashutosh Mestry
>Priority: Major
>  Labels: documentation, newbie, performance
> Attachments: image-2021-08-05-11-22-29-259.png, 
> image-2021-08-05-11-23-05-440.png
>
>
> Would you be so kind to let us know if there is any best practice or a way to 
> bring in large number of entities on a regular basis.
> *Our use case:*
> We will be bringing in around 12,000  datasets, 12,000 jobs and 70,000 
> columns. We want to do this as part of our deployment pipeline for other 
> upstream projects.
> At every deploy we want to do the following:
>  - Add the jobs, datasets and columns that are not in Atlas
>  - Update the jobs, datasets and columns that are in Atlas
>  - Delete the jobs from Atlas that are deleted from the upstream systems.
> So far we have considered using the bulk API endpoint (/v2/entity/bulk). This 
> has its own issues. We found that if the payload is too big (in our case, 
> bigger than 300-500 entities) the request times out. The deeper the 
> relationships, the fewer the entities you can send through the bulk endpoint.
> Inspecting some of the code we feel that both REST and streaming data through 
> Kafka follow the same codepath and finally yield the same performance.
> Further, we found that when creating entities the type registry becomes the 
> bottleneck. We discovered this by profiling the JVM. We found that only one 
> core processes the entities and their relationships.
> *Questions:*
> 1- What is the best practice for bulk loading lots of entities in a 
> reasonable time? We are aiming to load 12k jobs, 12k datasets and 70k columns 
> in less than 10 mins.
> 2- Where should we start if we want to scale the API, is there any known way 
> to horizontally scale Atlas?
> Here are some of the stats for the load testing we did,
>  
> !image-2021-08-05-11-23-05-440.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4407) Common: Remove Dependency on shared curator JAR

2021-08-29 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4407:
---
Description: There are a couple of places in the module that import the 
_org.apache.curator.shaded_ namespace. Removing this usage will allow for 
removal of the dependency on the referenced JAR.  (was: There are a couple of 
places in INTG where the curator shaded jar is used. This can be removed.)

> Common: Remove Dependency on shared curator JAR
> ---
>
> Key: ATLAS-4407
> URL: https://issues.apache.org/jira/browse/ATLAS-4407
> Project: Atlas
>  Issue Type: Improvement
>  Components: atlas-intg
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> There are a couple of places in the module that import the 
> _org.apache.curator.shaded_ namespace. Removing this usage will allow for 
> removal of the dependency on the referenced JAR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4407) Common: Remove Dependency on shared curator JAR

2021-08-29 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4407:
---
Summary: Common: Remove Dependency on shared curator JAR  (was: INTG: 
Remove Dependency on shared curator JAR)

> Common: Remove Dependency on shared curator JAR
> ---
>
> Key: ATLAS-4407
> URL: https://issues.apache.org/jira/browse/ATLAS-4407
> Project: Atlas
>  Issue Type: Improvement
>  Components: atlas-intg
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> There are a couple of places in INTG where the curator shaded jar is used. 
> This can be removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-4407) INTG: Remove Dependency on shared curator JAR

2021-08-26 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4407:
--

 Summary: INTG: Remove Dependency on shared curator JAR
 Key: ATLAS-4407
 URL: https://issues.apache.org/jira/browse/ATLAS-4407
 Project: Atlas
  Issue Type: Improvement
  Components: atlas-intg
Affects Versions: trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


There are a couple of places in INTG where the curator shaded jar is used. This 
can be removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4396) BAcka to work

2021-08-19 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4396:
---
Description: 
Przechowalnia 24 – Self Storage Warszawa to sieć samoobsługowych magazynów 
kontenerowych. Z powodu, że wynajem kontenerów magazynowych to tania i wygodna 
forma magazynowania, cieszy się ogromną popularnością. W przeciwieństwie do 
klasycznych self storage, kontenery oferują znacznie większą przestrzeń za 
niższą cenę. Ponadto możliwość podjechania samochodem bezpośrednio pod kontener 
znacząco ułatwia transport i magazynowanie. Dlatego magazyny kontenerowe są 
doskonałym rozwiązaniem zarówno dla małych i średnich przedsiębiorstw, jak i 
dla osób prywatnych chcących przechować swoje przedmioty w bezpiecznych 
warunkach.

Aktualnie dysponujemy kontenerami magazynowymi od 7 m2, poprzez 14 m2 jak 
również 29 m2 w lokalizacji na ternie warszawy oraz dwóch lokalizacjach wokół 
Warszawy. Zapraszamy wszystkich chętnych do skorzystania z naszej oferty.

[https://przechowalnia24.pl|https://przechowalnia24.pl/]

 English translation:

---
Przechowalnia 24 - Self Storage Warsaw is a network of self-service container 
warehouses. Due to the fact that the rental of storage containers is a cheap 
and convenient form of storage, it is very popular. Contrary to classic self 
storage, containers offer much more space for a lower price. In addition, the 
possibility of driving the car directly under the container significantly 
facilitates transport and storage. Therefore, container warehouses are a 
perfect solution for both small and medium-sized enterprises as well as for 
private persons who want to store their items in safe conditions.
---

  was:
Przechowalnia 24 – Self Storage Warszawa to sieć samoobsługowych magazynów 
kontenerowych. Z powodu, że wynajem kontenerów magazynowych to tania i wygodna 
forma magazynowania, cieszy się ogromną popularnością. W przeciwieństwie do 
klasycznych self storage, kontenery oferują znacznie większą przestrzeń za 
niższą cenę. Ponadto możliwość podjechania samochodem bezpośrednio pod kontener 
znacząco ułatwia transport i magazynowanie. Dlatego magazyny kontenerowe są 
doskonałym rozwiązaniem zarówno dla małych i średnich przedsiębiorstw, jak i 
dla osób prywatnych chcących przechować swoje przedmioty w bezpiecznych 
warunkach.

Aktualnie dysponujemy kontenerami magazynowymi od 7 m2, poprzez 14 m2 jak 
również 29 m2 w lokalizacji na ternie warszawy oraz dwóch lokalizacjach wokół 
Warszawy. Zapraszamy wszystkich chętnych do skorzystania z naszej oferty.

[https://przechowalnia24.pl|https://przechowalnia24.pl/]

 


> BAcka to work
> -
>
> Key: ATLAS-4396
> URL: https://issues.apache.org/jira/browse/ATLAS-4396
> Project: Atlas
>  Issue Type: Test
>  Components: atlas-webui
>Affects Versions: 2.1.0
>Reporter: Witek Brzęczek
>Priority: Minor
>  Labels: easyfix
> Fix For: 2.1.0
>
>   Original Estimate: 10h
>  Remaining Estimate: 10h
>
> Przechowalnia 24 – Self Storage Warszawa to sieć samoobsługowych magazynów 
> kontenerowych. Z powodu, że wynajem kontenerów magazynowych to tania i 
> wygodna forma magazynowania, cieszy się ogromną popularnością. W 
> przeciwieństwie do klasycznych self storage, kontenery oferują znacznie 
> większą przestrzeń za niższą cenę. Ponadto możliwość podjechania samochodem 
> bezpośrednio pod kontener znacząco ułatwia transport i magazynowanie. Dlatego 
> magazyny kontenerowe są doskonałym rozwiązaniem zarówno dla małych i średnich 
> przedsiębiorstw, jak i dla osób prywatnych chcących przechować swoje 
> przedmioty w bezpiecznych warunkach.
> Aktualnie dysponujemy kontenerami magazynowymi od 7 m2, poprzez 14 m2 jak 
> również 29 m2 w lokalizacji na ternie warszawy oraz dwóch lokalizacjach wokół 
> Warszawy. Zapraszamy wszystkich chętnych do skorzystania z naszej oferty.
> [https://przechowalnia24.pl|https://przechowalnia24.pl/]
>  English translation:
> ---
> Przechowalnia 24 - Self Storage Warsaw is a network of self-service container 
> warehouses. Due to the fact that the rental of storage containers is a cheap 
> and convenient form of storage, it is very popular. Contrary to classic self 
> storage, containers offer much more space for a lower price. In addition, the 
> possibility of driving the car directly under the container significantly 
> facilitates transport and storage. Therefore, container warehouses are a 
> perfect solution for both small and medium-sized enterprises as well as for 
> private persons who want to store their items in safe conditions.
> ---



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-4389) Best practice or a way to bring in large number of entities on a regular basis.

2021-08-10 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4389:
--

Assignee: Ashutosh Mestry

> Best practice or a way to bring in large number of entities on a regular 
> basis.
> ---
>
> Key: ATLAS-4389
> URL: https://issues.apache.org/jira/browse/ATLAS-4389
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Saad
>Assignee: Ashutosh Mestry
>Priority: Major
>  Labels: documentation, newbie, performance
> Attachments: image-2021-08-05-11-22-29-259.png, 
> image-2021-08-05-11-23-05-440.png
>
>
> Would you be so kind to let us know if there is any best practice or a way to 
> bring in large number of entities on a regular basis.
> *Our use case:*
> We will be bringing in around 12,000  datasets, 12,000 jobs and 70,000 
> columns. We want to do this as part of our deployment pipeline for other 
> upstream projects.
> At every deploy we want to do the following:
>  - Add the jobs, datasets and columns that are not in Atlas
>  - Update the jobs, datasets and columns that are in Atlas
>  - Delete the jobs from Atlas that are deleted from the upstream systems.
> So far we have considered using the bulk API endpoint(/v2/entity/bulk). This 
> has its own issues. We found that if the payload is too big in our case 
> bigger than 300-500 entities this times out. The more deeper the 
> relationships the fewer the entities you can send through the bulk endpoint.
> Inspecting some of the code we feel that both REST and streaming data through 
> Kafka follow the same codepath and finally yield the same performance.
> Further we found that when creating entities the type registry becomes the 
> bottle neck. We discovered this by profiling the jvm. We found that only one 
> core processes the the entities and their relationships.
> *Questions:*
> 1- What is the best practice when bulk loading lots on entities in a 
> reasonable time. We are aiming to load 12k jobs, 12k datasets and 70k columns 
> in less than 10 mins.?
> 2- Where should we start if we want to scale the API, is there any known way 
> to horizontally scale Atlas?
> Here are some of the stats for the load testing we did,
>  
> !image-2021-08-05-11-23-05-440.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-2989) Full text search based dsl query is not bringing the results appropriately

2021-08-09 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-2989:
--

Assignee: Pinal Shah

> Full text search based dsl query is not bringing the results appropriately
> --
>
> Key: ATLAS-2989
> URL: https://issues.apache.org/jira/browse/ATLAS-2989
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core, atlas-webui
>Reporter: Abhishek Sharma
>Assignee: Pinal Shah
>Priority: Blocker
>
> Hello
> I have created some entities (like test1, school, college) using a custom 
> type that I have prepared.
> Each entity contains an array of columns. Each column contains an array of 
> sources. Each column has columnAlias as a field.
> Now when I am performing the full text search using the below mentioned query
> [http://172.29.59.59:21000/api/atlas/v2/search/fulltext?excludeDeletedEntities=true=key+where+key='c1'|http://172.29.59.59:21000/api/atlas/v2/search/fulltext?excludeDeletedEntities=true=key+where+key=%27c1%27]
> Even if I provide the key to be something which is incorrect and irrelevant, 
> it is still showing output, and it also shows output on passing the correct 
> key.
> I have the following doubts -
> 1) Whether my syntax for full text search query is incorrect ?
> 2) Same pattern is observed even on querying using basic dsl
> http://172.29.59.81:21000/api/atlas/v2/search/basic?excludeDeletedEntities=true==columns



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4340) Set Solr wait-searcher property to false by default to make Solr commits async

2021-07-09 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4340:
---
Description: 
 In Atlas, when a transaction is committed, the entries are committed to HBase 
(primary storage) and Solr (indexing storage). A transaction is rolled back if 
the primary storage commit fails; on the other hand, when the secondary (Solr) 
commit fails, the transaction is not rolled back, a warning is logged, and it 
is recommended to use reindex to repair the missing index documents. This 
behavior is due to the fact that the primary storage is the source of truth and 
indexes can be rebuilt.

In JanusGraph, there is a property for Solr to make Solr commits async. This is 
set to *true* in Atlas, making every commit wait until the Solr commit is 
successful. This has a negative impact on performance and is recommended to be 
false by default.

Property: *index.[X].solr.wait-searcher*
|When mutating - wait for the index to reflect new mutations before returning. 
This can have a negative impact on performance.|

 

This Jira is about setting the default value for the above property to FALSE; 
it can be overridden if the need arises.

The solution should use the _StandardTransactionLogProcessor_ provided within 
JanusGraph to track failures to indexes (secondary storage in JanusGraph 
parlance) during commit. Using this would provide recovery mechanism in case of 
failures during transaction commit.

  was:
 In Atlas, when a transaction is committed, the entries are committed to HBase 
(primary storage) and Solr (indexing storage). A transaction is rolled back if 
the primary storage commit fails; on the other hand, when the secondary (Solr) 
commit fails, the transaction is not rolled back, a warning is logged, and it 
is recommended to use reindex to repair the missing index documents. This 
behavior is due to the fact that the primary storage is the source of truth and 
indexes can be rebuilt.

In JanusGraph, there is a property for Solr to make Solr commits async. This is 
set to *true* in Atlas, making every commit wait until the Solr commit is 
successful. This has a negative impact on performance and is recommended to be 
false by default.

Property: *index.[X].solr.wait-searcher*
|When mutating - wait for the index to reflect new mutations before returning. 
This can have a negative impact on performance.|

 

This Jira is about setting the default value for the above property to FALSE; 
it can be overridden if the need arises.


> Set Solr wait-searcher property to false by default to make Solr commits async
> --
>
> Key: ATLAS-4340
> URL: https://issues.apache.org/jira/browse/ATLAS-4340
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 2.1.0
>Reporter: Sarath Subramanian
>Assignee: Sarath Subramanian
>Priority: Major
>  Labels: perfomance, solr
> Fix For: 3.0.0, 2.2.0
>
> Attachments: ATLAS-4340-001.patch
>
>
>  In Atlas, when a transaction is committed, the entries are committed to 
> HBase (primary storage) and Solr (indexing storage). A transaction is rolled 
> back if the primary storage commit fails; on the other hand, when the 
> secondary (Solr) commit fails, the transaction is not rolled back, a warning 
> is logged, and it is recommended to use reindex to repair the missing index 
> documents. This behavior is due to the fact that the primary storage is the 
> source of truth and indexes can be rebuilt.
> In JanusGraph, there is a property for Solr to make Solr commits async. This 
> is set to *true* in Atlas, making every commit wait until the Solr commit is 
> successful. This has a negative impact on performance and is recommended to 
> be false by default.
> Property: *index.[X].solr.wait-searcher*
> |When mutating - wait for the index to reflect new mutations before 
> returning. This can have a negative impact on performance.|
>  
> This Jira is about setting the default value for the above property to FALSE; 
> it can be overridden if the need arises.
> The solution should use the _StandardTransactionLogProcessor_ provided within 
> JanusGraph to track failures to indexes (secondary storage in JanusGraph 
> parlance) during commit. Using this would provide recovery mechanism in case 
> of failures during transaction commit.
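As a sketch, the proposed default might be expressed as follows in atlas-application.properties. The 'search' index name and the atlas.graph prefix are assumptions to verify against your deployment, since the Jira only names the generic index.[X].solr.wait-searcher option:

```properties
# Do not block commits waiting for Solr to reflect new mutations;
# missing index entries can be repaired by reindexing from primary storage.
atlas.graph.index.search.solr.wait-searcher=false
```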



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ATLAS-4306) Atlas Spooling: Support for User-specific Spool Directory

2021-07-09 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-4306.

Resolution: Fixed

> Atlas Spooling: Support for User-specific Spool Directory
> -
>
> Key: ATLAS-4306
> URL: https://issues.apache.org/jira/browse/ATLAS-4306
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4306-Support-for-User-specific-Spool-Directory.patch
>
>
> *Background*
> Spooling for messages is enabled for hooks that publish messages to Atlas. 
> This is available for hooks that choose to add the following properties:
> {code:java}
> atlas.hook.spool.enabled=true
> atlas.hook.spool.dir=/spool-dir{code}
> Once the hook is initialized, the directory is created. The user who creates 
> the directory will have read-write permissions to the spool directory.
> There are cases where the hook gets initialized by different users. When 
> that happens, the directory is not accessible to the other user. This causes 
> an initialization failure, and the spooling functionality is not available to 
> the other user.
> *Solution(s)*
> This problem can be circumvented by making the two users part of the same 
> group, which allows the same directory to serve multiple users. 
> Multi-user access is currently supported.
> For the scenario where the users are not part of the same group, the 
> situation described above comes into play. 
> _Approach Used_
> No directory exists: _User1_ first creates the directory specified in the 
> configuration.
> Directory exists, _User1_ has access to the directory: spooling is enabled.
> Directory exists, _User2_ does not have access to the directory: a new 
> directory, with the username suffixed to the existing directory name, is 
> created. The directory created for the user will be
>  
> {code:java}
> /spool-dir-User1 
> {code}
>  
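The directory-fallback behavior described above can be sketched as follows. This is an illustrative sketch only; the class and method names are hypothetical, not the actual Atlas hook code.

```java
import java.io.File;

public class SpoolDirResolver {
    // Resolve a writable spool directory for the given user, per the
    // approach above: create the configured directory if missing; use it
    // if accessible; otherwise fall back to a user-suffixed directory
    // (e.g. /spool-dir-User2).
    static File resolve(String configuredDir, String userName) {
        File dir = new File(configuredDir);
        if (!dir.exists()) {
            dir.mkdirs();               // first user creates the directory
            return dir;
        }
        if (dir.canRead() && dir.canWrite()) {
            return dir;                 // existing directory is usable
        }
        File userDir = new File(configuredDir + "-" + userName);
        userDir.mkdirs();               // per-user fallback directory
        return userDir;
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"), "spool-demo");
        System.out.println(resolve(tmp.getPath(), "User2").getName());
    }
}
```

Note that the fallback keeps the original hook configuration unchanged: only the effective directory differs per user.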





[jira] [Updated] (ATLAS-4336) Restrict attributes of type to be created starting with double underscore

2021-06-13 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4336:
---
Description: 
*Background*

Currently, a type can be created/updated with an attribute whose name starts 
with "__", like __guid.

Example:
{code:java}
"entityDefs": [ { 
  "name": "myType", 
  "superTypes": [], 
  "serviceType": "atlas_core", 
  "typeVersion": "1.0", 
   "attributeDefs": [ 
  { 
 "name": "__myAttrName", 
 "typeName": "string", 
 "cardinality": "SINGLE", 
 "isIndexable": true, 
 "isOptional": true, 
 "isUnique": false } 
] 
  }
]
{code}
This gets mixed up with system attributes.

Hence, attribute names starting with a double underscore, or matching the 
following (system) attribute names, should be rejected when a type is created 
or updated:

__classificationNames
 __modifiedBy
 __createdBy
 __state
 __typeName
 __modificationTimestamp
 __propagatedClassificationNames
 __customAttributes
 __isIncomplete
 __guid
 __timestamp
 __labels

  was:
Currently, a type can be updated with an attribute whose name starts with 
"__", like __guid.

This gets mixed up with system attributes.

Hence, attribute names starting with a double underscore, or matching the 
following (system) attribute names, should be rejected when a type is created 
or updated:

__classificationNames
__modifiedBy
__createdBy
__state
__typeName
__modificationTimestamp
__propagatedClassificationNames
__customAttributes
__isIncomplete
__guid
__timestamp
__labels


> Restrict attributes of type to be created starting with double underscore
> -
>
> Key: ATLAS-4336
> URL: https://issues.apache.org/jira/browse/ATLAS-4336
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Sharmadha S
>Assignee: Mandar Ambawane
>Priority: Major
> Attachments: ATLAS-4336.patch
>
>
> *Background*
> Currently, a type can be created/updated with an attribute whose name starts 
> with "__", like __guid.
> Example:
> {code:java}
> "entityDefs": [ { 
>   "name": "myType", 
>   "superTypes": [], 
>   "serviceType": "atlas_core", 
>   "typeVersion": "1.0", 
>"attributeDefs": [ 
>   { 
>  "name": "__myAttrName", 
>  "typeName": "string", 
>  "cardinality": "SINGLE", 
>  "isIndexable": true, 
>  "isOptional": true, 
>  "isUnique": false } 
> ] 
>   }
> ]
> {code}
> This gets mixed up with system attributes.
> Hence, attribute names starting with a double underscore, or matching the 
> following (system) attribute names, should be rejected when a type is created 
> or updated:
> __classificationNames
>  __modifiedBy
>  __createdBy
>  __state
>  __typeName
>  __modificationTimestamp
>  __propagatedClassificationNames
>  __customAttributes
>  __isIncomplete
>  __guid
>  __timestamp
>  __labels
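A minimal sketch of the validation described above. The helper class is hypothetical, not the actual Atlas type-registry code; the reserved-name list is taken directly from the description.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AttributeNameValidator {
    // System attribute names reserved by Atlas, per the list above.
    private static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
            "__classificationNames", "__modifiedBy", "__createdBy", "__state",
            "__typeName", "__modificationTimestamp",
            "__propagatedClassificationNames", "__customAttributes",
            "__isIncomplete", "__guid", "__timestamp", "__labels"));

    // Reject names starting with "__" (which also covers every reserved
    // system attribute) when a type is created or updated.
    static boolean isValidAttributeName(String name) {
        return name != null && !name.startsWith("__") && !RESERVED.contains(name);
    }
}
```

Since all reserved system attributes themselves start with "__", the prefix check alone is sufficient; the explicit set is kept for clarity.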





[jira] [Commented] (ATLAS-4328) Add flink favicon

2021-06-08 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359766#comment-17359766
 ] 

Ashutosh Mestry commented on ATLAS-4328:


[~jjyeh] Thanks! +1 from me!

> Add flink favicon
> -
>
> Key: ATLAS-4328
> URL: https://issues.apache.org/jira/browse/ATLAS-4328
> Project: Atlas
>  Issue Type: Task
>  Components: atlas-intg
>Reporter: Josh Yeh
>Assignee: Prasad P. Pawar
>Priority: Trivial
> Attachments: flink.png, flink_process.png, 
> image-2021-06-07-08-41-37-559.png
>
>
> While testing ATLAS-3812, I noticed flink_application and flink_process icons 
> are missing. File this Jira to track.
> !image-2021-06-07-08-41-37-559.png!





[jira] [Comment Edited] (ATLAS-4328) Add flink favicon

2021-06-08 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359766#comment-17359766
 ] 

Ashutosh Mestry edited comment on ATLAS-4328 at 6/9/21, 5:34 AM:
-

[~jjyeh] Thanks! +1 from me!

[~prasadpp13] Kindly check if the color scheme is consistent with the rest of the icons.


was (Author: ashutoshm):
[~jjyeh] Thanks! +1 from me!

> Add flink favicon
> -
>
> Key: ATLAS-4328
> URL: https://issues.apache.org/jira/browse/ATLAS-4328
> Project: Atlas
>  Issue Type: Task
>  Components: atlas-intg
>Reporter: Josh Yeh
>Assignee: Prasad P. Pawar
>Priority: Trivial
> Attachments: flink.png, flink_process.png, 
> image-2021-06-07-08-41-37-559.png
>
>
> While testing ATLAS-3812, I noticed flink_application and flink_process icons 
> are missing. File this Jira to track.
> !image-2021-06-07-08-41-37-559.png!





[jira] [Assigned] (ATLAS-4328) Add flink favicon

2021-06-07 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4328:
--

Assignee: Prasad P. Pawar

> Add flink favicon
> -
>
> Key: ATLAS-4328
> URL: https://issues.apache.org/jira/browse/ATLAS-4328
> Project: Atlas
>  Issue Type: Task
>  Components: atlas-intg
>Reporter: Josh Yeh
>Assignee: Prasad P. Pawar
>Priority: Trivial
> Attachments: image-2021-06-07-08-41-37-559.png
>
>
> While testing ATLAS-3812, I noticed flink_application and flink_process icons 
> are missing. File this Jira to track.
> !image-2021-06-07-08-41-37-559.png!





[jira] [Commented] (ATLAS-4329) Update Kafka version to 2.5

2021-06-07 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358753#comment-17358753
 ] 

Ashutosh Mestry commented on ATLAS-4329:


+1 for patch.

> Update Kafka version to 2.5 
> 
>
> Key: ATLAS-4329
> URL: https://issues.apache.org/jira/browse/ATLAS-4329
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.1.0
>Reporter: Sarath Subramanian
>Assignee: Sarath Subramanian
>Priority: Major
>  Labels: kafka, pom
> Fix For: 3.0.0, 2.2.0
>
> Attachments: ATLAS-4329-Update-Kafka-version-to-2.5.patch
>
>
> Atlas uses the following kafka versions for producer and consumer:
> +*Current:*+
>  *  2.0.0
>  * 2.11
> +*New:*+
>  * 2.5.0
>  * 2.12





[jira] [Updated] (ATLAS-4302) Migrated Data: Process Entity Name not set to QualifiedName

2021-06-03 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4302:
---
Attachment: ATLAS-4302-Migrated-Data-Process-Entity-Name-not-set.patch

> Migrated Data: Process Entity Name not set to QualifiedName
> ---
>
> Key: ATLAS-4302
> URL: https://issues.apache.org/jira/browse/ATLAS-4302
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4302-Migrated-Data-Process-Entity-Name-not-set.patch
>
>
> *Background*
> In Atlas v0.8, process names (_hive_process.name_) used _queryText_ as 
> the name. This changed from v1.0 onwards, where _name_ and _qualifiedName_ 
> are the same.
> *Solution*
> Add Java patch that updates the name property.
> *Impact of Not Doing this Update*
> The _queryText_ in _hive_process_ and _hive_columnlineage_ entities tends to 
> be large, and the name field is part of _AtlasEntityHeader_. Search results 
> and lineage display are some of the flows that fetch these entities.
>  





[jira] [Updated] (ATLAS-4302) Migrated Data: Process Entity Name not set to QualifiedName

2021-06-03 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4302:
---
Attachment: (was: 
ATLAS-4302-Update-hive-process-entities-name-to-qual.patch)

> Migrated Data: Process Entity Name not set to QualifiedName
> ---
>
> Key: ATLAS-4302
> URL: https://issues.apache.org/jira/browse/ATLAS-4302
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> *Background*
> In Atlas v0.8, process names (_hive_process.name_) used _queryText_ as 
> the name. This changed from v1.0 onwards, where _name_ and _qualifiedName_ 
> are the same.
> *Solution*
> Add Java patch that updates the name property.
> *Impact of Not Doing this Update*
> The _queryText_ in _hive_process_ and _hive_columnlineage_ entities tends to 
> be large, and the name field is part of _AtlasEntityHeader_. Search results 
> and lineage display are some of the flows that fetch these entities.
>  





[jira] [Resolved] (ATLAS-4310) NPE seen for CLASSIFICATION_PROPAGATION_DELETE Operation

2021-05-26 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-4310.

Resolution: Fixed

> NPE seen for CLASSIFICATION_PROPAGATION_DELETE Operation
> 
>
> Key: ATLAS-4310
> URL: https://issues.apache.org/jira/browse/ATLAS-4310
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: ATLAS-4310-Handled-NPE-for-DELETE-classification.patch
>
>
> *Steps to Duplicate*
>  # Enabled admin tasks
>  # Created an hdfs_path entity
>  # In a loop, 330 times (330 iterations to generate ~1000 audits): 
>  ## Updated the entity (updated path)
>  ## Added tag1
>  ## Removed tag1
> Expected results: Classification is removed.
> Actual results: Classification is removed. Logs indicate NPE:
> {code:java}
> at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
>  at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) 
> at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
>  at 
> org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
>  at 
> org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
>  at 
> org.apache.atlas.repository.store.graph.v2.tasks.ClassificationPropagationTasks$Delete.run(ClassificationPropagationTasks.java:73)
>  at 
> org.apache.atlas.repository.store.graph.v2.tasks.ClassificationTask.perform(ClassificationTask.java:95)
>  at org.apache.atlas.tasks.AbstractTask.run(AbstractTask.java:33) at 
> org.apache.atlas.tasks.TaskExecutor$TaskConsumer.performTask(TaskExecutor.java:150)
>  at 
> org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:109) 
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:834)Caused by: 
> java.lang.NullPointerException at 
> org.apache.atlas.repository.graph.GraphHelper.getTypeName(GraphHelper.java:867)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphRetriever.toAtlasClassification(EntityGraphRetriever.java:334)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2572)
>  ... 18 more2021-05-25 11:07:13,553 ERROR - [atlas-task-0-etp651100072-232 - 
> ceaa7213-1d14-4006-8f84-d94e56f4e829:] ~ Task: 
> c9f7c463-1c5d-4ae9-8232-506fd2c95a28: Error performing task! 
> (ClassificationTask:99)org.apache.atlas.exception.AtlasBaseException: 
> java.lang.NullPointerException at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
>  at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) 
> at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
>  at 
> org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
>  at 
> org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
>  at 
> 

[jira] [Updated] (ATLAS-4310) NPE seen for CLASSIFICATION_PROPAGATION_DELETE Operation

2021-05-26 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4310:
---
Attachment: ATLAS-4310-Handled-NPE-for-DELETE-classification.patch

> NPE seen for CLASSIFICATION_PROPAGATION_DELETE Operation
> 
>
> Key: ATLAS-4310
> URL: https://issues.apache.org/jira/browse/ATLAS-4310
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: ATLAS-4310-Handled-NPE-for-DELETE-classification.patch
>
>
> *Steps to Duplicate*
>  # Enabled admin tasks
>  # Created an hdfs_path entity
>  # In a loop, 330 times (330 iterations to generate ~1000 audits): 
>  ## Updated the entity (updated path)
>  ## Added tag1
>  ## Removed tag1
> Expected results: Classification is removed.
> Actual results: Classification is removed. Logs indicate NPE:
> {code:java}
> at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
>  at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) 
> at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
>  at 
> org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
>  at 
> org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
>  at 
> org.apache.atlas.repository.store.graph.v2.tasks.ClassificationPropagationTasks$Delete.run(ClassificationPropagationTasks.java:73)
>  at 
> org.apache.atlas.repository.store.graph.v2.tasks.ClassificationTask.perform(ClassificationTask.java:95)
>  at org.apache.atlas.tasks.AbstractTask.run(AbstractTask.java:33) at 
> org.apache.atlas.tasks.TaskExecutor$TaskConsumer.performTask(TaskExecutor.java:150)
>  at 
> org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:109) 
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  at java.base/java.lang.Thread.run(Thread.java:834)Caused by: 
> java.lang.NullPointerException at 
> org.apache.atlas.repository.graph.GraphHelper.getTypeName(GraphHelper.java:867)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphRetriever.toAtlasClassification(EntityGraphRetriever.java:334)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2572)
>  ... 18 more2021-05-25 11:07:13,553 ERROR - [atlas-task-0-etp651100072-232 - 
> ceaa7213-1d14-4006-8f84-d94e56f4e829:] ~ Task: 
> c9f7c463-1c5d-4ae9-8232-506fd2c95a28: Error performing task! 
> (ClassificationTask:99)org.apache.atlas.exception.AtlasBaseException: 
> java.lang.NullPointerException at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
>  at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) 
> at 
> org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
>  at 
> org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
>  at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
>  at 
> org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
>  at 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
>  at 
> 

[jira] [Updated] (ATLAS-4310) NPE seen for CLASSIFICATION_PROPAGATION_DELETE Operation

2021-05-26 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4310:
---
Description: 
*Steps to Duplicate*
 # Enabled admin tasks
 # Created an hdfs_path entity
 # In a loop, 330 times (330 iterations to generate ~1000 audits): 
 ## Updated the entity (updated path)
 ## Added tag1
 ## Removed tag1

Expected results: Classification is removed.

Actual results: Classification is removed. Logs indicate NPE:
{code:java}
at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
 at 
org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
 at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationPropagationTasks$Delete.run(ClassificationPropagationTasks.java:73)
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationTask.perform(ClassificationTask.java:95)
 at org.apache.atlas.tasks.AbstractTask.run(AbstractTask.java:33) at 
org.apache.atlas.tasks.TaskExecutor$TaskConsumer.performTask(TaskExecutor.java:150)
 at org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:109) 
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
 at java.base/java.lang.Thread.run(Thread.java:834)Caused by: 
java.lang.NullPointerException at 
org.apache.atlas.repository.graph.GraphHelper.getTypeName(GraphHelper.java:867) 
at 
org.apache.atlas.repository.store.graph.v2.EntityGraphRetriever.toAtlasClassification(EntityGraphRetriever.java:334)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2572)
 ... 18 more2021-05-25 11:07:13,553 ERROR - [atlas-task-0-etp651100072-232 - 
ceaa7213-1d14-4006-8f84-d94e56f4e829:] ~ Task: 
c9f7c463-1c5d-4ae9-8232-506fd2c95a28: Error performing task! 
(ClassificationTask:99)org.apache.atlas.exception.AtlasBaseException: 
java.lang.NullPointerException at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
 at 
org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
 at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationPropagationTasks$Delete.run(ClassificationPropagationTasks.java:73)
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationTask.perform(ClassificationTask.java:95)
 at org.apache.atlas.tasks.AbstractTask.run(AbstractTask.java:33) at 
org.apache.atlas.tasks.TaskExecutor$TaskConsumer.performTask(TaskExecutor.java:150)
 at org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:109) 
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 at 

[jira] [Created] (ATLAS-4310) NPE seen for CLASSIFICATION_PROPAGATION_DELETE Operation

2021-05-26 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4310:
--

 Summary: NPE seen for CLASSIFICATION_PROPAGATION_DELETE Operation
 Key: ATLAS-4310
 URL: https://issues.apache.org/jira/browse/ATLAS-4310
 Project: Atlas
  Issue Type: Bug
  Components:  atlas-core
Affects Versions: trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


*Steps to Duplicate*
 # Enabled admin tasks
 # Created an hdfs_path entity
 # In a loop, 330 times (330 iterations to generate ~1000 audits): 
 ## Updated the entity (updated path)
 ## Added tag1
 ## Removed tag1

Expected results: Classification is removed.

Actual results: Classification is removed. Logs indicate NPE:

 

 
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
 at 
org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
 at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationPropagationTasks$Delete.run(ClassificationPropagationTasks.java:73)
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationTask.perform(ClassificationTask.java:95)
 at org.apache.atlas.tasks.AbstractTask.run(AbstractTask.java:33) at 
org.apache.atlas.tasks.TaskExecutor$TaskConsumer.performTask(TaskExecutor.java:150)
 at org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:109) 
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
 at java.base/java.lang.Thread.run(Thread.java:834)Caused by: 
java.lang.NullPointerException at 
org.apache.atlas.repository.graph.GraphHelper.getTypeName(GraphHelper.java:867) 
at 
org.apache.atlas.repository.store.graph.v2.EntityGraphRetriever.toAtlasClassification(EntityGraphRetriever.java:334)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2572)
 ... 18 more2021-05-25 11:07:13,553 ERROR - [atlas-task-0-etp651100072-232 - 
ceaa7213-1d14-4006-8f84-d94e56f4e829:] ~ Task: 
c9f7c463-1c5d-4ae9-8232-506fd2c95a28: Error performing task! 
(ClassificationTask:99)org.apache.atlas.exception.AtlasBaseException: 
java.lang.NullPointerException at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.deleteClassificationPropagation(EntityGraphMapper.java:2595)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$FastClassBySpringCGLIB$$8e3f1c72.invoke()
 at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:737)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
 at 
org.apache.atlas.GraphTransactionInterceptor.invoke(GraphTransactionInterceptor.java:111)
 at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
 at 
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:672)
 at 
org.apache.atlas.repository.store.graph.v2.EntityGraphMapper$$EnhancerBySpringCGLIB$$96822c39.deleteClassificationPropagation()
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationPropagationTasks$Delete.run(ClassificationPropagationTasks.java:73)
 at 
org.apache.atlas.repository.store.graph.v2.tasks.ClassificationTask.perform(ClassificationTask.java:95)
 at org.apache.atlas.tasks.AbstractTask.run(AbstractTask.java:33) at 
org.apache.atlas.tasks.TaskExecutor$TaskConsumer.performTask(TaskExecutor.java:150)
 at org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:109) 
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
 

[jira] [Updated] (ATLAS-4306) Atlas Spooling: Support for User-specific Spool Directory

2021-05-24 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4306:
---
Attachment: ATLAS-4306-Support-for-User-specific-Spool-Directory.patch

> Atlas Spooling: Support for User-specific Spool Directory
> -
>
> Key: ATLAS-4306
> URL: https://issues.apache.org/jira/browse/ATLAS-4306
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4306-Support-for-User-specific-Spool-Directory.patch
>
>
> *Background*
> Spooling for messages is enabled for hooks that publish messages to Atlas. 
> This is available for hooks that choose to add the following properties:
> {code:java}
> atlas.hook.spool.enabled=true
> atlas.hook.spool.dir=/spool-dir{code}
> Once the hook is initialized, the directory is created. The user who creates 
> the directory will have read-write permissions to the spool directory.
> There are cases where the hook gets initialized by different users. When 
> that happens, the directory is not accessible to the other user. This causes 
> an initialization failure, and the spooling functionality is not available to 
> the other user.
> *Solution(s)*
> This problem can be circumvented by making the two users part of the same 
> group, which allows the same directory to be used by multiple users. 
> Multi-user access is currently supported.
> For the scenario where the users are not part of the same group, the 
> situation described above comes into play.
> _Approach Used_
> No directory exists: _User1_ first creates the directory specified in the 
> configuration.
> Directory exists and _User1_ has access to it: spooling is enabled.
> Directory exists but _User2_ does not have access to it: a new directory, 
> with the username suffixed to the existing directory name, is created. The 
> directory created for the user will be
>  
> {code:java}
> /spool-dir-User1 
> {code}
>  
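The fallback described above can be sketched as follows. This is a hypothetical illustration, not the actual Atlas hook code: the class and method names are invented, and only the directory-selection logic from the description is shown.

```java
import java.io.File;

// Hypothetical sketch of the per-user spool-directory fallback described
// above; names are illustrative, not the actual Atlas implementation.
public class SpoolDirResolver {

    // Returns the configured directory when this user can use it,
    // otherwise a sibling directory suffixed with the user name
    // (e.g. /spool-dir-User1).
    public static File resolveSpoolDir(String configuredDir, String userName) {
        File dir = new File(configuredDir);
        if (!dir.exists()) {
            dir.mkdirs();                 // first user creates the directory
            return dir;
        }
        if (dir.canRead() && dir.canWrite()) {
            return dir;                   // existing directory is accessible
        }
        // Existing directory belongs to another user: create a per-user one.
        File userDir = new File(configuredDir + "-" + userName);
        userDir.mkdirs();
        return userDir;
    }

    public static void main(String[] args) {
        File dir = resolveSpoolDir(
                System.getProperty("java.io.tmpdir") + File.separator + "spool-demo",
                "User1");
        System.out.println(dir.getName());
    }
}
```

With this shape, two users in the same group still share the configured directory; only an inaccessible directory triggers the suffixed fallback.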



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4306) Atlas Spooling: Support for User-specific Spool Directory

2021-05-24 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4306:
---
Description: 
*Background*

Spooling for messages is enabled for hooks that publish messages to Atlas. This 
is available for hooks that choose to add the following properties:
{code:java}
atlas.hook.spool.enabled=true
atlas.hook.spool.dir=/spool-dir{code}
Once the hook is initialized, the directory is created. The user who creates 
the directory has read-write permissions to the spool directory.

There are cases where the hook gets initialized with different users. When that 
happens, the directory is not accessible to the other user. This causes an 
initialization failure, and the spooling functionality is not available to the 
other user.

*Solution(s)*

This problem can be circumvented by making the two users part of the same 
group, which allows the same directory to be used by multiple users. 
Multi-user access is currently supported.

For the scenario where the users are not part of the same group, the situation 
described above comes into play.

_Approach Used_

No directory exists: _User1_ first creates the directory specified in the 
configuration.

Directory exists and _User1_ has access to it: spooling is enabled.

Directory exists but _User2_ does not have access to it: a new directory, with 
the username suffixed to the existing directory name, is created. The directory 
created for the user will be

 
{code:java}
/spool-dir-User1 
{code}
 

  was:
*Background*

Spooling for messages is enabled for hooks that publish messages to Atlas. This 
is available for hooks that choose to add the following properties:

_atlas.hook.spool.enabled=true_
_atlas.hook.spool.dir=/spool-dir_

Once the hook is initialized, the directory is created. The user who creates 
the directory has read-write permissions to the spool directory.

There are cases where the hook gets initialized with different users. When that 
happens, the directory is not accessible to the other user. This causes an 
initialization failure, and the spooling functionality is not available to the 
other user.

*Solution(s)*

This problem can be circumvented by making the two users part of the same 
group, which allows the same directory to be used by multiple users. 
Multi-user access is currently supported.

For the scenario where the users are not part of the same group, the situation 
described above comes into play.

_Approach Used_

No directory exists: _User1_ first creates the directory specified in the 
configuration.

Directory exists and _User1_ has access to it: spooling is enabled.

Directory exists but _User2_ does not have access to it: a new directory, with 
the username suffixed to the existing directory name, is created.

 


> Atlas Spooling: Support for User-specific Spool Directory
> -
>
> Key: ATLAS-4306
> URL: https://issues.apache.org/jira/browse/ATLAS-4306
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> *Background*
> Spooling for messages is enabled for hooks that publish messages to Atlas. 
> This is available for hooks that choose to add the following properties:
> {code:java}
> atlas.hook.spool.enabled=true
> atlas.hook.spool.dir=/spool-dir{code}
> Once the hook is initialized, the directory is created. The user who creates 
> the directory has read-write permissions to the spool directory.
> There are cases where the hook gets initialized with different users. When 
> that happens, the directory is not accessible to the other user. This causes 
> an initialization failure, and the spooling functionality is not available to 
> the other user.
> *Solution(s)*
> This problem can be circumvented by making the two users part of the same 
> group, which allows the same directory to be used by multiple users. 
> Multi-user access is currently supported.
> For the scenario where the users are not part of the same group, the 
> situation described above comes into play.
> _Approach Used_
> No directory exists: _User1_ first creates the directory specified in the 
> configuration.
> Directory exists and _User1_ has access to it: spooling is enabled.
> Directory exists but _User2_ does not have access to it: a new directory, 
> with the username suffixed to the existing directory name, is created. The 
> directory created for the user will be
>  
> {code:java}
> /spool-dir-User1 
> {code}
>  





[jira] [Created] (ATLAS-4306) Atlas Spooling: Support for User-specific Spool Directory

2021-05-24 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4306:
--

 Summary: Atlas Spooling: Support for User-specific Spool Directory
 Key: ATLAS-4306
 URL: https://issues.apache.org/jira/browse/ATLAS-4306
 Project: Atlas
  Issue Type: Bug
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


*Background*

Spooling for messages is enabled for hooks that publish messages to Atlas. This 
is available for hooks that choose to add the following properties:

_atlas.hook.spool.enabled=true_
_atlas.hook.spool.dir=/spool-dir_

Once the hook is initialized, the directory is created. The user who creates 
the directory has read-write permissions to the spool directory.

There are cases where the hook gets initialized with different users. When that 
happens, the directory is not accessible to the other user. This causes an 
initialization failure, and the spooling functionality is not available to the 
other user.

*Solution(s)*

This problem can be circumvented by making the two users part of the same 
group, which allows the same directory to be used by multiple users. 
Multi-user access is currently supported.

For the scenario where the users are not part of the same group, the situation 
described above comes into play.

_Approach Used_

No directory exists: _User1_ first creates the directory specified in the 
configuration.

Directory exists and _User1_ has access to it: spooling is enabled.

Directory exists but _User2_ does not have access to it: a new directory, with 
the username suffixed to the existing directory name, is created.

 





[jira] [Updated] (ATLAS-4302) Migrated Data: Process Entity Name not set to QualifiedName

2021-05-24 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4302:
---
Attachment: ATLAS-4302-Update-hive-process-entities-name-to-qual.patch

> Migrated Data: Process Entity Name not set to QualifiedName
> ---
>
> Key: ATLAS-4302
> URL: https://issues.apache.org/jira/browse/ATLAS-4302
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4302-Update-hive-process-entities-name-to-qual.patch
>
>
> *Background*
> In v0.8 of Atlas, process names (_hive_process.name_) used _queryText_ as 
> the name. This changed in v1.0 onwards, where _name_ and _qualifiedName_ are 
> the same.
> *Solution*
> Add Java patch that updates the name property.
> *Impact of Not Doing this Update*
> The _queryText_ in _hive_process_ and _hive_columnlineage_ entities tends to 
> be large, and the name field is part of _AtlasEntityHeader_. Search results 
> and lineage display are some of the flows that fetch these entities.
>  
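The patch described above essentially rewrites the name attribute from the large queryText-based value to the qualifiedName. A minimal sketch of that update, with a hypothetical attribute-map representation standing in for the real Atlas entity classes:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the name-fix patch described above. The real patch
// operates on Atlas graph entities; a plain attribute map stands in here.
public class ProcessNamePatch {

    // Sets "name" to the value of "qualifiedName", replacing the old
    // queryText-based name carried over from v0.8.
    public static void fixName(Map<String, Object> attributes) {
        Object qualifiedName = attributes.get("qualifiedName");
        if (qualifiedName != null) {
            attributes.put("name", qualifiedName);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("qualifiedName", "default.some_query@cm");
        attrs.put("name", "insert into t select ...");  // old queryText name
        fixName(attrs);
        System.out.println(attrs.get("name"));  // now the qualifiedName
    }
}
```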





[jira] [Updated] (ATLAS-4302) Migrated Data: Process Entity Name not set to QualifiedName

2021-05-24 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4302:
---
Attachment: (was: 
ATLAS-4302-Update-hive-process-entities-name-to-qual.patch)

> Migrated Data: Process Entity Name not set to QualifiedName
> ---
>
> Key: ATLAS-4302
> URL: https://issues.apache.org/jira/browse/ATLAS-4302
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4302-Update-hive-process-entities-name-to-qual.patch
>
>
> *Background*
> In v0.8 of Atlas, process names (_hive_process.name_) used _queryText_ as 
> the name. This changed in v1.0 onwards, where _name_ and _qualifiedName_ are 
> the same.
> *Solution*
> Add Java patch that updates the name property.
> *Impact of Not Doing this Update*
> The _queryText_ in _hive_process_ and _hive_columnlineage_ entities tends to 
> be large, and the name field is part of _AtlasEntityHeader_. Search results 
> and lineage display are some of the flows that fetch these entities.
>  





[jira] [Updated] (ATLAS-4302) Migrated Data: Process Entity Name not set to QualifiedName

2021-05-24 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4302:
---
Attachment: ATLAS-4302-Update-hive-process-entities-name-to-qual.patch

> Migrated Data: Process Entity Name not set to QualifiedName
> ---
>
> Key: ATLAS-4302
> URL: https://issues.apache.org/jira/browse/ATLAS-4302
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4302-Update-hive-process-entities-name-to-qual.patch
>
>
> *Background*
> In v0.8 of Atlas, process names (_hive_process.name_) used _queryText_ as 
> the name. This changed in v1.0 onwards, where _name_ and _qualifiedName_ are 
> the same.
> *Solution*
> Add Java patch that updates the name property.
> *Impact of Not Doing this Update*
> The _queryText_ in _hive_process_ and _hive_columnlineage_ entities tends to 
> be large, and the name field is part of _AtlasEntityHeader_. Search results 
> and lineage display are some of the flows that fetch these entities.
>  





[jira] [Created] (ATLAS-4302) Migrated Data: Process Entity Name not set to QualifiedName

2021-05-24 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4302:
--

 Summary: Migrated Data: Process Entity Name not set to 
QualifiedName
 Key: ATLAS-4302
 URL: https://issues.apache.org/jira/browse/ATLAS-4302
 Project: Atlas
  Issue Type: Bug
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


*Background*

In v0.8 of Atlas, process names (_hive_process.name_) used _queryText_ as the 
name. This changed in v1.0 onwards, where _name_ and _qualifiedName_ are the 
same.

*Solution*

Add Java patch that updates the name property.

*Impact of Not Doing this Update*

The _queryText_ in _hive_process_ and _hive_columnlineage_ entities tends to be 
large, and the name field is part of _AtlasEntityHeader_. Search results and 
lineage display are some of the flows that fetch these entities.

 





[jira] [Updated] (ATLAS-4285) AtlasTasks: Multiple tag propagation tasks running concurrently, task is complete but propagation is not complete

2021-05-13 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4285:
---
Attachment: ATLAS-4285-Multiple-propagations-with-intersecting-l.patch

> AtlasTasks: Multiple tag propagation tasks running concurrently, task is 
> complete but propagation is not complete
> -
>
> Key: ATLAS-4285
> URL: https://issues.apache.org/jira/browse/ATLAS-4285
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4285-Multiple-propagations-with-intersecting-l.patch
>
>
> Created a 500-level linear lineage (table1 ---> table2 ---> table3 ---> 
> .. ---> table500).
> Added tag1 to table1.
> Added tag2 to table2.
> Added tag3 to table3.
> 3 tasks are created.
> task2 got completed, but tag2 is associated only with table2 and not 
> propagated till table500.
> After some time, all tasks were completed, but propagation didn't happen.





[jira] [Updated] (ATLAS-4285) AtlasTasks: Multiple tag propagation tasks running concurrently, task is complete but propagation is not complete

2021-05-13 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4285:
---
Attachment: (was: 
ATLAS-4285-Multiple-propagations-with-intersecting-l.patch)

> AtlasTasks: Multiple tag propagation tasks running concurrently, task is 
> complete but propagation is not complete
> -
>
> Key: ATLAS-4285
> URL: https://issues.apache.org/jira/browse/ATLAS-4285
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> Created a 500-level linear lineage (table1 ---> table2 ---> table3 ---> 
> .. ---> table500).
> Added tag1 to table1.
> Added tag2 to table2.
> Added tag3 to table3.
> 3 tasks are created.
> task2 got completed, but tag2 is associated only with table2 and not 
> propagated till table500.
> After some time, all tasks were completed, but propagation didn't happen.





[jira] [Updated] (ATLAS-4285) AtlasTasks: Multiple tag propagation tasks running concurrently, task is complete but propagation is not complete

2021-05-13 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4285:
---
Attachment: ATLAS-4285-Multiple-propagations-with-intersecting-l.patch

> AtlasTasks: Multiple tag propagation tasks running concurrently, task is 
> complete but propagation is not complete
> -
>
> Key: ATLAS-4285
> URL: https://issues.apache.org/jira/browse/ATLAS-4285
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4285-Multiple-propagations-with-intersecting-l.patch
>
>
> Created a 500-level linear lineage (table1 ---> table2 ---> table3 ---> 
> .. ---> table500).
> Added tag1 to table1.
> Added tag2 to table2.
> Added tag3 to table3.
> 3 tasks are created.
> task2 got completed, but tag2 is associated only with table2 and not 
> propagated till table500.
> After some time, all tasks were completed, but propagation didn't happen.





[jira] [Created] (ATLAS-4285) AtlasTasks: Multiple tag propagation tasks running concurrently, task is complete but propagation is not complete

2021-05-13 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4285:
--

 Summary: AtlasTasks: Multiple tag propagation tasks running 
concurrently, task is complete but propagation is not complete
 Key: ATLAS-4285
 URL: https://issues.apache.org/jira/browse/ATLAS-4285
 Project: Atlas
  Issue Type: Bug
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


Created a 500-level linear lineage (table1 ---> table2 ---> table3 ---> 
.. ---> table500).

Added tag1 to table1.

Added tag2 to table2.

Added tag3 to table3.

3 tasks are created.

task2 got completed, but tag2 is associated only with table2 and not propagated 
till table500.

After some time, all tasks were completed, but propagation didn't happen.





[jira] [Updated] (ATLAS-4164) [Atlas: Spooling] Tables created after spooling are created before the spooled tables when there is multiple frequent restart in kafka brokers

2021-05-11 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4164:
---
Attachment: ATLAS-4164-Spooling-Status.patch

> [Atlas: Spooling] Tables created after spooling are created before the 
> spooled tables when there is multiple frequent restart in kafka brokers
> --
>
> Key: ATLAS-4164
> URL: https://issues.apache.org/jira/browse/ATLAS-4164
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Dharshana M Krishnamoorthy
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: ATLAS-4164-Spooling-Status.patch
>
>
> Scenario:
>  * Stop kafka broker
>  * Create a few (20) tables with the prefix (abc_table_1, abc_table_2, ... 
> abc_table_n)
>  * Make sure the data is spooled
>  * Start kafka and create a few more tables (xyz_table_1, xyz_table_2, ... 
> xyz_table_n)
>  * Wait for 5 mins for the tables to reflect in atlas
> In this case we expect all the abc_table_* to be created before xyz_table_1, 
> meaning all the spooled tables are created before the tables that are created 
> after spooling.
>  
> Observation:
> createTime of some spooled tables is greater than the create time of the 
> xyz_table_1
>  
> Sample data:
> createTime for tables that are spooled:
> {code:java}
> [1613573518284, 1613573531470, 1613573531861, 1613573529446, 1613573543253, 
> 1613573525390, 1613573525950, 1613573517796, 1613573518284, 1613573522629, 
> 1613573513524, 1613573524856, 1613573518992, 1613573519477, 1613573519947, 
> 1613573521737, 1613573514066, 1613573514555, 1613573515065, 
> 1613573515605]{code}
> createTime for tables that are created after spooling:
> {code:java}
> [1613573540582, 1613573541300, 1613573551691, 1613573552628, 1613573553356, 
> 1613573555478, 1613573556275, 1613573556940, 1613573557763, 1613573558659, 
> 1613573560673, 1613573561363, 1613573562310, 1613573563096, 1613573564004, 
> 1613573566533, 1613573567602, 1613573568439, 1613573569379, 1613573570202] 
> {code}
> We expect all spooled tables to have createTime smaller than the tables 
> created after spooling.
> But *1613573543253 (spooled table create time) is greater than 1613573540582 
> (create time of a table created after spooling)*, which means the table 
> created after spooling appears to have been created before the spooled table.
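The expectation stated above can be checked mechanically: every spooled createTime must be smaller than every post-spooling createTime. A small sketch of that check, using a subset of the sample values quoted in this report (the class and method names are illustrative only):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Checks the ordering expectation described above: all spooled tables
// should have a createTime smaller than any table created after spooling.
public class SpoolOrderCheck {

    public static boolean spooledBeforeNew(List<Long> spooled, List<Long> afterSpool) {
        return Collections.max(spooled) < Collections.min(afterSpool);
    }

    public static void main(String[] args) {
        // Sample timestamps taken from the report itself.
        List<Long> spooled    = Arrays.asList(1613573518284L, 1613573543253L, 1613573513524L);
        List<Long> afterSpool = Arrays.asList(1613573540582L, 1613573541300L, 1613573551691L);
        // 1613573543253 > 1613573540582, so the expected ordering is violated.
        System.out.println(spooledBeforeNew(spooled, afterSpool));  // false
    }
}
```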





[jira] [Assigned] (ATLAS-4153) [Atlas: Spooling] The order of the entities created in atlas is not same as the order created in hive

2021-05-10 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4153:
--

Assignee: Ashutosh Mestry

> [Atlas: Spooling] The order of the entities created in atlas is not same as 
> the order created in hive
> -
>
> Key: ATLAS-4153
> URL: https://issues.apache.org/jira/browse/ATLAS-4153
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Dharshana M Krishnamoorthy
>Assignee: Ashutosh Mestry
>Priority: Major
>
> *Steps to re-produce:*
>  * Stop kafka broker
>  * Create a few (20) tables with the prefix (abc_table_1, abc_table_2, ... 
> abc_table_n)
>  * Make sure the data is spooled
>  * Start kafka and create a few more tables (xyz_table_1, xyz_table_2, ... 
> xyz_table_n)
>  * Wait for 5 mins for the tables to reflect in atlas
>  * Fire a basic search with abc_* prefix and verify the tables are created
>  * Collect the createdTime of all the table and verify the order
> Here tables were created with prefix btwxb_table.
> Guid: a55ca87c-1122-4ed4-9812-5047cc8a7968 of *btwxb_table_6* has 
> "createTime": *1612959902106*,
> {code:java}
> {
> "referredEntities": {...},
> "entity": {
> "typeName": "hive_table",
> "attributes": {
> "owner": "hrt_qa",
> "temporary": false,
> "lastAccessTime": 1612959527000,
> "aliases": null,
> "replicatedTo": null,
> "userDescription": null,
> "replicatedFrom": null,
> "qualifiedName": "default.btwxb_table_6@cm",
> "displayName": null,
> "columns": [
> {
> "guid": "3cf7b33b-7fb7-4de4-a967-7f89410b3a00",
> "typeName": "hive_column"
> },
> {
> "guid": "7c5b72ae-e100-467a-b3b2-caade0b7a7a1",
> "typeName": "hive_column"
> }
> ],
> "description": null,
> "viewExpandedText": null,
> "tableType": "MANAGED_TABLE",
> "sd": {
> "guid": "7a1e2b04-f786-4f42-a3c4-31a0ad872256",
> "typeName": "hive_storagedesc"
> },
> "createTime": 1612959527000,
> "name": "btwxb_table_6",
> "comment": null,
> "partitionKeys": [],
> "parameters": {...},
> "retention": 0,
> "viewOriginalText": null,
> "db": {
> "guid": "a8a3227f-ee7a-49c2-8f9e-98534a3146eb",
> "typeName": "hive_db"
> }
> },
> "guid": "a55ca87c-1122-4ed4-9812-5047cc8a7968",
> "isIncomplete": false,
> "status": "ACTIVE",
> "createdBy": "hive",
> "updatedBy": "hrt_qa",
> "createTime": 1612959902106,
> "updateTime": 1612959911756,
> "version": 0,
> "relationshipAttributes": {...},
> "labels": []
> }
> } {code}
> Guid: 70738a7e-d29f-4234-82e7-adbdc4feaed6 of *btwxb_table_1* has 
> "createTime": *1612959903892*, 
> {code:java}
> {
> "referredEntities": {...},
> "entity": {
> "typeName": "hive_table",
> "attributes": {
> "owner": "hrt_qa",
> "temporary": false,
> "lastAccessTime": 1612959523000,
> "aliases": null,
> "replicatedTo": null,
> "userDescription": null,
> "replicatedFrom": null,
> "qualifiedName": "default.btwxb_table_1@cm",
> "displayName": null,
> "columns": [
> {
> "guid": "c283ee8c-d292-406a-a062-d81b12614636",
> "typeName": "hive_column"
> },
> {
> "guid": "cbb19940-e513-4295-a145-5b2d6691ea13",
> "typeName": "hive_column"
> }
> ],
> "description": null,
> "viewExpandedText": null,
> "tableType": "MANAGED_TABLE",
> "sd": {
> "guid": "69d7f6fc-8194-412f-a723-5b142ade1c31",
> "typeName": "hive_storagedesc"
> },
> "createTime": 1612959523000,
> "name": "btwxb_table_1",
> "comment": null,
> "partitionKeys": [],
> "parameters": {...},
> "retention": 0,
> "viewOriginalText": null,
> "db": {
> "guid": "a8a3227f-ee7a-49c2-8f9e-98534a3146eb",
> "typeName": "hive_db"
> }
> },
> "guid": "70738a7e-d29f-4234-82e7-adbdc4feaed6",
> 

[jira] [Reopened] (ATLAS-4152) [Atlas: Spooling] Multiple entries are created for same table when the table is dropped while kafka is down

2021-05-03 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reopened ATLAS-4152:


There's another addition that will need to be done to process entities that 
have been spooled out of a queue.

> [Atlas: Spooling] Multiple entries are created for same table when the table 
> is dropped while kafka is down
> ---
>
> Key: ATLAS-4152
> URL: https://issues.apache.org/jira/browse/ATLAS-4152
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Dharshana M Krishnamoorthy
>Assignee: Ashutosh Mestry
>Priority: Major
>
> A single table has multiple duplicate entries when the table is dropped while 
> kafka is down (Spooling scenario)
> Steps to re-produce:
>  * Stop kafka broker
>  * Create a few (20) tables with the prefix (abc_table_1, abc_table_2, ... 
> abc_table_n)
>  * Make sure the data is spooled
>  * Start kafka and create a few more tables (xyz_table_1, xyz_table_2, ... 
> xyz_table_n)
>  * Wait for 5 mins for the tables to reflect in atlas
>  * Fire a basic search with abc_* prefix and verify the tables are created
>  * Collect the createdTime of all the table and verify the order
> Here tables were created with prefix btwxb_table.
> *btwxb_table_5, btwxb_table_10, btwxb_table_15 and btwxb_table_20* are 
> dropped when kafka is down
>  Each of those tables has a total of 3 entries per table_name. All the 
> details are the same except the guid.
> {code:java}
> {
> "searchParameters":{
> "includeSubTypes":true,
> "excludeDeletedEntities":false,
> "includeSubClassifications":true,
> "typeName":"hive_table",
> "limit":40,
> "offset":0,
> "includeClassificationAttributes":false,
> "query":"btwxb*"
> },
> "queryText":"btwxb*",
> "approximateCount":28,
> "queryType":"BASIC",
> "entities":[
> {
> "status":"DELETED",
> "isIncomplete":false,
> "guid":"9c348843-ebf0-4a0f-a909-4cdbf75ea39d",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_15",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_15@cm",
> "createTime":1612959528000,
> "name":"btwxb_table_15"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"DELETED",
> "isIncomplete":false,
> "guid":"a25fbfc3-f6fb-4f9f-bce4-1da3f2de5921",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_5",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_5@cm",
> "createTime":1612959525000,
> "name":"btwxb_table_5"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"ACTIVE",
> "isIncomplete":false,
> "guid":"bf6de3f6-48f2-4941-b5f2-a1ef3756ffa2",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_8",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_8@cm",
> "createTime":1612959527000,
> "name":"btwxb_table_8"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"ACTIVE",
> "isIncomplete":false,
> "guid":"977adf75-9336-484f-91ed-7976817fb729",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_7",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_7@cm",
> "createTime":1612959527000,
> "name":"btwxb_table_7"
> 

[jira] [Resolved] (ATLAS-4152) [Atlas: Spooling] Multiple entries are created for same table when the table is dropped while kafka is down

2021-04-27 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-4152.

Resolution: Fixed

Implementation of ATLAS-4204 addresses this problem.

> [Atlas: Spooling] Multiple entries are created for same table when the table 
> is dropped while kafka is down
> ---
>
> Key: ATLAS-4152
> URL: https://issues.apache.org/jira/browse/ATLAS-4152
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Dharshana M Krishnamoorthy
>Assignee: Ashutosh Mestry
>Priority: Major
>
> A single table has multiple duplicate entries when the table is dropped while 
> kafka is down (Spooling scenario)
> Steps to re-produce:
>  * Stop kafka broker
>  * Create a few (20) tables with the prefix (abc_table_1, abc_table_2, ... 
> abc_table_n)
>  * Make sure the data is spooled
>  * Start kafka and create a few more tables (xyz_table_1, xyz_table_2, ... 
> xyz_table_n)
>  * Wait for 5 mins for the tables to reflect in atlas
>  * Fire a basic search with abc_* prefix and verify the tables are created
>  * Collect the createdTime of all the table and verify the order
> Here tables were created with prefix btwxb_table.
> *btwxb_table_5, btwxb_table_10, btwxb_table_15 and btwxb_table_20* are 
> dropped when kafka is down
>  Each of those tables has a total of 3 entries per table_name. All the 
> details are the same except the guid.
> {code:java}
> {
> "searchParameters":{
> "includeSubTypes":true,
> "excludeDeletedEntities":false,
> "includeSubClassifications":true,
> "typeName":"hive_table",
> "limit":40,
> "offset":0,
> "includeClassificationAttributes":false,
> "query":"btwxb*"
> },
> "queryText":"btwxb*",
> "approximateCount":28,
> "queryType":"BASIC",
> "entities":[
> {
> "status":"DELETED",
> "isIncomplete":false,
> "guid":"9c348843-ebf0-4a0f-a909-4cdbf75ea39d",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_15",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_15@cm",
> "createTime":1612959528000,
> "name":"btwxb_table_15"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"DELETED",
> "isIncomplete":false,
> "guid":"a25fbfc3-f6fb-4f9f-bce4-1da3f2de5921",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_5",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_5@cm",
> "createTime":1612959525000,
> "name":"btwxb_table_5"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"ACTIVE",
> "isIncomplete":false,
> "guid":"bf6de3f6-48f2-4941-b5f2-a1ef3756ffa2",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_8",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_8@cm",
> "createTime":1612959527000,
> "name":"btwxb_table_8"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"ACTIVE",
> "isIncomplete":false,
> "guid":"977adf75-9336-484f-91ed-7976817fb729",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_7",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_7@cm",
> "createTime":1612959527000,
> "name":"btwxb_table_7"
> },
> 

[jira] [Updated] (ATLAS-4258) AtlasTasks: When propagate flag is flipped to false while "CLASSIFICATION_PROPAGATION_ADD" is pending , tag is propagated at the end

2021-04-23 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4258:
---
Summary: AtlasTasks: When propagate flag is flipped to false while 
"CLASSIFICATION_PROPAGATION_ADD" is pending , tag is propagated at the end  
(was: DeferredActions : When propagate flag is flipped to false while 
"CLASSIFICATION_PROPAGATION_ADD" is pending , tag is propagated at the end)

> AtlasTasks: When propagate flag is flipped to false while 
> "CLASSIFICATION_PROPAGATION_ADD" is pending , tag is propagated at the end
> 
>
> Key: ATLAS-4258
> URL: https://issues.apache.org/jira/browse/ATLAS-4258
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Sharmadha S
>Assignee: Ashutosh Mestry
>Priority: Major
>
> Associate tag1 to table1 with propagate set to True.
> Since it has a 2000-level lineage, the task takes time and is in PENDING 
> state.
> Now flip the propagate flag to False at table1. There is a 
> CLASSIFICATION_PROPAGATION_DELETE task, which gets completed soon.
> After some time, "CLASSIFICATION_PROPAGATION_ADD" gets completed.
> Now the tag is propagated to all entities down the 2000-level lineage.





[jira] [Updated] (ATLAS-4256) Deferred actions : When failover happens , the deferred tasks are set to COMPLETED without getting executed

2021-04-22 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4256:
---
Attachment: ATLAS-4256-Elegant-handling-of-tasks-in-f.patch

> Deferred actions : When failover happens , the deferred tasks are set to 
> COMPLETED without getting executed
> ---
>
> Key: ATLAS-4256
> URL: https://issues.apache.org/jira/browse/ATLAS-4256
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Sharmadha S
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: ATLAS-4256-Elegant-handling-of-tasks-in-f.patch
>
>
> Atlas is running on server1 and server2; server1 is currently ACTIVE.
> Created a 2000-level lineage and added a tag to the start of the lineage. 
> The deferred task for tag propagation started.
> Stopped server1; server2 then became ACTIVE.
> server1 threw the following exception:
> {code}
> 2021-04-20 20:07:21,137 ERROR - [atlas-task-0-etp1479696465-120 - 
> 541aca55-2402-43ad-911b-6756d9899b12:] ~ Error executing task. Please perform 
> the operation again! (TaskExecutor$TaskLogger:178)2021-04-20 20:07:21,137 
> ERROR - [atlas-task-0-etp1479696465-120 - 
> 541aca55-2402-43ad-911b-6756d9899b12:] ~ Error executing task. Please perform 
> the operation again! 
> (TaskExecutor$TaskLogger:178)java.lang.IllegalStateException: Graph has been 
> closed at 
> org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph.getAutoStartTx(JanusGraphBlueprintsGraph.java:76)
>  at 
> org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph.query(JanusGraphBlueprintsGraph.java:176)
>  at 
> org.apache.atlas.repository.graphdb.janus.query.NativeJanusGraphQuery.<init>(NativeJanusGraphQuery.java:59)
>  at 
> org.apache.atlas.repository.graphdb.janus.query.AtlasJanusGraphQuery.createNativeTinkerpopQuery(AtlasJanusGraphQuery.java:54)
>  at 
> org.apache.atlas.repository.graphdb.tinkerpop.query.expr.AndCondition.create(AndCondition.java:85)
>  at 
> org.apache.atlas.repository.graphdb.tinkerpop.query.TinkerpopGraphQuery.vertices(TinkerpopGraphQuery.java:136)
>  at org.apache.atlas.tasks.TaskRegistry.getVertex(TaskRegistry.java:140) at 
> org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:91) at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  
> {code}
> server2's /api/atlas/admin/tasks now showed the PENDING tasks.
> Started server1 again.
> Now all PENDING tasks are marked COMPLETED without getting executed.
> {code}
> 2021-04-20 20:09:35,450 INFO  - [main:] ~ TaskManagement: Found: 3: Tasks in 
> pending state. (TaskManagement:195)
> 2021-04-20 20:09:35,500 INFO  - [main:] ~ 
> {"type":"CLASSIFICATION_PROPAGATION_ADD","guid":"81418bdb-5076-458a-abbd-7894d43d0408","createdBy":"hrt_qa","createdTime":1618949204339,"updatedTime":1618949204339,"parameters":{"relationshipGuid":null,"entityGuid":"1231b504-f6b7-45df-829b-b430a9e7c0d6","classificationVertexId":"4272"},"attemptCount":0,"status":"PENDING"}
>  (TaskExecutor$TaskLogger:170)
> 2021-04-20 20:09:35,503 INFO  - [main:] ~ 
> {"type":"CLASSIFICATION_PROPAGATION_ADD","guid":"1130d28f-e2a6-49a6-ad99-7642042681a3","createdBy":"hrt_qa","createdTime":1618949205136,"updatedTime":1618949205136,"parameters":{"relationshipGuid":null,"entityGuid":"4c2db030-52fe-419c-aa79-c37db0908502","classificationVertexId":"81924136"},"attemptCount":0,"status":"PENDING"}
>  (TaskExecutor$TaskLogger:170)
> 2021-04-20 20:09:35,503 INFO  - [main:] ~ 
> {"type":"CLASSIFICATION_PROPAGATION_ADD","guid":"dc4e5057-8e13-4234-9ba5-91e18d54a24d","createdBy":"hrt_qa","createdTime":1618949206861,"updatedTime":1618949206861,"parameters":{"relationshipGuid":null,"entityGuid":"c9f543a6-033c-45c7-ab23-a5476b6fad9c","classificationVertexId":"40964200"},"attemptCount":0,"status":"PENDING"}
>  (TaskExecutor$TaskLogger:170)
> 2021-04-20 20:09:35,532 INFO  - [atlas-task-0-main:] ~ GraphTransaction 
> intercept for 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.propagateClassification
>  (GraphTransactionAdvisor$1:41)
> 2021-04-20 20:09:35,804 INFO  - [main:] ~ Atlas is in HA Mode, enabling 
> ActiveServerFilter (AtlasSecurityConfig:167)
> 2021-04-20 20:09:36,203 INFO  - [atlas-task-0-main:] ~ 
> 
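A safe recovery path is to mark a recovered task COMPLETE only after its executor actually finishes. A sketch of that invariant (Python, hypothetical names; Atlas's real recovery logic lives in TaskManagement/TaskExecutor):

```python
def recover_pending_tasks(tasks, execute):
    """Re-run tasks found in PENDING state after a failover. A task is
    marked COMPLETE only when its executor returns without error; a task
    that fails (e.g. 'Graph has been closed') stays PENDING for retry."""
    for task in tasks:
        if task["status"] != "PENDING":
            continue
        try:
            execute(task)
            task["status"] = "COMPLETE"
        except Exception:
            pass  # leave status as PENDING; do not mark it done

tasks = [{"guid": "t1", "status": "PENDING"},
         {"guid": "t2", "status": "PENDING"}]

def execute(task):
    if task["guid"] == "t2":
        raise RuntimeError("Graph has been closed")

recover_pending_tasks(tasks, execute)
print([t["status"] for t in tasks])   # ['COMPLETE', 'PENDING']
```

The key point is that completion status is derived from execution, never set as a side effect of being discovered during startup.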

[jira] [Assigned] (ATLAS-4256) Deferred actions: When failover happens, the deferred tasks are set to COMPLETED without getting executed

2021-04-22 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4256:
--

Assignee: Ashutosh Mestry

> Deferred actions: When failover happens, the deferred tasks are set to 
> COMPLETED without getting executed
> ---
>
> Key: ATLAS-4256
> URL: https://issues.apache.org/jira/browse/ATLAS-4256
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Sharmadha S
>Assignee: Ashutosh Mestry
>Priority: Major
>
> Atlas is running on server1 and server2; server1 is currently ACTIVE.
> Created a 2000-level lineage and added a tag to the start of the lineage. 
> The deferred task for tag propagation started.
> Stopped server1; server2 then became ACTIVE.
> server1 threw the following exception:
> {code}
> 2021-04-20 20:07:21,137 ERROR - [atlas-task-0-etp1479696465-120 - 
> 541aca55-2402-43ad-911b-6756d9899b12:] ~ Error executing task. Please perform 
> the operation again! (TaskExecutor$TaskLogger:178)2021-04-20 20:07:21,137 
> ERROR - [atlas-task-0-etp1479696465-120 - 
> 541aca55-2402-43ad-911b-6756d9899b12:] ~ Error executing task. Please perform 
> the operation again! 
> (TaskExecutor$TaskLogger:178)java.lang.IllegalStateException: Graph has been 
> closed at 
> org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph.getAutoStartTx(JanusGraphBlueprintsGraph.java:76)
>  at 
> org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsGraph.query(JanusGraphBlueprintsGraph.java:176)
>  at 
> org.apache.atlas.repository.graphdb.janus.query.NativeJanusGraphQuery.<init>(NativeJanusGraphQuery.java:59)
>  at 
> org.apache.atlas.repository.graphdb.janus.query.AtlasJanusGraphQuery.createNativeTinkerpopQuery(AtlasJanusGraphQuery.java:54)
>  at 
> org.apache.atlas.repository.graphdb.tinkerpop.query.expr.AndCondition.create(AndCondition.java:85)
>  at 
> org.apache.atlas.repository.graphdb.tinkerpop.query.TinkerpopGraphQuery.vertices(TinkerpopGraphQuery.java:136)
>  at org.apache.atlas.tasks.TaskRegistry.getVertex(TaskRegistry.java:140) at 
> org.apache.atlas.tasks.TaskExecutor$TaskConsumer.run(TaskExecutor.java:91) at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  
> {code}
> server2's /api/atlas/admin/tasks now showed the PENDING tasks.
> Started server1 again.
> Now all PENDING tasks are marked COMPLETED without getting executed.
> {code}
> 2021-04-20 20:09:35,450 INFO  - [main:] ~ TaskManagement: Found: 3: Tasks in 
> pending state. (TaskManagement:195)
> 2021-04-20 20:09:35,500 INFO  - [main:] ~ 
> {"type":"CLASSIFICATION_PROPAGATION_ADD","guid":"81418bdb-5076-458a-abbd-7894d43d0408","createdBy":"hrt_qa","createdTime":1618949204339,"updatedTime":1618949204339,"parameters":{"relationshipGuid":null,"entityGuid":"1231b504-f6b7-45df-829b-b430a9e7c0d6","classificationVertexId":"4272"},"attemptCount":0,"status":"PENDING"}
>  (TaskExecutor$TaskLogger:170)
> 2021-04-20 20:09:35,503 INFO  - [main:] ~ 
> {"type":"CLASSIFICATION_PROPAGATION_ADD","guid":"1130d28f-e2a6-49a6-ad99-7642042681a3","createdBy":"hrt_qa","createdTime":1618949205136,"updatedTime":1618949205136,"parameters":{"relationshipGuid":null,"entityGuid":"4c2db030-52fe-419c-aa79-c37db0908502","classificationVertexId":"81924136"},"attemptCount":0,"status":"PENDING"}
>  (TaskExecutor$TaskLogger:170)
> 2021-04-20 20:09:35,503 INFO  - [main:] ~ 
> {"type":"CLASSIFICATION_PROPAGATION_ADD","guid":"dc4e5057-8e13-4234-9ba5-91e18d54a24d","createdBy":"hrt_qa","createdTime":1618949206861,"updatedTime":1618949206861,"parameters":{"relationshipGuid":null,"entityGuid":"c9f543a6-033c-45c7-ab23-a5476b6fad9c","classificationVertexId":"40964200"},"attemptCount":0,"status":"PENDING"}
>  (TaskExecutor$TaskLogger:170)
> 2021-04-20 20:09:35,532 INFO  - [atlas-task-0-main:] ~ GraphTransaction 
> intercept for 
> org.apache.atlas.repository.store.graph.v2.EntityGraphMapper.propagateClassification
>  (GraphTransactionAdvisor$1:41)
> 2021-04-20 20:09:35,804 INFO  - [main:] ~ Atlas is in HA Mode, enabling 
> ActiveServerFilter (AtlasSecurityConfig:167)
> 2021-04-20 20:09:36,203 INFO  - [atlas-task-0-main:] ~ 
> {"type":"CLASSIFICATION_PROPAGATION_ADD","guid":"81418bdb-5076-458a-abbd-7894d43d0408","createdBy":"hrt_qa","createdTime":1618949204339,"updatedTime":1618949204339,"endTime":1618949375617,"parameters":{"relationshipGuid":null,"entityGuid":"1231b504-f6b7-45df-829b-b430a9e7c0d6","classificationVertexId":"4272"},"attemptCount":0,"status":"COMPLETE"}
>  

[jira] [Updated] (ATLAS-4204) Hive Hook: Configure HiveServer2 Hook to send Lineage-only Messages

2021-04-12 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4204:
---
Summary: Hive Hook: Configure HiveServer2 Hook to send Lineage-only 
Messages  (was: Hive Hook: Improve HS2 Messages)

> Hive Hook: Configure HiveServer2 Hook to send Lineage-only Messages
> ---
>
> Key: ATLAS-4204
> URL: https://issues.apache.org/jira/browse/ATLAS-4204
> Project: Atlas
>  Issue Type: Improvement
>  Components: hive-integration
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> *Background*
> The HiveServer2 hook for Atlas sends notification messages for both metadata 
> (DDL operations) and lineage (DML operations).
> The Hive Metastore (HMS) hook already sends metadata information to Atlas. 
> These messages are all DDL operations.
> So duplicate messages about object updates are sent to Atlas, and Atlas 
> processes them like any other.
> This adds processing time and increases volume. There is also a potential for 
> incorrect data being updated within Atlas if the sequence of messages from 
> HMS and HS2 gets changed.
> *Solution*
> This improvement will send only lineage messages from the HS2 hook. All the 
> DDL (schema definition) messages will continue to be sent from the HMS hook 
> (no change here).
> This will also reduce the volume of messages sent to Atlas from HiveServer2 
> and will help improve performance by avoiding processing of duplicate 
> messages.
> The improvement can be enabled via a configuration parameter; that way, 
> existing behavior continues as is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4204) Hive Hook: Improve HS2 Messages

2021-03-17 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4204:
---
Summary: Hive Hook: Improve HS2 Messages  (was: Hive Hook: Improve HS2 
Message Sending)

> Hive Hook: Improve HS2 Messages
> ---
>
> Key: ATLAS-4204
> URL: https://issues.apache.org/jira/browse/ATLAS-4204
> Project: Atlas
>  Issue Type: Improvement
>  Components: hive-integration
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> *Background*
> The HiveServer2 hook for Atlas sends notification messages for both metadata 
> (DDL operations) and lineage (DML operations).
> The Hive Metastore (HMS) hook already sends metadata information to Atlas. 
> These messages are all DDL operations.
> So duplicate messages about object updates are sent to Atlas, and Atlas 
> processes them like any other.
> This adds processing time and increases volume. There is also a potential for 
> incorrect data being updated within Atlas if the sequence of messages from 
> HMS and HS2 gets changed.
> *Solution*
> This improvement will send only lineage messages from the HS2 hook. All the 
> DDL (schema definition) messages will continue to be sent from the HMS hook 
> (no change here).
> This will also reduce the volume of messages sent to Atlas from HiveServer2 
> and will help improve performance by avoiding processing of duplicate 
> messages.
> The improvement can be enabled via a configuration parameter; that way, 
> existing behavior continues as is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4204) Hive Hook: Improve HS2 Message Sending

2021-03-15 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4204:
---
Component/s: hive-integration

> Hive Hook: Improve HS2 Message Sending
> --
>
> Key: ATLAS-4204
> URL: https://issues.apache.org/jira/browse/ATLAS-4204
> Project: Atlas
>  Issue Type: Improvement
>  Components: hive-integration
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> *Background*
> The HiveServer2 hook for Atlas sends notification messages for both metadata 
> (DDL operations) and lineage (DML operations).
> The Hive Metastore (HMS) hook already sends metadata information to Atlas. 
> These messages are all DDL operations.
> So duplicate messages about object updates are sent to Atlas, and Atlas 
> processes them like any other.
> This adds processing time and increases volume. There is also a potential for 
> incorrect data being updated within Atlas if the sequence of messages from 
> HMS and HS2 gets changed.
> *Solution*
> This improvement will send only lineage messages from the HS2 hook. All the 
> DDL (schema definition) messages will continue to be sent from the HMS hook 
> (no change here).
> This will also reduce the volume of messages sent to Atlas from HiveServer2 
> and will help improve performance by avoiding processing of duplicate 
> messages.
> The improvement can be enabled via a configuration parameter; that way, 
> existing behavior continues as is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-4204) Hive Hook: Improve HS2 Message Sending

2021-03-15 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4204:
--

 Summary: Hive Hook: Improve HS2 Message Sending
 Key: ATLAS-4204
 URL: https://issues.apache.org/jira/browse/ATLAS-4204
 Project: Atlas
  Issue Type: Improvement
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


*Background*

The HiveServer2 hook for Atlas sends notification messages for both metadata 
(DDL operations) and lineage (DML operations).

The Hive Metastore (HMS) hook already sends metadata information to Atlas. 
These messages are all DDL operations.

So duplicate messages about object updates are sent to Atlas, and Atlas 
processes them like any other.

This adds processing time and increases volume. There is also a potential for 
incorrect data being updated within Atlas if the sequence of messages from HMS 
and HS2 gets changed.

*Solution*

This improvement will send only lineage messages from the HS2 hook. All the DDL 
(schema definition) messages will continue to be sent from the HMS hook (no 
change here).

This will also reduce the volume of messages sent to Atlas from HiveServer2 and 
will help improve performance by avoiding processing of duplicate messages.

The improvement can be enabled via a configuration parameter; that way, 
existing behavior continues as is.
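The configuration-gated filter described above can be sketched as follows (Python; the operation groupings and flag name are hypothetical, since the real hook classifies Hive operations internally):

```python
# Hypothetical operation groupings; the real hook classifies HiveOperation values.
DDL_OPS = {"CREATETABLE", "ALTERTABLE_ADDCOLS", "DROPTABLE", "CREATEDATABASE"}
DML_OPS = {"QUERY", "LOAD", "EXPORT", "IMPORT"}   # lineage-producing operations

def should_send(operation: str, lineage_only: bool) -> bool:
    """Decide whether the HS2 hook should emit a notification.

    With the lineage-only flag off, behavior is unchanged: everything is
    sent. With it on, DDL is suppressed (HMS already covers metadata) and
    only lineage (DML) messages go out.
    """
    return not lineage_only or operation in DML_OPS

print(should_send("CREATETABLE", lineage_only=True))    # False: left to HMS
print(should_send("QUERY", lineage_only=True))          # True: lineage kept
print(should_send("CREATETABLE", lineage_only=False))   # True: old behavior
```

Keeping the flag default off preserves backward compatibility, exactly as the issue proposes.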



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-4165) [Atlas: Spooling] There is no create audit for a table that is spooled but instead creates several update audits for a create operation

2021-02-18 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4165:
--

Assignee: Ashutosh Mestry

> [Atlas: Spooling] There is no create audit for a table that is spooled but 
> instead creates several update audits for a create operation
> ---
>
> Key: ATLAS-4165
> URL: https://issues.apache.org/jira/browse/ATLAS-4165
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core, atlas-webui
>Reporter: Dharshana M Krishnamoorthy
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: Screenshot 2021-02-18 at 4.20.15 PM.png, Screenshot 
> 2021-02-18 at 4.59.45 PM.png, Screenshot 2021-02-18 at 5.00.13 PM.png
>
>
> When a table is spooled and created in Atlas, it has no create audit entry. 
> Steps to repro:
>  # Bring Kafka down
>  # Create a table
>  # Wait for it to spool
>  # Start the kafka brokers
>  # Wait for the table to be created in atlas
> Now check the audit tab of that entity
> !Screenshot 2021-02-18 at 4.20.15 PM.png|width=679,height=277!
> NOTE:
> It creates several update audit entries for just one create operation.
> This is observed for the *very first table* that is created via *Impala*.
> *!Screenshot 2021-02-18 at 4.59.45 PM.png|width=431,height=175!*
> !Screenshot 2021-02-18 at 5.00.13 PM.png|width=369,height=100!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-4164) [Atlas: Spooling] Tables created after spooling are created before the spooled tables when there are multiple frequent restarts of the Kafka brokers

2021-02-17 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4164:
--

Assignee: Ashutosh Mestry

> [Atlas: Spooling] Tables created after spooling are created before the 
> spooled tables when there are multiple frequent restarts of the Kafka brokers
> --
>
> Key: ATLAS-4164
> URL: https://issues.apache.org/jira/browse/ATLAS-4164
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Dharshana M Krishnamoorthy
>Assignee: Ashutosh Mestry
>Priority: Major
>
> Scenario:
>  * Stop the Kafka broker
>  * Create a few (20) tables with the same prefix (abc_table_1, abc_table_2, ... 
> abc_table_n)
>  * Make sure the data is spooled
>  * Start Kafka and create a few more tables (xyz_table_1, xyz_table_2, ... 
> xyz_table_n)
>  * Wait for 5 minutes for the tables to reflect in Atlas
> In this case we expect all the abc_table_* tables to be created before 
> xyz_table_1, meaning all the spooled tables are created before the tables 
> created after spooling.
>  
> Observation:
> The createTime of some spooled tables is greater than the createTime of 
> xyz_table_1.
>  
> Sample data:
> createTime for tables that are spooled:
> {code:java}
> [1613573518284, 1613573531470, 1613573531861, 1613573529446, 1613573543253, 
> 1613573525390, 1613573525950, 1613573517796, 1613573518284, 1613573522629, 
> 1613573513524, 1613573524856, 1613573518992, 1613573519477, 1613573519947, 
> 1613573521737, 1613573514066, 1613573514555, 1613573515065, 
> 1613573515605]{code}
> createTime for tables that are created after spooling:
> {code:java}
> [1613573540582, 1613573541300, 1613573551691, 1613573552628, 1613573553356, 
> 1613573555478, 1613573556275, 1613573556940, 1613573557763, 1613573558659, 
> 1613573560673, 1613573561363, 1613573562310, 1613573563096, 1613573564004, 
> 1613573566533, 1613573567602, 1613573568439, 1613573569379, 1613573570202] 
> {code}
> *1613573543253 > 1613573540582*
>  which means a table created after spooling was created before a spooled table
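One way to preserve ordering is to route new messages through the spool whenever it is non-empty, so that recovery drains everything first-in first-out before live traffic resumes. A minimal sketch of that ordering guarantee (Python, hypothetical class name; Atlas's actual spooling lives in its notification module):

```python
from collections import deque

class SpoolingSender:
    def __init__(self):
        self.spool = deque()
        self.delivered = []        # stands in for the Kafka topic
        self.broker_up = False

    def send(self, msg):
        # While the broker is down, or while spooled messages remain,
        # append to the spool so the original ordering is preserved.
        if self.broker_up and not self.spool:
            self.delivered.append(msg)
        else:
            self.spool.append(msg)

    def on_broker_up(self):
        self.broker_up = True
        while self.spool:          # drain in FIFO order before live traffic
            self.delivered.append(self.spool.popleft())

s = SpoolingSender()
s.send("abc_table_1"); s.send("abc_table_2")   # broker down: spooled
s.send("xyz_table_1")    # arrives before recovery: queued behind the spool
s.on_broker_up()
print(s.delivered)       # ['abc_table_1', 'abc_table_2', 'xyz_table_1']
```

Note that this only fixes publish order; createTime here is assigned server-side at processing time, which is why out-of-order delivery shows up as inverted timestamps.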



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ATLAS-4152) [Atlas: Spooling] Multiple entries are created for same table when the table is dropped while kafka is down

2021-02-17 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry reassigned ATLAS-4152:
--

Assignee: Ashutosh Mestry

> [Atlas: Spooling] Multiple entries are created for same table when the table 
> is dropped while kafka is down
> ---
>
> Key: ATLAS-4152
> URL: https://issues.apache.org/jira/browse/ATLAS-4152
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Dharshana M Krishnamoorthy
>Assignee: Ashutosh Mestry
>Priority: Major
>
> A single table has multiple duplicate entries when the table is dropped while 
> Kafka is down (spooling scenario).
> Steps to reproduce:
>  * Stop the Kafka broker
>  * Create a few (20) tables with the same prefix (abc_table_1, abc_table_2, ... 
> abc_table_n)
>  * Make sure the data is spooled
>  * Start Kafka and create a few more tables (xyz_table_1, xyz_table_2, ... 
> xyz_table_n)
>  * Wait for 5 minutes for the tables to reflect in Atlas
>  * Fire a basic search with the abc_* prefix and verify the tables are created
>  * Collect the createdTime of all the tables and verify the order
> Here tables were created with prefix btwxb_table.
> *btwxb_table_5, btwxb_table_10, btwxb_table_15 and btwxb_table_20* were 
> dropped while Kafka was down.
>  Each of those tables has a total of 3 entries per table name. All the 
> details are the same except the guid:
> {code:java}
> {
> "searchParameters":{
> "includeSubTypes":true,
> "excludeDeletedEntities":false,
> "includeSubClassifications":true,
> "typeName":"hive_table",
> "limit":40,
> "offset":0,
> "includeClassificationAttributes":false,
> "query":"btwxb*"
> },
> "queryText":"btwxb*",
> "approximateCount":28,
> "queryType":"BASIC",
> "entities":[
> {
> "status":"DELETED",
> "isIncomplete":false,
> "guid":"9c348843-ebf0-4a0f-a909-4cdbf75ea39d",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_15",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_15@cm",
> "createTime":1612959528000,
> "name":"btwxb_table_15"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"DELETED",
> "isIncomplete":false,
> "guid":"a25fbfc3-f6fb-4f9f-bce4-1da3f2de5921",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_5",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_5@cm",
> "createTime":1612959525000,
> "name":"btwxb_table_5"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"ACTIVE",
> "isIncomplete":false,
> "guid":"bf6de3f6-48f2-4941-b5f2-a1ef3756ffa2",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_8",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_8@cm",
> "createTime":1612959527000,
> "name":"btwxb_table_8"
> },
> "classificationNames":[
> 
> ]
> },
> {
> "status":"ACTIVE",
> "isIncomplete":false,
> "guid":"977adf75-9336-484f-91ed-7976817fb729",
> "meanings":[
> 
> ],
> "labels":[
> 
> ],
> "typeName":"hive_table",
> "meaningNames":[
> 
> ],
> "displayText":"btwxb_table_7",
> "attributes":{
> "owner":"hrt_qa",
> "qualifiedName":"default.btwxb_table_7@cm",
> "createTime":1612959527000,
> "name":"btwxb_table_7"
> },
> "classificationNames":[
> 
> ]

[jira] [Updated] (ATLAS-4155) NotificationHookConsumer: Large Compressed Message Processing Problem

2021-02-14 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4155:
---
Attachment: ATLAS-4155-Kafka-commit-supplied-offset.patch

> NotificationHookConsumer: Large Compressed Message Processing Problem
> -
>
> Key: ATLAS-4155
> URL: https://issues.apache.org/jira/browse/ATLAS-4155
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: ATLAS-4155-Kafka-commit-supplied-offset.patch
>
>
> *Background*
> Notification messages can be large. To get around Kafka's limit on message 
> size, Atlas compresses and splits messages: if the message size goes beyond a 
> stipulated threshold, the message is compressed, and if the compressed 
> message still exceeds the limit, it is split into multiple messages.
> *Situation*
> Consider a message so large that uncompressing it takes longer than Kafka's 
> per-message timeout. The large message's offset is then not committed in 
> time, which causes Kafka to present the same message again.
> Message Description:
> Number of splits: 8
> Compressed message size: 7,452,640
> Uncompressed message size: 520,803,946
> Time taken to uncompress and stitch messages: > 90 seconds
>  
> Sequence:
> 2021-02-10 14:57:24,221: first message received
> 2021-02-10 14:58:36,052: all splits combined – 72 seconds
> 2021-02-10 15:01:06,971: message processing completed – 90 seconds
> 2021-02-10 15:01:17,158: Kafka commit failed. Elapsed time since first 
> message: 197 seconds
> 2021-02-10 15:01:19,857: attempt #2: first message received
> 2021-02-10 15:03:01,993: attempt #2: all splits combined – 102 seconds
> 2021-02-10 15:04:44,896: attempt #2: Kafka commit failed. Elapsed time since 
> first message: 205 seconds
> Back to #5
> *Solution*
> Maintain last offset received. If the same offset is presented, commit the 
> offset and move on to the next message.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-4155) NotificationHookConsumer: Large Compressed Message Processing Problem

2021-02-14 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4155:
--

 Summary: NotificationHookConsumer: Large Compressed Message 
Processing Problem
 Key: ATLAS-4155
 URL: https://issues.apache.org/jira/browse/ATLAS-4155
 Project: Atlas
  Issue Type: Bug
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


*Background*

Notification messages can be large. To get around Kafka's limit on message 
size, Atlas compresses and splits messages: if the message size goes beyond a 
stipulated threshold, the message is compressed, and if the compressed message 
still exceeds the limit, it is split into multiple messages.

*Situation*

Consider a message so large that uncompressing it takes longer than Kafka's 
per-message timeout. The large message's offset is then not committed in time, 
which causes Kafka to present the same message again.

Message Description:
Number of splits: 8
Compressed message size: 7,452,640
Uncompressed message size: 520,803,946
Time taken to uncompress and stitch messages: > 90 seconds
 
Sequence:
2021-02-10 14:57:24,221: first message received
2021-02-10 14:58:36,052: all splits combined – 72 seconds
2021-02-10 15:01:06,971: message processing completed – 90 seconds
2021-02-10 15:01:17,158: Kafka commit failed. Elapsed time since first message: 
197 seconds
2021-02-10 15:01:19,857: attempt #2: first message received
2021-02-10 15:03:01,993: attempt #2: all splits combined – 102 seconds
2021-02-10 15:04:44,896: attempt #2: Kafka commit failed. Elapsed time since 
first message: 205 seconds
Back to #5

*Solution*

Maintain last offset received. If the same offset is presented, commit the 
offset and move on to the next message.
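The solution above can be sketched as a consumer that remembers the last offset it fully processed and, when Kafka redelivers that same offset after a commit timeout, commits it immediately instead of reprocessing the expensive message (Python, hypothetical names):

```python
class DedupingConsumer:
    def __init__(self, process, commit):
        self.process = process       # expensive: uncompress, stitch, persist
        self.commit = commit         # Kafka offset commit callback
        self.last_offset = None

    def on_message(self, offset, payload):
        if offset == self.last_offset:
            # Same large message presented again because the previous
            # commit timed out: just commit the offset and move on.
            self.commit(offset)
            return "skipped"
        self.process(payload)
        self.last_offset = offset
        self.commit(offset)
        return "processed"

processed, committed = [], []
c = DedupingConsumer(processed.append, committed.append)
print(c.on_message(42, "big-split-message"))   # processed
print(c.on_message(42, "big-split-message"))   # skipped (redelivery)
```

This breaks the redelivery loop in the sequence above: attempt #2 becomes a cheap commit rather than another 90-plus seconds of uncompressing.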

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ATLAS-2411) Disabled Integration Test: EntityV2JerseyResourceIT.java:335

2021-02-10 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-2411.

Resolution: Fixed

> Disabled Integration Test: EntityV2JerseyResourceIT.java:335
> 
>
> Key: ATLAS-2411
> URL: https://issues.apache.org/jira/browse/ATLAS-2411
> Project: Atlas
>  Issue Type: Bug
>Reporter: Graham Wallis
>Assignee: Ashutosh Mestry
>Priority: Blocker
> Fix For: 1.0.0
>
>
> The above test needs to be reviewed and should probably be re-enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ATLAS-2410) Disabled Integration Test: EntityV2JerseyResourceIT.java:330

2021-02-10 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-2410.

Resolution: Fixed

> Disabled Integration Test: EntityV2JerseyResourceIT.java:330
> 
>
> Key: ATLAS-2410
> URL: https://issues.apache.org/jira/browse/ATLAS-2410
> Project: Atlas
>  Issue Type: Bug
>Reporter: Graham Wallis
>Assignee: Ashutosh Mestry
>Priority: Blocker
> Fix For: 1.0.0
>
>
> The above test needs to be reviewed and should probably be re-enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ATLAS-2409) Disabled Integration Test: EntityV2JerseyResourceIT.java:325

2021-02-10 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-2409.

Resolution: Fixed

> Disabled Integration Test: EntityV2JerseyResourceIT.java:325
> 
>
> Key: ATLAS-2409
> URL: https://issues.apache.org/jira/browse/ATLAS-2409
> Project: Atlas
>  Issue Type: Bug
>Reporter: Graham Wallis
>Assignee: Ashutosh Mestry
>Priority: Blocker
> Fix For: 1.0.0
>
>
> The above test needs to be reviewed and should probably be re-enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ATLAS-4151) FixedBufferList: Change Log Level to Debug

2021-02-09 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-4151.

Resolution: Fixed

> FixedBufferList: Change Log Level to Debug
> --
>
> Key: ATLAS-4151
> URL: https://issues.apache.org/jira/browse/ATLAS-4151
> Project: Atlas
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Trivial
> Fix For: trunk
>
> Attachments: 
> ATLAS-4151-FixedBufferList-Change-Log-Level-to-Debug.patch
>
>
> *Background*
> The existing implementation emits log entries from _FixedBufferList_ 
> describing its inner workings. This fills the logs with many entries that 
> are not useful for debugging.
> *Solution*
> Change the log level to _debug_.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-4151) FixedBufferList: Change Log Level to Debug

2021-02-09 Thread Ashutosh Mestry (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17281917#comment-17281917
 ] 

Ashutosh Mestry commented on ATLAS-4151:


PC: Build: 
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/385/

> FixedBufferList: Change Log Level to Debug
> --
>
> Key: ATLAS-4151
> URL: https://issues.apache.org/jira/browse/ATLAS-4151
> Project: Atlas
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Trivial
> Fix For: trunk
>
> Attachments: 
> ATLAS-4151-FixedBufferList-Change-Log-Level-to-Debug.patch
>
>
> *Background*
> The existing implementation emits log entries from _FixedBufferList_ 
> describing its inner workings. This fills the logs with many entries that 
> are not useful for debugging.
> *Solution*
> Change the log level to _debug_.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-4151) FixedBufferList: Change Log Level to Debug

2021-02-09 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4151:
---
Attachment: ATLAS-4151-FixedBufferList-Change-Log-Level-to-Debug.patch

> FixedBufferList: Change Log Level to Debug
> --
>
> Key: ATLAS-4151
> URL: https://issues.apache.org/jira/browse/ATLAS-4151
> Project: Atlas
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Trivial
> Fix For: trunk
>
> Attachments: 
> ATLAS-4151-FixedBufferList-Change-Log-Level-to-Debug.patch
>
>
> *Background*
> The existing implementation emits updates from _FixedBufferList_ describing 
> its inner workings. This fills the logs with many entries that are not 
> useful for debugging.
> *Solution*
> Change the log level to _debug_.





[jira] [Created] (ATLAS-4151) FixedBufferList: Change Log Level to Debug

2021-02-09 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4151:
--

 Summary: FixedBufferList: Change Log Level to Debug
 Key: ATLAS-4151
 URL: https://issues.apache.org/jira/browse/ATLAS-4151
 Project: Atlas
  Issue Type: Bug
Affects Versions: trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
 Fix For: trunk


*Background*

The existing implementation emits updates from _FixedBufferList_ describing its 
inner workings. This fills the logs with many entries that are not useful for 
debugging.

*Solution*

Change the log level to _debug_.
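The fix itself is a log-level change. As a self-contained sketch of the pattern (using java.util.logging rather than the SLF4J logger Atlas actually uses; the message text and method names below are illustrative, not the real Atlas code):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class FixedBufferLogging {
    private static final Logger LOG = Logger.getLogger("FixedBufferList");

    // Builds the per-item message; the text here is illustrative only.
    public static String describeRequest(int index, int size) {
        return String.format("FixedBufferList: instantiating item: index=%d, size=%d",
                index, size);
    }

    public static void logInstantiation(int index, int size) {
        // Before: logged unconditionally at a higher level, flooding the logs.
        // After: emitted at FINE (debug), and the message string is only
        // built when that level is actually enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(describeRequest(index, size));
        }
    }
}
```

Guarding with `isLoggable` also avoids paying the string-formatting cost on the hot path when debug logging is off.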





[jira] [Updated] (ATLAS-4136) Export Service: NPE if Options Explicitly set to NULL

2021-02-04 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4136:
---
Attachment: (was: 
ATLAS-4136-Export-Service-NPE-if-Options-Explicitly-.patch)

> Export Service: NPE if Options Explicitly set to NULL
> -
>
> Key: ATLAS-4136
> URL: https://issues.apache.org/jira/browse/ATLAS-4136
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: 
> ATLAS-4136-Export-Service-NPE-if-Options-Explicitly-.patch
>
>
> *Steps to Duplicate*
>  # Create an export request by explicitly setting options to null. E.g.
> {code:java}
> {
>  "itemsToExport": [{
>  "typeName": "hive_db",
>  "uniqueAttributes": {
>  "qualifiedName": "abcd@cm"
>  }
>  }],
>  "options": null
> }{code}
> _Expected results:_ Export should proceed.
> _Actual results_: Export fails with NPE.
> Root cause: changes introduced for ATLAS-4068.
> _Workaround_: omitting the _options_ key avoids the problem.
>  





[jira] [Updated] (ATLAS-4136) Export Service: NPE if Options Explicitly set to NULL

2021-02-04 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4136:
---
Attachment: ATLAS-4136-Export-Service-NPE-if-Options-Explicitly-.patch

> Export Service: NPE if Options Explicitly set to NULL
> -
>
> Key: ATLAS-4136
> URL: https://issues.apache.org/jira/browse/ATLAS-4136
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: 
> ATLAS-4136-Export-Service-NPE-if-Options-Explicitly-.patch
>
>
> *Steps to Duplicate*
>  # Create an export request by explicitly setting options to null. E.g.
> {code:java}
> {
>  "itemsToExport": [{
>  "typeName": "hive_db",
>  "uniqueAttributes": {
>  "qualifiedName": "abcd@cm"
>  }
>  }],
>  "options": null
> }{code}
> _Expected results:_ Export should proceed.
> _Actual results_: Export fails with NPE.
> Root cause: changes introduced for ATLAS-4068.
> _Workaround_: omitting the _options_ key avoids the problem.
>  





[jira] [Updated] (ATLAS-4136) Export Service: NPE if Options Explicitly set to NULL

2021-02-04 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4136:
---
Attachment: ATLAS-4136-Export-Service-NPE-if-Options-Explicitly-.patch

> Export Service: NPE if Options Explicitly set to NULL
> -
>
> Key: ATLAS-4136
> URL: https://issues.apache.org/jira/browse/ATLAS-4136
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: 
> ATLAS-4136-Export-Service-NPE-if-Options-Explicitly-.patch
>
>
> *Steps to Duplicate*
>  # Create an export request by explicitly setting options to null. E.g.
> {code:java}
> {
>  "itemsToExport": [{
>  "typeName": "hive_db",
>  "uniqueAttributes": {
>  "qualifiedName": "abcd@cm"
>  }
>  }],
>  "options": null
> }{code}
> _Expected results:_ Export should proceed.
> _Actual results_: Export fails with NPE.
> Root cause: changes introduced for ATLAS-4068.
> _Workaround_: omitting the _options_ key avoids the problem.
>  





[jira] [Updated] (ATLAS-4136) Export Service: NPE if Options Explicitly set to NULL

2021-02-04 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4136:
---
Summary: Export Service: NPE if Options Explicitly set to NULL  (was: 
Export Service: NPE if No Options are Passed)

> Export Service: NPE if Options Explicitly set to NULL
> -
>
> Key: ATLAS-4136
> URL: https://issues.apache.org/jira/browse/ATLAS-4136
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
>
> *Steps to Duplicate*
>  # Create an export request by explicitly setting options to null. E.g.
> {code:java}
> {
>  "itemsToExport": [{
>  "typeName": "hive_db",
>  "uniqueAttributes": {
>  "qualifiedName": "abcd@cm"
>  }
>  }],
>  "options": null
> }{code}
> _Expected results:_ Export should proceed.
> _Actual results_: Export fails with NPE.
> Root cause: changes introduced for ATLAS-4068.
> _Workaround_: omitting the _options_ key avoids the problem.
>  





[jira] [Updated] (ATLAS-4136) Export Service: NPE if No Options are Passed

2021-02-04 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4136:
---
Description: 
*Steps to Duplicate*
 # Create an export request by explicitly setting options to null. E.g.

{code:java}
{
 "itemsToExport": [{
 "typeName": "hive_db",
 "uniqueAttributes": {
 "qualifiedName": "abcd@cm"
 }
 }],
 "options": null
}{code}
_Expected results:_ Export should proceed.

_Actual results_: Export fails with NPE.

Root cause: changes introduced for ATLAS-4068.

_Workaround_: omitting the _options_ key avoids the problem.

 

  was:
*Steps to Duplicate*
 # Create an export request by explicitly setting options to null. E.g.

{code:java}
{
 "itemsToExport": [{
 "typeName": "hive_db",
 "uniqueAttributes": {
 "qualifiedName": "abcd@cm"
 }
 }],
 "options": null
}{code}
_Expected results:_ Export should proceed.

_Actual results_: Export fails with NPE.

Root cause: changes introduced for ATLAS-4068.


> Export Service: NPE if No Options are Passed
> 
>
> Key: ATLAS-4136
> URL: https://issues.apache.org/jira/browse/ATLAS-4136
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
>
> *Steps to Duplicate*
>  # Create an export request by explicitly setting options to null. E.g.
> {code:java}
> {
>  "itemsToExport": [{
>  "typeName": "hive_db",
>  "uniqueAttributes": {
>  "qualifiedName": "abcd@cm"
>  }
>  }],
>  "options": null
> }{code}
> _Expected results:_ Export should proceed.
> _Actual results_: Export fails with NPE.
> Root cause: changes introduced for ATLAS-4068.
> _Workaround_: omitting the _options_ key avoids the problem.
>  





[jira] [Updated] (ATLAS-4136) Export Service: NPE if No Options are Passed

2021-02-04 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4136:
---
Description: 
*Steps to Duplicate*
 # Create an export request by explicitly setting options to null. E.g.

{code:java}
{
 "itemsToExport": [{
 "typeName": "hive_db",
 "uniqueAttributes": {
 "qualifiedName": "abcd@cm"
 }
 }],
 "options": null
}{code}
_Expected results:_ Export should proceed.

_Actual results_: Export fails with NPE.

Root cause: changes introduced for ATLAS-4068.

  was:
*Steps to Duplicate*
 # Create an export request by explicitly setting options to null. E.g.

{code:java}
{
 "itemsToExport": [{
 "typeName": "hive_db",
 "uniqueAttributes": {
 "qualifiedName": "abcd@cm"
 }
 }],
 "options": null
}{code}
_Expected results:_ Export should proceed.

_Actual results_: Export fails with NPE.


> Export Service: NPE if No Options are Passed
> 
>
> Key: ATLAS-4136
> URL: https://issues.apache.org/jira/browse/ATLAS-4136
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
>
> *Steps to Duplicate*
>  # Create an export request by explicitly setting options to null. E.g.
> {code:java}
> {
>  "itemsToExport": [{
>  "typeName": "hive_db",
>  "uniqueAttributes": {
>  "qualifiedName": "abcd@cm"
>  }
>  }],
>  "options": null
> }{code}
> _Expected results:_ Export should proceed.
> _Actual results_: Export fails with NPE.
> Root cause: changes introduced for ATLAS-4068.





[jira] [Created] (ATLAS-4136) Export Service: NPE if No Options are Passed

2021-02-04 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4136:
--

 Summary: Export Service: NPE if No Options are Passed
 Key: ATLAS-4136
 URL: https://issues.apache.org/jira/browse/ATLAS-4136
 Project: Atlas
  Issue Type: Bug
  Components:  atlas-core
Affects Versions: trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
 Fix For: trunk


*Steps to Duplicate*
 # Create an export request by explicitly setting options to null. E.g.

{code:java}
{
 "itemsToExport": [{
 "typeName": "hive_db",
 "uniqueAttributes": {
 "qualifiedName": "abcd@cm"
 }
 }],
 "options": null
}{code}
_Expected results:_ Export should proceed.

_Actual results_: Export fails with NPE.
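A minimal sketch of the null-safe handling implied by the fix: treat an explicit `"options": null` in the request the same as an absent options map, so downstream code never dereferences null. The class and method names below (ExportRequest, getFetchType) are illustrative stand-ins, not the actual Atlas API.

```java
import java.util.Collections;
import java.util.Map;

public class ExportRequest {
    private Map<String, Object> options;

    public void setOptions(Map<String, Object> options) {
        this.options = options;
    }

    // Defensive accessor: never returns null, even when the JSON payload
    // explicitly set "options": null.
    public Map<String, Object> getOptions() {
        return options != null ? options : Collections.emptyMap();
    }

    // Option lookup with a default; safe for both missing and null options.
    public String getFetchType(String defaultValue) {
        Object value = getOptions().get("fetchType");
        return value != null ? value.toString() : defaultValue;
    }
}
```

With this shape, the export path reads options only through the defensive accessor, so the NPE cannot occur regardless of how the client serialized the request.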





[jira] [Updated] (ATLAS-4122) Advanced Search: Fix for within clause with Double Quote Values

2021-02-02 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4122:
---
Attachment: ATLAS-4122-Advanced-Search-Literals-with-double-quot.patch

> Advanced Search: Fix for within clause with Double Quote Values
> ---
>
> Key: ATLAS-4122
> URL: https://issues.apache.org/jira/browse/ATLAS-4122
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk, 2.1.0
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk, 2.1.0
>
> Attachments: 
> ATLAS-4122-Advanced-Search-Literals-with-double-quot.patch
>
>
> *Steps to Duplicate*
>  # In 'Advanced Search', fire the query: _hive_db where name = ["Sales", 
> "Reporting"]_
> _Expected results:_ With the appropriate data present, the query should 
> return the correct results.
> _Actual results:_ The query is not recognized as a valid query and no 
> results are returned.
> Root cause: the implementation of ATLAS-2932, where script execution was 
> replaced with traversal.





[jira] [Updated] (ATLAS-4122) Advanced Search: Fix for within clause with Double Quote Values

2021-02-02 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4122:
---
  Component/s:  atlas-core
Fix Version/s: 2.1.0
   trunk
Affects Version/s: trunk
   2.1.0

> Advanced Search: Fix for within clause with Double Quote Values
> ---
>
> Key: ATLAS-4122
> URL: https://issues.apache.org/jira/browse/ATLAS-4122
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk, 2.1.0
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk, 2.1.0
>
>
> *Steps to Duplicate*
>  # In 'Advanced Search', fire the query: _hive_db where name = ["Sales", 
> "Reporting"]_
> _Expected results:_ With the appropriate data present, the query should 
> return the correct results.
> _Actual results:_ The query is not recognized as a valid query and no 
> results are returned.
> Root cause: the implementation of ATLAS-2932, where script execution was 
> replaced with traversal.





[jira] [Created] (ATLAS-4122) Advanced Search: Fix for within clause with Double Quote Values

2021-02-02 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4122:
--

 Summary: Advanced Search: Fix for within clause with Double Quote 
Values
 Key: ATLAS-4122
 URL: https://issues.apache.org/jira/browse/ATLAS-4122
 Project: Atlas
  Issue Type: Bug
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


*Steps to Duplicate*
 # In 'Advanced Search', fire the query: _hive_db where name = ["Sales", 
"Reporting"]_

_Expected results:_ With the appropriate data present, the query should 
return the correct results.

_Actual results:_ The query is not recognized as a valid query and no results 
are returned.

Root cause: the implementation of ATLAS-2932, where script execution was 
replaced with traversal.
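For context, _name = ["Sales", "Reporting"]_ is an "in" (within) style filter over the attribute's value. A minimal sketch of the intended semantics, independent of the actual Atlas traversal code:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class WithinClauseSketch {
    // Equivalent of: hive_db where name = ["Sales", "Reporting"]
    // Keeps only names that are members of the allowed list.
    public static List<String> filterWithin(List<String> names, List<String> allowed) {
        Set<String> allowedSet = new HashSet<>(allowed);
        return names.stream()
                .filter(allowedSet::contains)
                .collect(Collectors.toList());
    }
}
```

The bug was that the double-quoted literals inside the list were not parsed into such a membership filter; once parsed correctly, the clause behaves like the set-membership test above.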





[jira] [Updated] (ATLAS-4121) Import Service: Improve Speed of Ingest for replicatedTo Option

2021-02-02 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4121:
---
Attachment: (was: ATLAS-4121-Concurrent-Import.patch)

> Import Service: Improve Speed of Ingest for replicatedTo Option
> ---
>
> Key: ATLAS-4121
> URL: https://issues.apache.org/jira/browse/ATLAS-4121
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 2.1.0
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: ATLAS-4121-Concurrent-Import.patch
>
>
> *Background*
> For import requests whose replication options include _skipLineage_ and 
> _replicatedTo_, the current implementation imports one entity at a time.
> This approach can end up taking a long time for large payloads.
> *Solution*
>  * Use an approach that allows entities of a type to be created 
> concurrently.
>  * This assumes that the incoming stream of entities to be imported is 
> topologically sorted (parent entities appear before child entities).
>  * Create all entities of one type.
>  * Once all entities of a type are created, fetch the next type.





[jira] [Updated] (ATLAS-4121) Import Service: Improve Speed of Ingest for replicatedTo Option

2021-02-02 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4121:
---
Attachment: ATLAS-4121-Concurrent-Import.patch

> Import Service: Improve Speed of Ingest for replicatedTo Option
> ---
>
> Key: ATLAS-4121
> URL: https://issues.apache.org/jira/browse/ATLAS-4121
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 2.1.0
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: ATLAS-4121-Concurrent-Import.patch
>
>
> *Background*
> For import requests whose replication options include _skipLineage_ and 
> _replicatedTo_, the current implementation imports one entity at a time.
> This approach can end up taking a long time for large payloads.
> *Solution*
>  * Use an approach that allows entities of a type to be created 
> concurrently.
>  * This assumes that the incoming stream of entities to be imported is 
> topologically sorted (parent entities appear before child entities).
>  * Create all entities of one type.
>  * Once all entities of a type are created, fetch the next type.





[jira] [Updated] (ATLAS-4121) Import Service: Improve Speed of Ingest for replicatedTo Option

2021-02-01 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4121:
---
Attachment: ATLAS-4121-Concurrent-Import.patch

> Import Service: Improve Speed of Ingest for replicatedTo Option
> ---
>
> Key: ATLAS-4121
> URL: https://issues.apache.org/jira/browse/ATLAS-4121
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 2.1.0
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk
>
> Attachments: ATLAS-4121-Concurrent-Import.patch
>
>
> *Background*
> For import requests whose replication options include _skipLineage_ and 
> _replicatedTo_, the current implementation imports one entity at a time.
> This approach can end up taking a long time for large payloads.
> *Solution*
>  * Use an approach that allows entities of a type to be created 
> concurrently.
>  * This assumes that the incoming stream of entities to be imported is 
> topologically sorted (parent entities appear before child entities).
>  * Create all entities of one type.
>  * Once all entities of a type are created, fetch the next type.





[jira] [Created] (ATLAS-4121) Import Service: Improve Speed of Ingest for replicatedTo Option

2021-02-01 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4121:
--

 Summary: Import Service: Improve Speed of Ingest for replicatedTo 
Option
 Key: ATLAS-4121
 URL: https://issues.apache.org/jira/browse/ATLAS-4121
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Affects Versions: 2.1.0
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
 Fix For: trunk


*Background*

For import requests whose replication options include _skipLineage_ and 
_replicatedTo_, the current implementation imports one entity at a time.

This approach can end up taking a long time for large payloads.

*Solution*
 * Use an approach that allows entities of a type to be created concurrently.
 * This assumes that the incoming stream of entities to be imported is 
topologically sorted (parent entities appear before child entities).
 * Create all entities of one type.
 * Once all entities of a type are created, fetch the next type.
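The per-type batching described above can be sketched with stdlib executors. TypeBatchedImport is a hypothetical helper; entity creation is simulated by recording a "type:name" string, and the real Atlas import code differs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TypeBatchedImport {
    // typedEntities: stream of {typeName, entityName} pairs, assumed
    // topologically sorted so parent types appear before child types.
    public static List<String> importEntities(List<String[]> typedEntities, int threads) {
        List<String> created = Collections.synchronizedList(new ArrayList<>());

        // Group by type while preserving the order types first appear.
        LinkedHashMap<String, List<String>> byType = new LinkedHashMap<>();
        for (String[] e : typedEntities) {
            byType.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
        }

        for (Map.Entry<String, List<String>> batch : byType.entrySet()) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            for (String name : batch.getValue()) {
                // "Create" all entities of this one type concurrently.
                pool.submit(() -> created.add(batch.getKey() + ":" + name));
            }
            pool.shutdown();
            try {
                // Wait until every entity of this type exists before
                // fetching the next type.
                pool.awaitTermination(1, TimeUnit.MINUTES);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        }
        return created;
    }
}
```

The barrier between types is what keeps the topological ordering safe: children are only created after every entity of the parent type has been committed.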





[jira] [Updated] (ATLAS-4109) Advanced Search: Glossary Clause: More Efficient Structure

2021-01-21 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-4109:
---
Attachment: ATLAS-4109-Advanced-Search-Glossary-terms-clause-cha.patch

> Advanced Search: Glossary Clause: More Efficient Structure
> --
>
> Key: ATLAS-4109
> URL: https://issues.apache.org/jira/browse/ATLAS-4109
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-4109-Advanced-Search-Glossary-terms-clause-cha.patch
>
>
> *Background*
> Glossary support was added to Atlas recently. The existing clause for 
> fetching entities linked to a glossary could be improved.
> *Solution*
>  * Use edge traversal instead of vertex traversal.
>  * Remove the dedup step.





[jira] [Created] (ATLAS-4109) Advanced Search: Glossary Clause: More Efficient Structure

2021-01-21 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-4109:
--

 Summary: Advanced Search: Glossary Clause: More Efficient Structure
 Key: ATLAS-4109
 URL: https://issues.apache.org/jira/browse/ATLAS-4109
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry


*Background*

Glossary support was added to Atlas recently. The existing clause for fetching 
entities linked to a glossary could be improved.

*Solution*
 * Use edge traversal instead of vertex traversal.
 * Remove the dedup step.





[jira] [Resolved] (ATLAS-2932) Update DSL to use Java Traversal API

2021-01-20 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry resolved ATLAS-2932.

Resolution: Fixed

> Update DSL to use Java Traversal API
> 
>
> Key: ATLAS-2932
> URL: https://issues.apache.org/jira/browse/ATLAS-2932
> Project: Atlas
>  Issue Type: Bug
>Affects Versions: 1.0.0, trunk
>Reporter: Apoorv Naik
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> 0002-ATLAS-2932-Update-DSL-to-use-Tinkerpop-Java-APIs-ins.patch, 
> dsl-traversal-v2.3.patch
>
>
> Change DSL code to use Java Tinkerpop Traversals instead of 
> GremlinScriptEngine





[jira] [Updated] (ATLAS-2932) Update DSL to use Java Traversal API

2021-01-07 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-2932:
---
Attachment: dsl-traversal-v2.3.patch

> Update DSL to use Java Traversal API
> 
>
> Key: ATLAS-2932
> URL: https://issues.apache.org/jira/browse/ATLAS-2932
> Project: Atlas
>  Issue Type: Bug
>Affects Versions: 1.0.0, trunk
>Reporter: Apoorv Naik
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> 0002-ATLAS-2932-Update-DSL-to-use-Tinkerpop-Java-APIs-ins.patch, 
> dsl-traversal-v2.3.patch
>
>
> Change DSL code to use Java Tinkerpop Traversals instead of 
> GremlinScriptEngine




