Re: Atlas import taking huge amount of time

2019-08-09 Thread Bolke de Bruin
Will this solve / improve the issue with extremely slow Kafka consumption as 
well? 

Cheers
Bolke

> On 25 Jul 2019, at 09:54, Nallapati, Sreenivasulu wrote:
> 
> Sure Ashutosh,
> 
> Please let me know once it is done. 
> 
> 
> 
> ---
> Regards,
> Sreeni 
> 
> On 25/07/19, 4:52 AM, "Ashutosh Mestry"  wrote:
> 
> 
>There are a few dependent patches. I will try to put out a patch for version 
> 2.0. It will take me a few days to get this done. Please bear with me.
> 
>~ ashutosh
>...
>No hurry, no pause. – Tim Ferriss, Life Hacker, Author
> 
> 
>On 7/24/19, 4:31 AM, "Nallapati, Sreenivasulu" wrote:
> 
>Hi Ashutosh,
> 
>Thanks for your reply.
> 
>Currently we are using Atlas 2.0.0 and I am not able to apply this 
> patch. It has a lot of compilation errors.
> 
>Do you have a patch for 2.0.0?
> 
> 
>---
>Regards,
>Sreeni
> 
>On 23/07/19, 11:36 PM, "Ashutosh Mestry"  wrote:
> 
> 
>Hi
> 
>The existing import processes one entity at a time, so the time taken is 
> linear in the number of entities. There is a JIRA that improves the 
> situation. It is being tested right now.
> 
>Please take a look at:
>JIRA: https://issues.apache.org/jira/browse/ATLAS-3320
>Review: https://reviews.apache.org/r/71025/ (latest patch is here)
> 
>Best regards,
> 
>~ ashutosh
>...
>No hurry, no pause. – Tim Ferriss, Life Hacker, Author
> 
> 
>On 7/23/19, 6:46 AM, "Nallapati, Sreenivasulu" wrote:
> 
>Hello folks,
> 
>We are trying to export and import the existing data to a 
> different Atlas system.
>We have around 1 entities in the exported zip file. The 
> export is taking around 2-3 mins.
>Total zip file size is 14 MB. The largest file in the zip is 
> around 7 MB which has almost 1000 relationshipAttributes in it.
>When we try to import this, the import is running for more 
> than 25 hours. Is this expected behaviour? Is there any way to speed up this 
> process?
> 
>Export command
>curl -igk -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
>"itemsToExport": [
>   { "typeName": "kafka_topic" }
>],
>"options": {
>"matchType": "forType"
>}
>}' "http://localhost:21000/api/atlas/admin/export" > /tmp/kafka_topic.zip
> 
> 
>Import command
>curl -ikg -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F data=@kafka_topic.zip "http://localhost:21000/api/atlas/admin/import"
> 
>Let us know if we are missing anything in the import process.
> 
> 
>---
>Regards,
>Sreeni
> 
> 
> 
> 
> 
> 
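
For readers who would rather script the export than use curl, the export command quoted above can be sketched in Java. This is only an illustration under assumptions: it uses the JDK's java.net.http API (Java 11 or later), the class and method names are hypothetical, and the URL, credentials and JSON body are copied from the curl command in the thread. It builds the request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Atlas: builds the same POST that the
// export curl command in the thread sends.
final class ExportRequestSketch {
    // JSON body copied from the export command quoted in the thread.
    static final String EXPORT_BODY =
        "{\"itemsToExport\":[{\"typeName\":\"kafka_topic\"}],"
      + "\"options\":{\"matchType\":\"forType\"}}";

    static HttpRequest buildExportRequest(String baseUrl, String user, String password) {
        // Equivalent of curl's "-u user:password" (HTTP Basic authentication).
        String credentials = Base64.getEncoder()
            .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/atlas/admin/export"))
            .header("Content-Type", "application/json")
            .header("Cache-Control", "no-cache")
            .header("Authorization", "Basic " + credentials)
            .POST(HttpRequest.BodyPublishers.ofString(EXPORT_BODY))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildExportRequest("http://localhost:21000", "admin", "admin");
        // Only prints the request line; nothing is sent to a server here.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would additionally need an HttpClient and a running Atlas server, with the response body written to a zip file. The sketch stops at building the request so that it stays runnable on its own.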


[jira] [Updated] (ATLAS-3362) Import Service: Table-level Imports: Updating Replication Info Improvements

2019-08-09 Thread Ashutosh Mestry (JIRA)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-3362:
---
Attachment: ATLAS-3362-Updated-logic-for-storing-repl-key-for-ta.patch

> Import Service: Table-level Imports: Updating Replication Info Improvements
> ---
>
> Key: ATLAS-3362
> URL: https://issues.apache.org/jira/browse/ATLAS-3362
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Attachments: 
> ATLAS-3362-Updated-logic-for-storing-repl-key-for-ta.patch
>
>
> *Background*
> During import of files that have _changeMarker_, _AuditsWriter_ updates 
> _AtlasServer_ with change marker using the parent entity. So far this worked 
> since the mechanism was used mostly for database-level exports (and imports).
> With the introduction of table-level exports, this does not work very well 
> since the list of tables to be exported can be large. 
> *Solution*
> For requests that are made for tables, fetch the guid for the table it 
> belongs to. This will address the problem.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (ATLAS-3362) Import Service: Table-level Imports: Updating Replication Info Improvements

2019-08-09 Thread Ashutosh Mestry (JIRA)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-3362:
---
Attachment: (was: 
ATLAS-3362-Using-different-scheme-for-storing-repl-d.patch)

> Import Service: Table-level Imports: Updating Replication Info Improvements
> ---
>
> Key: ATLAS-3362
> URL: https://issues.apache.org/jira/browse/ATLAS-3362
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: trunk
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
>
> *Background*
> During import of files that have _changeMarker_, _AuditsWriter_ updates 
> _AtlasServer_ with change marker using the parent entity. So far this worked 
> since the mechanism was used mostly for database-level exports (and imports).
> With the introduction of table-level exports, this does not work very well 
> since the list of tables to be exported can be large. 
> *Solution*
> For requests that are made for tables, fetch the guid for the table it 
> belongs to. This will address the problem.
>  





[jira] [Updated] (ATLAS-3365) Expose image names on lineage page

2019-08-09 Thread Daniel Kelencz (JIRA)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Kelencz updated ATLAS-3365:
--
Attachment: 0001-ATLAS-3365-expose-image-file-names-to-be-loaded-on-l.patch

> Expose image names on lineage page
> --
>
> Key: ATLAS-3365
> URL: https://issues.apache.org/jira/browse/ATLAS-3365
> Project: Atlas
>  Issue Type: Bug
>  Components: atlas-webui
>Affects Versions: 1.1.0
>Reporter: Daniel Kelencz
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: 
> 0001-ATLAS-3365-expose-image-file-names-to-be-loaded-on-l.patch
>
>
> Recently the Atlas UI behaviour changed on the lineage page, and the 
> filenames for the elements are no longer available from the HTML.





[jira] [Created] (ATLAS-3365) Expose image names on lineage page

2019-08-09 Thread Daniel Kelencz (JIRA)
Daniel Kelencz created ATLAS-3365:
-

 Summary: Expose image names on lineage page
 Key: ATLAS-3365
 URL: https://issues.apache.org/jira/browse/ATLAS-3365
 Project: Atlas
  Issue Type: Bug
  Components: atlas-webui
Affects Versions: 1.1.0
Reporter: Daniel Kelencz
 Fix For: 1.1.0


Recently the Atlas UI behaviour changed on the lineage page, and the 
filenames for the elements are no longer available from the HTML.





[jira] [Commented] (ATLAS-3364) UI: Compose logout url from Atlas base path (URI)

2019-08-09 Thread Nikhil Bonte (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903883#comment-16903883
 ] 

Nikhil Bonte commented on ATLAS-3364:
-

+1 for the patch. Thanks Keval for the patch.

> UI: Compose logout url from Atlas base path (URI)
> -
>
> Key: ATLAS-3364
> URL: https://issues.apache.org/jira/browse/ATLAS-3364
> Project: Atlas
>  Issue Type: Improvement
>  Components: atlas-webui
>Affects Versions: 2.0.0
>Reporter: Keval Bhatt
>Assignee: Keval Bhatt
>Priority: Critical
> Fix For: 2.1.0, 3.0.0
>
> Attachments: ATLAS-3364.patch
>
>
> Consider the Atlas base URI while constructing the logout URL, to avoid 
> redirection issues.





[jira] [Updated] (ATLAS-3364) UI: Compose logout url from Atlas base path (URI)

2019-08-09 Thread Keval Bhatt (JIRA)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keval Bhatt updated ATLAS-3364:
---
Attachment: ATLAS-3364.patch

> UI: Compose logout url from Atlas base path (URI)
> -
>
> Key: ATLAS-3364
> URL: https://issues.apache.org/jira/browse/ATLAS-3364
> Project: Atlas
>  Issue Type: Improvement
>  Components: atlas-webui
>Affects Versions: 2.0.0
>Reporter: Keval Bhatt
>Assignee: Keval Bhatt
>Priority: Critical
> Fix For: 2.1.0, 3.0.0
>
> Attachments: ATLAS-3364.patch
>
>
> Consider the Atlas base URI while constructing the logout URL, to avoid 
> redirection issues.





[jira] [Created] (ATLAS-3364) UI: Compose logout url from Atlas base path (URI)

2019-08-09 Thread Keval Bhatt (JIRA)
Keval Bhatt created ATLAS-3364:
--

 Summary: UI: Compose logout url from Atlas base path (URI)
 Key: ATLAS-3364
 URL: https://issues.apache.org/jira/browse/ATLAS-3364
 Project: Atlas
  Issue Type: Improvement
  Components: atlas-webui
Affects Versions: 2.0.0
Reporter: Keval Bhatt
Assignee: Keval Bhatt
 Fix For: 2.1.0, 3.0.0


Consider the Atlas base URI while constructing the logout URL, to avoid 
redirection issues.
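
The idea of deriving the logout URL from the base path can be sketched as follows. This is only an illustration, not the attached patch, which targets the web UI and is not shown in this message. The class name, the method name and the "logout.html" page name are assumptions made for the example.

```java
// Hypothetical helper, not taken from ATLAS-3364.patch.
final class LogoutUrlSketch {
    // Assumed name of the logout page, used for illustration only.
    static final String LOGOUT_PAGE = "logout.html";

    /** Joins the base path and the logout page with exactly one slash. */
    static String logoutUrl(String basePath) {
        String base = (basePath == null) ? "" : basePath.trim();
        // Drop trailing slashes so the join never produces "//".
        while (base.endsWith("/")) {
            base = base.substring(0, base.length() - 1);
        }
        return base + "/" + LOGOUT_PAGE;
    }

    public static void main(String[] args) {
        // Atlas served from the server root.
        System.out.println(logoutUrl("/"));
        // Atlas served behind a proxy under a sub-path (example path is made up).
        System.out.println(logoutUrl("/gateway/default/atlas/"));
    }
}
```

With a hard-coded "/logout.html", a deployment under a sub-path would redirect to the server root instead of the Atlas application, which is presumably the redirection issue the description refers to.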





[jira] [Commented] (ATLAS-3305) Unable to scale atlas kafka consumers

2019-08-09 Thread Adam Rempter (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903621#comment-16903621
 ] 

Adam Rempter commented on ATLAS-3305:
-

Yes, exactly. I was thinking of something like this.

> Unable to scale atlas kafka consumers
> -
>
> Key: ATLAS-3305
> URL: https://issues.apache.org/jira/browse/ATLAS-3305
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core, atlas-intg
>Affects Versions: 1.1.0, 2.0.0
>Reporter: Adam Rempter
>Priority: Major
>  Labels: performance
>
> We wanted to scale the Kafka consumers for Atlas, as we are getting many 
> lineage messages and processing them with just one consumer is not enough. 
>  
> There is a parameter, atlas.notification.hook.numthreads, to scale consumers 
> in NotificationHookConsumer.
> But the method:
>  
> notificationInterface.createConsumers(NotificationType.HOOK, numThreads)
>  
> always returns a one-element list, which effectively always starts a single 
> consumer:
> List> consumers = 
> Collections.singletonList(kafkaConsumer);
>  
> The log incorrectly says that the requested number of consumers has been created:
> LOG.info("<== KafkaNotification.createConsumers(notificationType={}, 
> numConsumers={}, autoCommitEnabled={})", notificationType, numConsumers, 
> autoCommitEnabled)
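
The behaviour described in the report can be reproduced in isolation with a small sketch. It is not Atlas code: the consumer type and both factory methods are hypothetical stand-ins, and the second method shows only one possible way to honour the requested count, not the fix adopted by the project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins; none of these names come from the Atlas code base.
final class ConsumerScalingSketch {
    /** Minimal placeholder for a notification consumer. */
    static final class SketchConsumer {
        final int id;
        SketchConsumer(int id) { this.id = id; }
    }

    /** Mirrors the reported behaviour: the requested count is ignored. */
    static List<SketchConsumer> createConsumersAsReported(int numConsumers) {
        SketchConsumer single = new SketchConsumer(0);
        return Collections.singletonList(single);
    }

    /** One possible alternative: build one consumer per requested thread. */
    static List<SketchConsumer> createConsumersScaled(int numConsumers) {
        List<SketchConsumer> consumers = new ArrayList<>();
        for (int i = 0; i < numConsumers; i++) {
            consumers.add(new SketchConsumer(i));
        }
        return consumers;
    }

    public static void main(String[] args) {
        int numThreads = 4; // stands in for atlas.notification.hook.numthreads
        System.out.println("as reported: " + createConsumersAsReported(numThreads).size());
        System.out.println("scaled:      " + createConsumersScaled(numThreads).size());
    }
}
```

Note that in Kafka the partitions of a topic are divided among the consumers of one group, so consumers beyond the partition count would sit idle. Scaling the consumer count only helps if the hook topic has enough partitions.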





[jira] [Created] (ATLAS-3363) In Audits(UI) page of an entity, columns that are added later to Hbase or Hive table do not appear in COLUMN_STATS_ACCURATE field in 'parameters'

2019-08-09 Thread Rahul Kurup (JIRA)
Rahul Kurup created ATLAS-3363:
--

 Summary: In Audits(UI) page of an entity, columns that are added 
later to Hbase or Hive table do not appear in COLUMN_STATS_ACCURATE field in 
'parameters' 
 Key: ATLAS-3363
 URL: https://issues.apache.org/jira/browse/ATLAS-3363
 Project: Atlas
  Issue Type: Bug
Reporter: Rahul Kurup


Steps:
1. Create an Hbase or Hive table
2. Update the Hbase or Hive table with a new column.
3. Go to the entity details page of that table, and go to the Audits tab.
4. In the parameters column there is a 'COLUMN_STATS_ACCURATE' field which 
seems to list the columns for that table. However it does not list the added 
columns in that field. The newly added column does appear in the Properties tab 
though.


