[jira] [Commented] (ATLAS-3398) Duplicates for unique attributes

2020-07-31 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169160#comment-17169160
 ] 

Damian Warszawski commented on ATLAS-3398:
--

Thanks [~amestry]. Of course, I don't mind.

> Duplicates for unique attributes 
> -
>
> Key: ATLAS-3398
> URL: https://issues.apache.org/jira/browse/ATLAS-3398
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.0.0, trunk
>Reporter: Bolke de Bruin
>Assignee: Ashutosh Mestry
>Priority: Blocker
>  Labels: integrity
> Attachments: zrzut_ekranu_2019-09-03_o_10.28.50.png
>
>
> We are seeing entities being added to Atlas with duplicate 
> "qualifiedName" values. The guids differ and other attributes also differ. Below 
> is a graph that shows the distribution of duplicates over time. We have 
> difficulty determining which one is the right one (as they are different) in 
> order to clean them up.
> We are also not the only ones encountering this, as you can see in the linked 
> issue.
> We have noticed that Atlas does not use the 
> [locking|https://docs.janusgraph.org/master/advanced-topics/eventual-consistency/]
>  mechanism of JanusGraph to prevent this:
>  
> !zrzut_ekranu_2019-09-03_o_10.28.50.png!
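
As a rough illustration of the symptom in the quoted issue (several guids sharing one "qualifiedName"), the sketch below groups entities by name and reports only the names that occur more than once. `EntityRef` and `DuplicateQualifiedNames` are hypothetical stand-ins invented for this example, not part of the Atlas API, and the sample names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for an Atlas entity: only the two fields
// needed to illustrate the symptom (not the real Atlas entity model).
final class EntityRef {
    final String guid;
    final String qualifiedName;

    EntityRef(String guid, String qualifiedName) {
        this.guid = guid;
        this.qualifiedName = qualifiedName;
    }
}

final class DuplicateQualifiedNames {
    // Groups guids by qualifiedName and keeps only names that map to more than one guid.
    static Map<String, List<String>> findDuplicates(List<EntityRef> entities) {
        Map<String, List<String>> byName = new LinkedHashMap<>();
        for (EntityRef e : entities) {
            byName.computeIfAbsent(e.qualifiedName, k -> new ArrayList<>()).add(e.guid);
        }
        byName.values().removeIf(guids -> guids.size() < 2);
        return byName;
    }

    public static void main(String[] args) {
        List<EntityRef> entities = Arrays.asList(
            new EntityRef("guid-1", "db.table_a@cluster"),
            new EntityRef("guid-2", "db.table_b@cluster"),
            new EntityRef("guid-3", "db.table_a@cluster"));
        // prints {db.table_a@cluster=[guid-1, guid-3]}
        System.out.println(findDuplicates(entities));
    }
}
```

Such a grouping only finds the duplicates; deciding which guid is the right one to keep still depends on the differing attributes, as the issue notes.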



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ATLAS-3398) Duplicates for unique attributes

2020-07-30 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168268#comment-17168268
 ] 

Damian Warszawski commented on ATLAS-3398:
--

[~mad...@apache.org], [~amestry] I uploaded another patch to fix the unit tests. 
Please have a look at it. Thanks.



Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-30 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/
---

(Updated July 30, 2020, 10:23 p.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

rebase with master, fix field mapping for unit tests


Repository: atlas


Description
---

Optional configuration to support locks on JanusGraph to ensure data consistency.

JanusGraph is eventually consistent by default, which is efficient but results 
in duplicates when a race condition occurs.


Reference to jira 
https://issues.apache.org/jira/projects/ATLAS/issues/ATLAS-3398
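
The race mentioned in the description can be sketched with a plain check-then-insert on a unique name. This is only an illustration under stated assumptions: `UniqueNameStore` and `RaceSketch` are hypothetical in-memory stand-ins written for this example, they use an ordinary `ReentrantLock`, and they do not call any JanusGraph or Atlas API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory store; it only mimics "look up by unique attribute,
// create if missing" and is not how Atlas or JanusGraph store vertices.
final class UniqueNameStore {
    private final List<String> names = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final boolean lockEnabled;

    UniqueNameStore(boolean lockEnabled) {
        this.lockEnabled = lockEnabled;
    }

    // Check-then-insert. Without the lock, two threads can both pass the
    // "not found" check before either of them inserts.
    void createIfAbsent(String qualifiedName) {
        if (lockEnabled) lock.lock();
        try {
            if (count(qualifiedName) == 0) {
                Thread.yield(); // widen the race window for the unlocked case
                add(qualifiedName);
            }
        } finally {
            if (lockEnabled) lock.unlock();
        }
    }

    private synchronized void add(String name) {
        names.add(name);
    }

    synchronized int count(String name) {
        int n = 0;
        for (String s : names) {
            if (s.equals(name)) n++;
        }
        return n;
    }
}

final class RaceSketch {
    // Starts `writers` threads that all try to create the same name at once
    // and returns how many copies ended up in the store.
    static int run(boolean lockEnabled, int writers) {
        UniqueNameStore store = new UniqueNameStore(lockEnabled);
        CountDownLatch start = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < writers; i++) {
            Thread t = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                store.createIfAbsent("db.table_a@cluster");
            });
            threads.add(t);
            t.start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return store.count("db.table_a@cluster");
    }

    public static void main(String[] args) {
        System.out.println("with lock:    " + run(true, 8));  // always 1
        System.out.println("without lock: " + run(false, 8)); // may be greater than 1
    }
}
```

The unlocked run will not show duplicates every time, which is consistent with the note under Testing that the error could not be reproduced locally.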


Diffs (updated)
-

  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
 6ef9cb76c 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  
repository/src/test/java/org/apache/atlas/discovery/FreeTextSearchProcessorTest.java
 464b281fc 
  test-tools/src/main/resources/solr/core-template/solrconfig.xml 39cc6ab45 


Diff: https://reviews.apache.org/r/72695/diff/4/

Changes: https://reviews.apache.org/r/72695/diff/3-4/


Testing
---

It was not possible to reproduce the error on a local machine. We enabled locking on our dev 
environment and it has not introduced any regressions.


Thanks,

Damian Warszawski



Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-29 Thread Damian Warszawski
ted [4] but
> found [1]
>   AtlasDiscoveryServiceTest.query_ALLWildcardTag:165 expected [5] but
> found [2]
>   AtlasDiscoveryServiceTest.query_NOTCLASSIFIEDTag:151 expected [true] but
> found [false]
>   AtlasDiscoveryServiceTest.query_entity:214 expected [true] but found
> [false]
>   AtlasDiscoveryServiceTest.query_entity_entityFilter:228 expected [true]
> but found [false]
>   AtlasDiscoveryServiceTest.query_entity_entityFilter_tag:243 expected
> [true] but found [false]
>   AtlasDiscoveryServiceTest.query_entity_entityFilter_tag_tagFilter:260
> expected [true] but found [false]
>   AtlasDiscoveryServiceTest.query_entity_tag:289 expected [true] but found
> [false]
>   AtlasDiscoveryServiceTest.query_entity_tag_tagFilter:275 expected [true]
> but found [false]
>   AtlasDiscoveryServiceTest.query_tag:188 expected [true] but found [false]
>   AtlasDiscoveryServiceTest.query_tag_tagFilter:202 expected [true] but
> found [false]
>   AtlasDiscoveryServiceTest.query_wildcardTag:176 expected [true] but
> found [false]
>   FreeTextSearchProcessorTest.searchByNameSortBy:91 expected [3] but found
> [0]
>   FreeTextSearchProcessorTest.searchTablesByName:71 expected [3] but found
> [0]
>
> Tests run: 756, Failures: 15, Errors: 0, Skipped: 14
>
>
> On 7/29/20, 6:11 AM, "Damian Warszawski"  behalf of damian.warszaw...@gmail.com> wrote:
>
>
>
> > On July 28, 2020, 10:55 p.m., Madhan Neethiraj wrote:
> > > Ship It!
>
> Can I ask you to merge it? I don't think I have permissions to merge
> it to master.
>
>
> - Damian
>
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72695/#review221401
> ---
>
>


Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-29 Thread Damian Warszawski


> On July 28, 2020, 10:55 p.m., Madhan Neethiraj wrote:
> > Ship It!

Can I ask you to merge it? I don't think I have permissions to merge it to 
master.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/#review221401
---





Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-25 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/
---

(Updated July 25, 2020, 8:53 p.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

fix condition for uniqueness, rename property corresponding to consistency lock


Repository: atlas


Description
---

Optional configuration to support locks on JanusGraph to ensure data consistency.

JanusGraph is eventually consistent by default, which is efficient but results 
in duplicates when a race condition occurs.


Reference to jira 
https://issues.apache.org/jira/projects/ATLAS/issues/ATLAS-3398


Diffs (updated)
-

  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
 6ef9cb76c 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 


Diff: https://reviews.apache.org/r/72695/diff/3/

Changes: https://reviews.apache.org/r/72695/diff/2-3/


Testing
---

It was not possible to reproduce the error on a local machine. We enabled locking on our dev 
environment and it has not introduced any regressions.


Thanks,

Damian Warszawski



Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-24 Thread Damian Warszawski


> On July 23, 2020, 4:52 a.m., Ashutosh Mestry wrote:
> > repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
> > Lines 44 (patched)
> > <https://reviews.apache.org/r/72695/diff/1/?file=2236038#file2236038line44>
> >
> > I suggest using AtlasConfiguration for this. Also, I think this should be 
> > true by default.
> 
> Damian Warszawski wrote:
> Great simplification. Thanks Ashutosh.

I'm not sure that it should be true by default as long as we don't have proper 
benchmarks showing how much performance degrades.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/#review221323
---





Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-24 Thread Damian Warszawski


> On July 23, 2020, 9:28 p.m., Ashutosh Mestry wrote:
> > repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
> > Line 39 (original), 39 (patched)
> > <https://reviews.apache.org/r/72695/diff/1/?file=2236038#file2236038line39>
> >
> > We will need to add another *JavaPatch* to update existing data.

Could you elaborate on that? Does the locking itself change the structure of the 
index?
We usually do a full import when the index is changed.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/#review221338
---





Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-24 Thread Damian Warszawski


> On July 23, 2020, 4:52 a.m., Ashutosh Mestry wrote:
> > repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
> > Lines 44 (patched)
> > <https://reviews.apache.org/r/72695/diff/1/?file=2236038#file2236038line44>
> >
> > I suggest using AtlasConfiguration for this. Also, I think this should be 
> > true by default.

Great simplification. Thanks Ashutosh.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/#review221323
---





Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-24 Thread Damian Warszawski


> On July 21, 2020, 9:06 p.m., Madhan Neethiraj wrote:
> > graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
> > Lines 260 (patched)
> > <https://reviews.apache.org/r/72695/diff/1/?file=2236033#file2236033line260>
> >
> > Consistency lock might be relevant/needed only for unique-index i.e. 
> > isUnique=true. If this is true, consider calling setConsistency(LOCK) when 
> > isUnique is true, without requiring an additional argument. The same applies to 
> > #278 as well.
> > 
> > It would be useful to support the following configuration, to optionally 
> > disable the consistency lock:
> >   atlas.graph.storage.unique-key.consitency-lock.enabled
> > 
> > This configuration can be sent to AtlasJanusGraphManagement during 
> > construction - from AtlasJanusGraph.getManagementSystem().
> > 
> > The above will avoid updates to many methods for the addition of the 
> > lockEnabled argument.

Great suggestion. Changed accordingly. Thanks.


- Damian
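
The reviewer's suggestion quoted above (apply the lock only to unique indexes, gated by one flag passed at construction) might look roughly like the sketch below. The `Consistency` enum and `IndexManagementSketch` class are hypothetical stand-ins so that the example compiles without JanusGraph on the classpath; they are not the code from the patch. In JanusGraph itself the corresponding call is `JanusGraphManagement.setConsistency(element, ConsistencyModifier.LOCK)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the consistency setting of an index; hypothetical, modelled
// loosely on JanusGraph's ConsistencyModifier.
enum Consistency { DEFAULT, LOCK }

// Hypothetical stand-in for a graph-management wrapper. The flag is passed
// once at construction, so index-creation methods need no extra argument.
final class IndexManagementSketch {
    private final boolean lockEnabled;
    private final Map<String, Consistency> indexes = new LinkedHashMap<>();

    IndexManagementSketch(boolean lockEnabled) {
        this.lockEnabled = lockEnabled;
    }

    // Only unique indexes get the lock, and only when the flag is on.
    void createIndex(String propertyName, boolean isUnique) {
        Consistency c = (isUnique && lockEnabled) ? Consistency.LOCK : Consistency.DEFAULT;
        indexes.put(propertyName, c);
    }

    Consistency consistencyOf(String propertyName) {
        return indexes.get(propertyName);
    }

    public static void main(String[] args) {
        IndexManagementSketch mgmt = new IndexManagementSketch(true);
        mgmt.createIndex("qualifiedName", true);
        mgmt.createIndex("description", false);
        System.out.println(mgmt.consistencyOf("qualifiedName")); // LOCK
        System.out.println(mgmt.consistencyOf("description"));   // DEFAULT
    }
}
```

Keeping the flag in the constructor means callers that create indexes do not change, which is the point made in the review comment about avoiding a `lockEnabled` argument on many methods.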


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/#review221297
---





Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-24 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/
---

(Updated July 24, 2020, 8:23 a.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

move locking conf to AtlasConfiguration


Repository: atlas


Description
---

Optional configuration to support locks on JanusGraph to ensure data consistency.

JanusGraph is eventually consistent by default, which is efficient but results 
in duplicates when a race condition occurs.


Reference to jira 
https://issues.apache.org/jira/projects/ATLAS/issues/ATLAS-3398


Diffs (updated)
-

  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
 6ef9cb76c 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 


Diff: https://reviews.apache.org/r/72695/diff/2/

Changes: https://reviews.apache.org/r/72695/diff/1-2/


Testing
---

It was not possible to reproduce the error on a local machine. We enabled locking on our dev 
environment and it has not introduced any regressions.


Thanks,

Damian Warszawski



[jira] [Commented] (ATLAS-3398) Duplicates for unique attributes

2020-07-21 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161884#comment-17161884
 ] 

Damian Warszawski commented on ATLAS-3398:
--

According to our observations, this can be a race condition between hive-hook 
(kafka event) and the profiler (org.apache.atlas:atlas-client-v2:2.0.0). It can 
also happen with atlas-client, which makes implicit retries while calling the 
Atlas API. 



[jira] [Commented] (ATLAS-3398) Duplicates for unique attributes

2020-07-20 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161569#comment-17161569
 ] 

Damian Warszawski commented on ATLAS-3398:
--

Optional configuration to support locks on JanusGraph to ensure data consistency 
-> [https://reviews.apache.org/r/72695/]



Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-20 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/
---

Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Repository: atlas


Description
---

Optional configuration to support locks on JanusGraph to ensure data consistency.

JanusGraph is eventually consistent by default, which is efficient but results 
in duplicates when a race condition occurs.


Reference to jira 
https://issues.apache.org/jira/projects/ATLAS/issues/ATLAS-3398


Diffs
-

  
graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasGraphManagement.java
 fca789027 
  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
 6ef9cb76c 
  
graphdb/janus/src/test/java/org/apache/atlas/repository/graphdb/janus/AbstractGraphDatabaseTest.java
 35004157f 
  
graphdb/janus/src/test/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusDatabaseTest.java
 5cd55093e 
  intg/src/main/java/org/apache/atlas/ApplicationProperties.java e662c8fae 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 e35f3594f 
  
repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
 5a9ac2abe 
  
repository/src/main/java/org/apache/atlas/repository/patches/UniqueAttributePatch.java
 d3111f110 


Diff: https://reviews.apache.org/r/72695/diff/1/


Testing
---

It was not possible to reproduce the error on a local machine. We enabled locking on our dev 
environment and it has not introduced any regressions.


Thanks,

Damian Warszawski



[jira] [Commented] (ATLAS-3758) Support sort params for FreeTextSearchProcessor

2020-06-01 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17121023#comment-17121023
 ] 

Damian Warszawski commented on ATLAS-3758:
--

Thanks for the update.

> Support sort params for FreeTextSearchProcessor
> ---
>
> Key: ATLAS-3758
> URL: https://issues.apache.org/jira/browse/ATLAS-3758
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Fix For: 2.1.0, 3.0.0
>
> Attachments: ATLAS-3758-2-branch-2.0.patch, ATLAS-3758.patch
>
>
> *Problem description*
> There is no way to sort results by a specified attribute while free-text search is 
> enabled.
> *Goals*
> As a team we are working to use Atlas as metadata storage for 
> [https://github.com/lyft/amundsen]. It is required to sort results by any 
> particular attribute, e.g. a custom attribute which represents a popularity score, 
> to provide basic search relevancy for end users.
> *Proposed solution*
>  * add the required parameters to the indexed query if specified





[jira] [Closed] (ATLAS-3654) Support solr in standalone (http) mode

2020-05-27 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski closed ATLAS-3654.


fixed and merged

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Fix For: 2.1.0, 3.0.0
>
> Attachments: ATLAS-3654.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  Standalone mode is especially useful for testing purposes to make setup as simple as 
> possible without Zookeeper. It also enables full integration with JanusGraph, 
> as it supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is to 
> decouple hbase and solr while running in embedded mode, so that solr can be run 
> in embedded mode with an external hbase.
> *Proposed solution*
>  * call the solr V1 API while creating/updating request handlers in standalone 
> solr
>  * update the atlas start script to enable standalone embedded solr
>  





Re: Review Request 72441: Support solr in standalone (http) mode

2020-05-26 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72441/
---

(Updated May 26, 2020, 9:58 a.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

setting default value for solr mode to cloud


Repository: atlas


Description
---

Atlas does not support running Solr in standalone (http) mode.

Standalone mode is especially useful for testing purposes to make setup as simple as 
possible without Zookeeper. It also enables full integration with JanusGraph, as 
it supports both modes of running Solr, `cloud` and `http` 
(https://docs.janusgraph.org/index-backend/solr/). An additional benefit is to 
decouple hbase and solr while running in embedded mode, so that solr can be run in 
embedded mode with an external hbase.

Proposed solution

 * call the solr V1 API while creating/updating request handlers in standalone solr
 * update the atlas start script to enable standalone embedded solr

Reference to jira https://issues.apache.org/jira/browse/ATLAS-3654
Patch was applied against master branch


Diffs (updated)
-

  distro/src/bin/atlas_config.py f09026ff9 
  docs/src/documents/Setup/InstallationInstruction.md d1b22d624 
  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphIndexClient.java
 ba65f3d00 
  graphdb/janus/src/main/java/org/janusgraph/diskstorage/solr/Solr6Index.java 
484c161f0 


Diff: https://reviews.apache.org/r/72441/diff/4/

Changes: https://reviews.apache.org/r/72441/diff/3-4/


Testing
---

Patch was applied and verified on our dev env with embedded solr and external 
hbase.


Thanks,

Damian Warszawski



Re: Review Request 72441: Support solr in standalone (http) mode

2020-05-25 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72441/
---

(Updated May 25, 2020, 8:40 a.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

fix markdown issues


Repository: atlas


Description
---

Atlas does not support running Solr in standalone (http) mode.

Standalone mode is especially useful for testing purposes to make setup as simple as 
possible without Zookeeper. It also enables full integration with JanusGraph, as 
it supports both modes of running Solr, `cloud` and `http` 
(https://docs.janusgraph.org/index-backend/solr/). An additional benefit is to 
decouple hbase and solr while running in embedded mode, so that solr can be run in 
embedded mode with an external hbase.

Proposed solution

 * call the solr V1 API while creating/updating request handlers in standalone solr
 * update the atlas start script to enable standalone embedded solr

Reference to jira https://issues.apache.org/jira/browse/ATLAS-3654
Patch was applied against master branch


Diffs (updated)
-

  distro/src/bin/atlas_config.py f09026ff9 
  docs/src/documents/Setup/InstallationInstruction.md d1b22d624 
  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphIndexClient.java
 ba65f3d00 
  graphdb/janus/src/main/java/org/janusgraph/diskstorage/solr/Solr6Index.java 
484c161f0 


Diff: https://reviews.apache.org/r/72441/diff/3/

Changes: https://reviews.apache.org/r/72441/diff/2-3/


Testing
---

Patch was applied and verified on our dev env with embedded solr and external 
hbase.


Thanks,

Damian Warszawski



Re: Review Request 72441: Support solr in standalone (http) mode

2020-05-19 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72441/
---

(Updated May 19, 2020, 10:02 p.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

modified installation instruction


Repository: atlas


Description
---

Atlas does not support running Solr in standalone (http) mode.

Standalone mode is especially useful for testing purposes to make setup as simple as 
possible without Zookeeper. It also enables full integration with JanusGraph, as 
it supports both modes of running Solr, `cloud` and `http` 
(https://docs.janusgraph.org/index-backend/solr/). An additional benefit is to 
decouple hbase and solr while running in embedded mode, so that solr can be run in 
embedded mode with an external hbase.

Proposed solution

 * call the solr V1 API while creating/updating request handlers in standalone solr
 * update the atlas start script to enable standalone embedded solr

Reference to jira https://issues.apache.org/jira/browse/ATLAS-3654
Patch was applied against master branch


Diffs (updated)
-

  distro/src/bin/atlas_config.py f09026ff9 
  docs/src/documents/Setup/InstallationInstruction.md d1b22d624 
  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphIndexClient.java
 ba65f3d00 
  graphdb/janus/src/main/java/org/janusgraph/diskstorage/solr/Solr6Index.java 
484c161f0 


Diff: https://reviews.apache.org/r/72441/diff/2/

Changes: https://reviews.apache.org/r/72441/diff/1-2/


Testing
---

Patch was applied and verified on our dev env with embedded solr and external 
hbase.


Thanks,

Damian Warszawski



Re: Review Request 72441: Support solr in standalone (http) mode

2020-05-19 Thread Damian Warszawski


> On May 6, 2020, 7:06 p.m., Sarath Subramanian wrote:
> > can you add a small readme or doc on steps to start atlas with standalone 
> > solr (http) and attach it to JIRA.

Modified the installation instructions. Please have a look at the latest patch.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72441/#review220661
---





[jira] [Closed] (ATLAS-3776) graph query fails when orderBy attribute is specified

2020-05-14 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski closed ATLAS-3776.


fixed and merged.

> graph query fails when orderBy attribute is specified
> -
>
> Key: ATLAS-3776
> URL: https://issues.apache.org/jira/browse/ATLAS-3776
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Affects Versions: 3.0.0
> Reporter: Damian Warszawski
> Priority: Minor
> Fix For: 2.1.0, 3.0.0
>
>
> EntitySearchProcessor fails when searching by classification with an orderBy 
> attribute specified. The issue is that a graph query cannot refer to an 
> attribute by name; it needs the absolute path to the entity attribute, e.g. 
>  
> ```
> { "attributes": [ "description", "comment", "popularityScore" ], 
> "classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
> "limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
> "DESCENDING", "typeName": "hive_table" }
> ```
> This query fails with the following exception:
>  
> ```
> {"exception":{"message":"Provided key does not exist: 
> Table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
>  Provided key does not exist: hive_table.popularityScore\n\tat 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
>  org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
>  orderBy(GraphCentricQueryBuilder.java:160)
> ```
>  
> When the full reference to the attribute is specified, e.g. 
>  
> ```
> { "attributes": [ "description", "comment", "popularityScore" ], 
> "classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
> "limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
> "DESCENDING", "typeName": "hive_table" }
> ```
> it fails at the validation stage:
>  
> ```
> {"exception":{"message":"Attribute Table.popularityScore not found for type 
> Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
>  Attribute Table.popularityScore not found for type Table\n\tat 
> org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)
> ```
> A workaround is provided as a patch.





[jira] [Closed] (ATLAS-3758) Support sort params for FreeTextSearchProcessor

2020-05-14 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski closed ATLAS-3758.


Fixed and merged.

> Support sort params for FreeTextSearchProcessor
> ---
>
> Key: ATLAS-3758
> URL: https://issues.apache.org/jira/browse/ATLAS-3758
> Project: Atlas
> Issue Type: Improvement
> Components: atlas-core
> Affects Versions: 3.0.0
> Reporter: Damian Warszawski
> Priority: Minor
> Fix For: 2.1.0, 3.0.0
>
> Attachments: ATLAS-3758.patch
>
>
> *Problem description*
> There is no way to sort results by a specified attribute when free-text 
> search is enabled.
> *Goals*
> Our team is working on using Atlas as the metadata store for 
> [https://github.com/lyft/amundsen]. We need to sort results by an arbitrary 
> attribute, e.g. a custom attribute representing a popularity score, to provide 
> basic search relevancy for end users.
> *Proposed solution*
>  * add the required sort parameters to the indexed query if they are specified
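
A minimal sketch of the "add the sort parameters only if specified" idea, assuming the indexed query is ultimately sent to Solr, whose `sort` parameter takes the form `<field> asc|desc`. The class name, method name, and the field name used in `main` are hypothetical. The actual patch resolves the attribute to its index field name inside Atlas, which is not reproduced here.

```java
import java.util.Optional;

final class SolrSortSketch {
    // Returns a Solr-style sort clause, or empty when no sort field was requested.
    static Optional<String> sortClause(String indexFieldName, boolean ascending) {
        if (indexFieldName == null || indexFieldName.trim().isEmpty()) {
            return Optional.empty(); // nothing specified: leave the query unsorted
        }
        return Optional.of(indexFieldName.trim() + (ascending ? " asc" : " desc"));
    }

    public static void main(String[] args) {
        // "popularityScore_d" is a made-up index field name.
        System.out.println(sortClause("popularityScore_d", false).orElse("<no sort>"));
        System.out.println(sortClause(null, true).orElse("<no sort>"));
    }
}
```

Returning an empty `Optional` keeps the unsorted path identical to the previous behaviour, so callers that never pass a sort attribute are unaffected.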





[jira] [Closed] (ATLAS-3760) Optimize FreeTextSearchProcessor to apply exclude deleted entity filter on solr side.

2020-05-14 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski closed ATLAS-3760.


Fixed and merged

> Optimize FreeTextSearchProcessor to apply exclude deleted entity  filter on 
> solr side.
> --
>
> Key: ATLAS-3760
> URL: https://issues.apache.org/jira/browse/ATLAS-3760
> Project: Atlas
> Issue Type: Improvement
> Components: atlas-core
> Reporter: Damian Warszawski
> Priority: Minor
> Fix For: 2.1.0, 3.0.0
>
>
> *Problem description*
> The current implementation of FreeTextSearchProcessor filters out deleted 
> entities in memory.
> This introduces significant performance overhead because it generates 
> redundant calls to the solr index. 
> *Goals*
> Improve the performance of FreeTextSearchProcessor by applying the filter in 
> the solr query.
> *Proposed solution*
>  * replace in-memory filtering with a filter in the solr query.
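
The proposed change can be pictured roughly as follows: instead of fetching results and discarding deleted entities afterwards, the exclusion is expressed as a Solr filter query (`fq`) so the index never returns them. This is only a sketch. The field name passed in `main` and the `DELETED` value are assumptions about how entity state might be indexed, and the helper is not the code from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SolrFilterSketch {
    // Builds request parameters for a free-text query, pushing the
    // "exclude deleted" condition down to Solr as a negative filter query.
    static Map<String, String> buildParams(String freeText, boolean excludeDeleted, String stateField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", freeText);
        if (excludeDeleted) {
            // stateField is a hypothetical index field holding the entity state.
            params.put("fq", "-" + stateField + ":DELETED");
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(buildParams("sales", true, "entity_state_s"));
    }
}
```

With the filter applied on the index side, a page of results already contains only non-deleted entities, so no follow-up calls are needed to refill the page.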





Re: Review Request 72440: Support sort params for FreeTextSearchProcessor

2020-05-13 Thread Damian Warszawski


> On May 7, 2020, 12:21 a.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java
> > Lines 995 (patched)
> > <https://reviews.apache.org/r/72440/diff/1/?file=2228710#file2228710line995>
> >
> > entityType could be null when called from FreeTextSearchProcessor. 
> > Please review and update to handle this condition.

Good point, additional check added.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72440/#review220666
-------





Re: Review Request 72440: Support sort params for FreeTextSearchProcessor

2020-05-13 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72440/
---

(Updated May 13, 2020, 10:51 p.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, Madhan Neethiraj, 
and Sarath Subramanian.


Changes
---

add unit tests for freetext search processor


Repository: atlas


Description
---

There is no way to sort results by a specified attribute when free-text search 
is enabled. In our case we would like to enforce ordering by introducing a 
custom attribute definition, e.g. the popularity score from 
https://github.com/dwarszawski/amundsen-atlas-types/blob/master/amundsenatlastypes/schema/01_2_table_schema.json


Reference to jira https://issues.apache.org/jira/browse/ATLAS-3758
Patch applied against master branch.


Diffs (updated)
-

  repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java fb12244ed
  repository/src/main/java/org/apache/atlas/discovery/FreeTextSearchProcessor.java 9850d8ecf
  repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java 11eb7ca49
  repository/src/test/java/org/apache/atlas/discovery/FreeTextSearchProcessorTest.java PRE-CREATION
  test-tools/src/main/resources/solr/core-template/solrconfig.xml 9264f99d4


Diff: https://reviews.apache.org/r/72440/diff/2/

Changes: https://reviews.apache.org/r/72440/diff/1-2/


Testing
---

The patch was applied on our dev env with custom entity definitions, and we 
verified that results are ordered as specified in the search query.


Thanks,

Damian Warszawski



Re: Review Request 72440: Support sort params for FreeTextSearchProcessor

2020-05-13 Thread Damian Warszawski


> On May 6, 2020, 6:57 a.m., Bolke de Bruin wrote:
> > Can you please update the tests?
> 
> Sarath Subramanian wrote:
> +1

Introduced unit tests for FreeTextSearchProcessor.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72440/#review220648
---





Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-07 Thread Damian Warszawski


> On May 6, 2020, 10:59 p.m., Madhan Neethiraj wrote:
> > graphdb/common/src/main/java/org/apache/atlas/repository/graphdb/tinkerpop/query/TinkerpopGraphQuery.java
> > Line 215 (original), 215 (patched)
> > <https://reviews.apache.org/r/72459/diff/4/?file=2230572#file2230572line215>
> >
> > Why is it necessary to switch to LinkedHashSet here?

Unlike HashSet, LinkedHashSet maintains insertion order. The provided test for 
search with sortBy fails when the underlying structure is a HashSet.
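
To illustrate the point with plain JDK collections (the ids below are made up, and this is not the Atlas code itself): copying an already sorted sequence into a LinkedHashSet preserves the order, while a HashSet gives no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class InsertionOrderDemo {
    // Adds already-sorted ids to the given set and returns the set's iteration order.
    static List<String> roundTrip(List<String> sortedIds, Set<String> target) {
        target.addAll(sortedIds);
        return new ArrayList<>(target);
    }

    public static void main(String[] args) {
        // Pretend these ids came back from a query sorted by popularity, descending.
        List<String> sorted = Arrays.asList("table_9", "table_1", "table_5");

        // LinkedHashSet iterates in insertion order, so the sort survives.
        System.out.println(roundTrip(sorted, new LinkedHashSet<>()));

        // HashSet iterates in hash-bucket order, which may or may not match.
        System.out.println(roundTrip(sorted, new HashSet<>()));
    }
}
```

Because HashSet iteration order depends on hash codes and table size, a sortBy test backed by it can pass for some data and fail for other data.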


> On May 6, 2020, 10:59 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
> > Line 203 (original), 204 (patched)
> > <https://reviews.apache.org/r/72459/diff/4/?file=2230573#file2230573line204>
> >
> > Instead of ., I suggest the following:
> > 
> >   AtlasAttribute sortByAttribute = context.getEntityType().getAttribute(sortBy);
> > 
> >   if (sortByAttribute != null) {
> >     AtlasGraphQuery.SortOrder qrySortOrder = sortOrder == SortOrder.ASCENDING ? ASC : DESC;
> > 
> >     graphQuery.orderBy(sortByAttribute.getVertexPropertyName(), qrySortOrder);
> >   }
> > 
> > This will ensure that sortBy will work for system-attributes as well - 
> > like __timestamp, __modificationTimestamp.

Great suggestion. Fixed, thanks.


> On May 6, 2020, 10:59 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
> > Lines 367 (patched)
> > <https://reviews.apache.org/r/72459/diff/4/?file=2230573#file2230573line367>
> >
> > I suggest to replace use of guava library class with JDK equivalent: 
> >   StreamSupport.stream(graphQuery.vertexIds().iterator(), false).count()

Fixed.
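
For reference, a small JDK-only sketch of counting the elements of an `Iterable` without Guava. `StreamSupport.stream` accepts a `Spliterator`, so the sketch uses `Iterable.spliterator()`. The list of ids stands in for whatever the graph query returns and is purely illustrative.

```java
import java.util.Arrays;
import java.util.stream.StreamSupport;

final class IterableCountDemo {
    // Counts the elements of any Iterable using only JDK classes.
    static long count(Iterable<?> items) {
        // false = sequential stream; the elements are consumed exactly once.
        return StreamSupport.stream(items.spliterator(), false).count();
    }

    public static void main(String[] args) {
        Iterable<Long> vertexIds = Arrays.asList(101L, 102L, 103L);
        System.out.println(count(vertexIds)); // prints 3
    }
}
```

Note that counting this way still walks the whole iterable, so the cost is proportional to the number of results.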


- Damian


-------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/#review220665
---



Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-07 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/
---

(Updated May 7, 2020, 8:03 a.m.)


Review request for atlas, Bolke de Bruin, Madhan Neethiraj, and Nixon Rodrigues.


Repository: atlas


Description
---

EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

Reference to JIRA https://issues.apache.org/jira/browse/ATLAS-3776


Diffs (updated)
-

  graphdb/common/src/main/java/org/apache/atlas/repository/graphdb/tinkerpop/query/TinkerpopGraphQuery.java c70e8bfe8
  repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 1a7bf6b16
  repository/src/test/java/org/apache/atlas/discovery/EntitySearchProcessorTest.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/query/BasicTestSetup.java 9aa554ad5
  repository/src/test/java/org/apache/atlas/query/DSLQueriesTest.java 0bbff2f46


Diff: https://reviews.apache.org/r/72459/diff/5/

Changes: https://reviews.apache.org/r/72459/diff/4-5/


Testing
---

tested on our dev env.


Thanks,

Damian Warszawski



Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-06 Thread Damian Warszawski


> On May 6, 2020, 6:49 p.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
> > Lines 369 (patched)
> > <https://reviews.apache.org/r/72459/diff/4/?file=2230573#file2230573line369>
> >
> > typo maybe? => 'return -1'

As the return type is long, I made it explicit with the `l` suffix. It 
compiles properly.
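
A tiny standalone illustration of the literal in question (the method and constant names are hypothetical): `-1`, `-1l`, and `-1L` all yield the same `long` value when returned from a method declared to return `long`. The uppercase `L` is usually preferred only because a lowercase `l` is easy to misread as the digit `1`.

```java
final class LongLiteralDemo {
    // Sentinel meaning "count not available"; uppercase L avoids the l-versus-1 confusion.
    static final long UNKNOWN_COUNT = -1L;

    // Returns the actual count when it is known, otherwise the sentinel.
    static long resultCount(boolean countKnown, long actual) {
        return countKnown ? actual : UNKNOWN_COUNT;
    }

    public static void main(String[] args) {
        System.out.println(resultCount(false, 42L)); // prints -1
        System.out.println(-1l == -1L);              // prints true
    }
}
```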


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/#review220660
---





Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-05 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/
---

(Updated May 5, 2020, 10:11 p.m.)


Review request for atlas, Bolke de Bruin, Madhan Neethiraj, and Nixon Rodrigues.


Changes
---

Fixed orderBy for graphQuery by replacing HashSet with LinkedHashSet to 
maintain insertion order.


Repository: atlas


Description
---

EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

Reference to JIRA https://issues.apache.org/jira/browse/ATLAS-3776


Diffs (updated)
-

  graphdb/common/src/main/java/org/apache/atlas/repository/graphdb/tinkerpop/query/TinkerpopGraphQuery.java c70e8bfe8
  repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 1a7bf6b16
  repository/src/test/java/org/apache/atlas/discovery/EntitySearchProcessorTest.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/query/BasicTestSetup.java 9aa554ad5
  repository/src/test/java/org/apache/atlas/query/DSLQueriesTest.java 0bbff2f46


Diff: https://reviews.apache.org/r/72459/diff/4/

Changes: https://reviews.apache.org/r/72459/diff/3-4/


Testing
---

tested on our dev env.


Thanks,

Damian Warszawski



Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-05 Thread Damian Warszawski


> On May 2, 2020, 2:44 p.m., Bolke de Bruin wrote:
> > repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java
> > Line 203 (original), 203 (patched)
> > <https://reviews.apache.org/r/72459/diff/1/?file=2229558#file2229558line203>
> >
> > can you add a test please?

Added a bunch of tests and introduced a small fix to get the right result count 
for graphQuery. The sortBy test behaves nondeterministically. I think the 
implementation of sortBy on graphQuery never worked properly.


- Damian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/#review220587
-------





Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-05 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/
---

(Updated May 5, 2020, 3:26 p.m.)


Review request for atlas, Bolke de Bruin, Madhan Neethiraj, and Nixon Rodrigues.


Changes
---

optimize getResultCount() for graphQuery


Repository: atlas


Description
---

EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

Reference to JIRA https://issues.apache.org/jira/browse/ATLAS-3776
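
A minimal sketch of the name mismatch described above, assuming a simplified type registry. `TYPES` and `resolve_sort_key` are hypothetical names for illustration only and are not Atlas code. The sketch also assumes that `popularityScore` is declared on a supertype `Table` of `hive_table`, which the two error messages suggest but the description does not state outright.

```python
# Hypothetical, simplified type registry: each type lists the attributes it
# declares itself and its direct supertypes. This is not the Atlas type system.
TYPES = {
    "Table": {"attributes": {"popularityScore"}, "super_types": []},
    "hive_table": {"attributes": {"comment"}, "super_types": ["Table"]},
}


def resolve_sort_key(type_name, attr_name, types=TYPES):
    """Return '<declaringType>.<attr>' for the type that declares attr_name.

    Checks type_name first, then its supertypes depth-first.
    Raises KeyError if no type in the hierarchy declares the attribute.
    """
    type_def = types[type_name]
    if attr_name in type_def["attributes"]:
        return f"{type_name}.{attr_name}"
    for parent in type_def["super_types"]:
        try:
            return resolve_sort_key(parent, attr_name, types)
        except KeyError:
            continue
    raise KeyError(f"{attr_name} not found for type {type_name}")


# Prefixing the attribute with the requested type yields a key that no type
# declares, which matches the "Provided key does not exist" failure.
naive_key = "hive_table" + "." + "popularityScore"
resolved_key = resolve_sort_key("hive_table", "popularityScore")
print(naive_key, "->", resolved_key)
```

Under these assumptions the caller keeps passing the short attribute name, and the lookup for the graph property key happens on the server side.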


Diffs (updated)
-

  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
1a7bf6b16 
  
repository/src/test/java/org/apache/atlas/discovery/EntitySearchProcessorTest.java
 PRE-CREATION 
  repository/src/test/java/org/apache/atlas/query/BasicTestSetup.java 9aa554ad5 
  repository/src/test/java/org/apache/atlas/query/DSLQueriesTest.java 0bbff2f46 


Diff: https://reviews.apache.org/r/72459/diff/3/

Changes: https://reviews.apache.org/r/72459/diff/2-3/


Testing
---

tested on our dev env.


Thanks,

Damian Warszawski



Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-05 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/
---

(Updated May 5, 2020, 11:45 a.m.)


Review request for atlas, Bolke de Bruin, Madhan Neethiraj, and Nixon Rodrigues.


Repository: atlas


Description
---

EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

Reference to JIRA https://issues.apache.org/jira/browse/ATLAS-3776


Diffs (updated)
-

  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
1a7bf6b16 
  
repository/src/test/java/org/apache/atlas/discovery/EntitySearchProcessorTest.java
 PRE-CREATION 
  repository/src/test/java/org/apache/atlas/query/BasicTestSetup.java 9aa554ad5 
  repository/src/test/java/org/apache/atlas/query/DSLQueriesTest.java 0bbff2f46 


Diff: https://reviews.apache.org/r/72459/diff/2/

Changes: https://reviews.apache.org/r/72459/diff/1-2/


Testing
---

tested on our dev env.


Thanks,

Damian Warszawski



Re: Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-05 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/
---

(Updated May 5, 2020, 11:45 a.m.)


Review request for atlas, Bolke de Bruin, Madhan Neethiraj, and Nixon Rodrigues.


Repository: atlas


Description
---

EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

Reference to JIRA https://issues.apache.org/jira/browse/ATLAS-3776


Diffs
-

  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
1a7bf6b16 


Diff: https://reviews.apache.org/r/72459/diff/1/


Testing
---

tested on our dev env.


File Attachments (updated)


ATLAS-3776-1.patch
  
https://reviews.apache.org/media/uploaded/files/2020/05/05/96de2792-c75b-4440-8368-8791be7b005f__ATLAS-3776-1.patch


Thanks,

Damian Warszawski



[jira] [Updated] (ATLAS-3776) graph query fails when orderBy attribute is specified

2020-05-01 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3776:
-
Description: 
EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }

```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
Table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
 orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }

```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

A workaround is provided as a patch.

  was:
EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{
 "attributes": [
 "description",
 "comment",
 "popularityScore"
 ],
 "classification": "customer_NON_PII",
 "excludeDeletedEntities": "False",
 "limit": "",
 "offset": 100,
 "sortBy": "Table.popularityScore",
 "sortOrder": "DESCENDING",
 "typeName": "hive_table"
}

```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{
 "attributes": [
 "description",
 "comment",
 "popularityScore"
 ],
 "classification": "customer_NON_PII",
 "excludeDeletedEntities": "False",
 "limit": "",
 "offset": 100,
 "sortBy": "Table.popularityScore",
 "sortOrder": "DESCENDING",
 "typeName": "hive_table"
}

```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

A workaround is provided as a patch.


> graph query fails when orderBy attribute is specified
> -
>
> Key: ATLAS-3776
> URL: https://issues.apache.org/jira/browse/ATLAS-3776
> Project: Atlas
>  Issu

Review Request 72459: EntitySearchProcessor is failing on graph query with sortBy attribute.

2020-05-01 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72459/
---

Review request for atlas, Bolke de Bruin, Madhan Neethiraj, and Nixon Rodrigues.


Repository: atlas


Description
---

EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{ "attributes": [ "description", "comment", "popularityScore" ], 
"classification": "customer_NON_PII", "excludeDeletedEntities": "False", 
"limit": "", "offset": 100, "sortBy": "Table.popularityScore", "sortOrder": 
"DESCENDING", "typeName": "hive_table" }
```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

Reference to JIRA https://issues.apache.org/jira/browse/ATLAS-3776


Diffs
-

  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
1a7bf6b16 


Diff: https://reviews.apache.org/r/72459/diff/1/


Testing
---

tested on our dev env.


Thanks,

Damian Warszawski



[jira] [Created] (ATLAS-3776) graph query fails when orderBy attribute is specified

2020-05-01 Thread Damian Warszawski (Jira)
Damian Warszawski created ATLAS-3776:


 Summary: graph query fails when orderBy attribute is specified
 Key: ATLAS-3776
 URL: https://issues.apache.org/jira/browse/ATLAS-3776
 Project: Atlas
  Issue Type: Bug
  Components:  atlas-core
Affects Versions: 3.0.0
Reporter: Damian Warszawski


EntitySearchProcessor fails when searching by classification with an orderBy 
attribute specified. The issue is that a graph query cannot refer to an 
attribute by name; it needs the absolute path to the entity attribute, e.g. 

 

```

{
 "attributes": [
 "description",
 "comment",
 "popularityScore"
 ],
 "classification": "customer_NON_PII",
 "excludeDeletedEntities": "False",
 "limit": "",
 "offset": 100,
 "sortBy": "Table.popularityScore",
 "sortOrder": "DESCENDING",
 "typeName": "hive_table"
}

```

This query fails with the following exception:

 

```

{"exception":{"message":"Provided key does not exist: 
hive_table.popularityScore","class":"java.lang.IllegalArgumentException","stacktrace":"java.lang.IllegalArgumentException:
 Provided key does not exist: hive_table.popularityScore\n\tat 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)\n\tat
 org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.
orderBy(GraphCentricQueryBuilder.java:160)

```

 

When the full reference to the attribute is specified, e.g. 

 

```

{
 "attributes": [
 "description",
 "comment",
 "popularityScore"
 ],
 "classification": "customer_NON_PII",
 "excludeDeletedEntities": "False",
 "limit": "",
 "offset": 100,
 "sortBy": "Table.popularityScore",
 "sortOrder": "DESCENDING",
 "typeName": "hive_table"
}

```

it fails at the validation stage:

 

```

{"exception":{"message":"Attribute Table.popularityScore not found for type 
Table","class":"org.apache.atlas.exception.AtlasBaseException","stacktrace":"org.apache.atlas.exception.AtlasBaseException:
 Attribute Table.popularityScore not found for type Table\n\tat 
org.apache.atlas.discovery.SearchContext.validateAttributes(SearchContext.java:288)

```

A workaround is provided as a patch.





[jira] [Comment Edited] (ATLAS-3654) Support solr in standalone (http) mode

2020-04-29 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095892#comment-17095892
 ] 

Damian Warszawski edited comment on ATLAS-3654 at 4/29/20, 9:10 PM:


[~nixon],

It is controlled by the following application property, 
`atlas.graph.index.search.solr.mode`, which is also used by JanusGraph. 

The package is built with the `embedded-hbase-solr` profile, as it used to be 
for `cloud` mode, for compatibility reasons.

Perhaps it would be useful to create another profile for `embedded-solr` only. 
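
As a rough illustration of how such a property might be read, here is a minimal sketch. Only the key `atlas.graph.index.search.solr.mode` and the two values `cloud` and `http` come from this thread; the helper names, the `cloud` default, and the sample properties text are assumptions made for the example.

```python
def parse_properties(text):
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


SOLR_MODE_KEY = "atlas.graph.index.search.solr.mode"


def solr_mode(props, default="cloud"):
    """Return the configured Solr mode; the 'cloud' default is an assumption."""
    mode = props.get(SOLR_MODE_KEY, default).lower()
    if mode not in ("cloud", "http"):
        raise ValueError(f"unsupported solr mode: {mode}")
    return mode


# Hypothetical properties snippet selecting standalone Solr (no Zookeeper).
example = """
# standalone Solr
atlas.graph.index.search.solr.mode=http
"""
print(solr_mode(parse_properties(example)))
```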

 


was (Author: dwarszawski):
[~nixon],

It is controlled by the following application property, 
`atlas.graph.index.search.solr.mode`, which is also used by JanusGraph. 

The package is built with the `embedded-hbase-solr` profile, as it used to be 
for `cloud` mode, for compatibility reasons.

Perhaps it would be useful to create another profile for `embedded-solr` only. 

 

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  It is especially useful for testing purposes to make setup as simple as 
> possible without Zookeeper. It also enables full integration with JanusGraph, 
> as it supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is to 
> decouple hbase and solr while running embedded mode, so that solr can be run 
> in embedded mode with external hbase.
> *Proposed solution*
>  * call the solr V1 API while creating/updating request handlers in standalone 
> solr
>  * update the atlas start script to enable standalone embedded solr
>  





[jira] [Commented] (ATLAS-3654) Support solr in standalone (http) mode

2020-04-29 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095892#comment-17095892
 ] 

Damian Warszawski commented on ATLAS-3654:
--

It is controlled by the following application property, 
`atlas.graph.index.search.solr.mode`, which is also used by JanusGraph. 

The package is built with the `embedded-hbase-solr` profile, as it used to be 
for `cloud` mode, for compatibility reasons.

Perhaps it would be useful to create another profile for `embedded-solr` only. 

 

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  It is especially useful for testing purposes to make setup as simple as 
> possible without Zookeeper. It also enables full integration with JanusGraph, 
> as it supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is to 
> decouple hbase and solr while running embedded mode, so that solr can be run 
> in embedded mode with external hbase.
> *Proposed solution*
>  * call the solr V1 API while creating/updating request handlers in standalone 
> solr
>  * update the atlas start script to enable standalone embedded solr
>  





[jira] [Comment Edited] (ATLAS-3654) Support solr in standalone (http) mode

2020-04-29 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095892#comment-17095892
 ] 

Damian Warszawski edited comment on ATLAS-3654 at 4/29/20, 9:09 PM:


[~nixon],

It is controlled by the following application property, 
`atlas.graph.index.search.solr.mode`, which is also used by JanusGraph. 

The package is built with the `embedded-hbase-solr` profile, as it used to be 
for `cloud` mode, for compatibility reasons.

Perhaps it would be useful to create another profile for `embedded-solr` only. 

 


was (Author: dwarszawski):
It is controlled by the following application property, 
`atlas.graph.index.search.solr.mode`, which is also used by JanusGraph. 

The package is built with the `embedded-hbase-solr` profile, as it used to be 
for `cloud` mode, for compatibility reasons.

Perhaps it would be useful to create another profile for `embedded-solr` only. 

 

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
>  It is especially useful for testing purposes to make setup as simple as 
> possible without Zookeeper. It also enables full integration with JanusGraph, 
> as it supports both modes of running Solr, `cloud` and `http` 
> [https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is to 
> decouple hbase and solr while running embedded mode, so that solr can be run 
> in embedded mode with external hbase.
> *Proposed solution*
>  * call the solr V1 API while creating/updating request handlers in standalone 
> solr
>  * update the atlas start script to enable standalone embedded solr
>  





[jira] [Commented] (ATLAS-3760) Optimize FreeTextSearchProcessor to apply exclude deleted entity filter on solr side.

2020-04-29 Thread Damian Warszawski (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095884#comment-17095884
 ] 

Damian Warszawski commented on ATLAS-3760:
--

[~madhan] thanks for getting this done so quickly.

> Optimize FreeTextSearchProcessor to apply exclude deleted entity  filter on 
> solr side.
> --
>
> Key: ATLAS-3760
> URL: https://issues.apache.org/jira/browse/ATLAS-3760
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>    Reporter: Damian Warszawski
>Priority: Minor
> Fix For: 2.1.0, 3.0.0
>
>
> *Problem description*
> Current implementation of FreeTextSearchProcessor applies filtering in memory 
> to exclude deleted entities.
> This introduces significant performance overhead by generating redundant 
> calls to solr index. 
> *Goals*
> Improve performance of FreeTextSearchProcessor by applying filter in solr 
> query.
> *Proposed solution*
>  * replace in-memory filtering with filter in solr query.





Re: Review Request 72441: Support solr in standalone (http) mode

2020-04-29 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72441/
---

(Updated April 29, 2020, 10:32 a.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

Added reference to jira.


Repository: atlas


Description (updated)
---

Atlas does not support running Solr in standalone (http) mode.

It is especially useful for testing purposes to make setup as simple as 
possible without Zookeeper. It also enables full integration with JanusGraph, as 
it supports both modes of running Solr, `cloud` and `http` 
https://docs.janusgraph.org/index-backend/solr/. An additional benefit is to 
decouple hbase and solr while running embedded mode, so that solr can be run in 
embedded mode with external hbase.

Proposed solution

 * call the solr V1 API while creating/updating request handlers in standalone solr
 * update the atlas start script to enable standalone embedded solr

Reference to jira https://issues.apache.org/jira/browse/ATLAS-3654
Patch was applied against master branch
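
For readers unfamiliar with the request-handler step, the sketch below shows what a Solr V1 Config API call could look like. It only builds the URL and JSON body and sends nothing. The `/solr/<core>/config` endpoint and the `add-requesthandler` / `update-requesthandler` commands are part of Solr's Config API; the host, the core name `vertex_index`, the handler name `/freetext`, and the defaults are placeholders chosen for illustration and are not taken from the patch.

```python
import json


def config_api_url(base_url, core):
    """V1 Config API endpoint for a core (standalone) or a collection."""
    return f"{base_url.rstrip('/')}/solr/{core}/config"


def request_handler_command(name, defaults, exists=False):
    """Build an add- or update-requesthandler command body."""
    command = "update-requesthandler" if exists else "add-requesthandler"
    return {
        command: {
            "name": name,
            "class": "solr.SearchHandler",
            "defaults": defaults,
        }
    }


# Placeholder host, core, handler name, and defaults (assumptions).
url = config_api_url("http://localhost:8983/", "vertex_index")
body = request_handler_command("/freetext", {"defType": "edismax", "rows": 100})
print(url)
print(json.dumps(body, indent=2))
```

The same JSON body would be sent with an HTTP POST to `url`; choosing between the add and update command depends on whether the handler already exists.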


Diffs
-

  distro/src/bin/atlas_config.py f09026ff9 
  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphIndexClient.java
 ba65f3d00 
  graphdb/janus/src/main/java/org/janusgraph/diskstorage/solr/Solr6Index.java 
484c161f0 


Diff: https://reviews.apache.org/r/72441/diff/1/


Testing
---

Patch was applied and verified on our dev env with embedded solr and external 
hbase.


Thanks,

Damian Warszawski



Re: Review Request 72440: Support sort params for FreeTextSearchProcessor

2020-04-29 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72440/
---

(Updated April 29, 2020, 10:26 a.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, Madhan Neethiraj, 
and Sarath Subramanian.


Changes
---

added reference to jira


Repository: atlas


Description (updated)
---

There is no way to sort results by a specified attribute when free-text search 
is enabled. In our case we would like to enforce ordering by introducing a 
custom attribute definition, e.g. popularity score from 
https://github.com/dwarszawski/amundsen-atlas-types/blob/master/amundsenatlastypes/schema/01_2_table_schema.json


Reference to jira https://issues.apache.org/jira/browse/ATLAS-3758
Patch applied against master branch.
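
A small sketch of a search request carrying sort parameters. The field names `sortBy`, `sortOrder`, `limit`, `offset` and `excludeDeletedEntities` follow the request shown in ATLAS-3776; the `query` field, the helper name, and the sample values are assumptions for illustration, and nothing is sent to a server here.

```python
import json

VALID_SORT_ORDERS = ("ASCENDING", "DESCENDING")


def free_text_search_request(query, sort_by=None, sort_order="ASCENDING",
                             limit=25, offset=0):
    """Build a search request body with optional sort parameters."""
    if sort_order not in VALID_SORT_ORDERS:
        raise ValueError(f"sortOrder must be one of {VALID_SORT_ORDERS}")
    body = {
        "query": query,
        "excludeDeletedEntities": True,
        "limit": limit,
        "offset": offset,
    }
    # Only include sort parameters when a sort attribute was requested.
    if sort_by:
        body["sortBy"] = sort_by
        body["sortOrder"] = sort_order
    return body


request = free_text_search_request("orders", sort_by="popularityScore",
                                    sort_order="DESCENDING")
print(json.dumps(request, indent=2))
```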


Diffs
-

  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
1a7bf6b16 
  
repository/src/main/java/org/apache/atlas/discovery/FreeTextSearchProcessor.java
 92b5eb4d2 
  repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java 
11eb7ca49 


Diff: https://reviews.apache.org/r/72440/diff/1/


Testing
---

Patch was applied on our dev env with custom entity definitions, and we 
verified that the order is applied as specified in the search query.


Thanks,

Damian Warszawski



Re: Review Request 72446: Optimize FreeTextSearchProcessor to apply exclude deleted entity filter on solr side.

2020-04-29 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72446/
---

(Updated April 29, 2020, 10:14 a.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Changes
---

reference to jira in description


Repository: atlas


Description
---

Current implementation of FreeTextSearchProcessor applies filtering in memory 
to exclude deleted entities. This introduces significant performance overhead 
by generating redundant calls to solr index. The goal is to improve performance 
of FreeTextSearchProcessor by applying filter in solr query.


Reference to jira https://issues.apache.org/jira/browse/ATLAS-3760
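
To illustrate why in-memory filtering causes extra index calls, here is a toy sketch. `FakeIndex` is a stand-in that only counts page requests; it is not Solr and not the FreeTextSearchProcessor code, and the `state` field, page size, and data are made up for the example.

```python
# Toy data: every third document is ACTIVE, the rest are DELETED.
DOCS = [{"id": i, "state": "DELETED" if i % 3 else "ACTIVE"} for i in range(30)]


class FakeIndex:
    """Counts how many page requests a search strategy issues."""

    def __init__(self, docs):
        self.docs = docs
        self.calls = 0

    def query(self, offset, limit, only_active=False):
        self.calls += 1
        docs = self.docs
        if only_active:
            docs = [d for d in docs if d["state"] == "ACTIVE"]
        return docs[offset:offset + limit]


def search_filter_in_memory(index, limit, page_size=5):
    """Fetch unfiltered pages and drop deleted entities on the client side."""
    results, offset = [], 0
    while len(results) < limit:
        page = index.query(offset, page_size)
        if not page:
            break
        results.extend(d for d in page if d["state"] == "ACTIVE")
        offset += page_size
    return results[:limit]


def search_filter_in_index(index, limit):
    """Let the index apply the state filter, so a single request suffices."""
    return index.query(0, limit, only_active=True)


in_memory, pushed_down = FakeIndex(DOCS), FakeIndex(DOCS)
a = search_filter_in_memory(in_memory, limit=5)
b = search_filter_in_index(pushed_down, limit=5)
print("in-memory calls:", in_memory.calls, "filter-in-query calls:", pushed_down.calls)
```

Both strategies return the same entities; the difference is the number of round trips, which grows with the share of deleted entities when filtering happens after the fetch.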


Diffs
-

  
repository/src/main/java/org/apache/atlas/discovery/FreeTextSearchProcessor.java
 92b5eb4d2 


Diff: https://reviews.apache.org/r/72446/diff/1/


Testing (updated)
---

Verified on our dev env and achieved 10x faster response for simple call to 
atlas basic search with over 50k entities.

Patch applied on top of master branch.


Thanks,

Damian Warszawski



Re: Review Request 72446: Optimize FreeTextSearchProcessor to apply exclude deleted entity filter on solr side.

2020-04-29 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72446/
---

(Updated April 29, 2020, 10:12 a.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Repository: atlas


Description (updated)
---

Current implementation of FreeTextSearchProcessor applies filtering in memory 
to exclude deleted entities. This introduces significant performance overhead 
by generating redundant calls to solr index. The goal is to improve performance 
of FreeTextSearchProcessor by applying filter in solr query.


Reference to jira https://issues.apache.org/jira/browse/ATLAS-3760


Diffs
-

  
repository/src/main/java/org/apache/atlas/discovery/FreeTextSearchProcessor.java
 92b5eb4d2 


Diff: https://reviews.apache.org/r/72446/diff/1/


Testing
---

Verified on our dev env and achieved 10x faster response for simple call to 
atlas basic search with over 50k entities.


Thanks,

Damian Warszawski



Review Request 72446: Optimize FreeTextSearchProcessor to apply exclude deleted entity filter on solr side.

2020-04-28 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72446/
---

Review request for atlas and Bolke de Bruin.


Repository: atlas


Description
---

Current implementation of FreeTextSearchProcessor applies filtering in memory 
to exclude deleted entities. This introduces significant performance overhead 
by generating redundant calls to solr index. The goal is to improve performance 
of FreeTextSearchProcessor by applying filter in solr query.


Diffs
-

  
repository/src/main/java/org/apache/atlas/discovery/FreeTextSearchProcessor.java
 92b5eb4d2 


Diff: https://reviews.apache.org/r/72446/diff/1/


Testing
---

Verified on our dev env and achieved 10x faster response for simple call to 
atlas basic search with over 50k entities.


Thanks,

Damian Warszawski



Re: Review Request 72440: Support sort params for FreeTextSearchProcessor

2020-04-28 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72440/
---

(Updated April 28, 2020, 8:59 p.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, Madhan Neethiraj, 
and Sarath Subramanian.


Repository: atlas


Description
---

There is no way to sort results by a specified attribute when free-text search 
is enabled. In our case we would like to enforce ordering by introducing a 
custom attribute definition, e.g. popularity score from 
https://github.com/dwarszawski/amundsen-atlas-types/blob/master/amundsenatlastypes/schema/01_2_table_schema.json


Diffs
-

  
repository/src/main/java/org/apache/atlas/discovery/EntitySearchProcessor.java 
1a7bf6b16 
  
repository/src/main/java/org/apache/atlas/discovery/FreeTextSearchProcessor.java
 92b5eb4d2 
  repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java 
11eb7ca49 


Diff: https://reviews.apache.org/r/72440/diff/1/


Testing
---

Patch was applied on our dev env with custom entity definitions, and we 
verified that the order is applied as specified in the search query.


Thanks,

Damian Warszawski



Re: Review Request 72441: Support solr in standalone (http) mode

2020-04-28 Thread Damian Warszawski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72441/
---

(Updated April 28, 2020, 8:58 p.m.)


Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
Subramanian.


Repository: atlas


Description
---

Atlas does not support running Solr in standalone (http) mode.

It is especially useful for testing purposes to make setup as simple as 
possible without Zookeeper. It also enables full integration with JanusGraph, as 
it supports both modes of running Solr, `cloud` and `http` 
https://docs.janusgraph.org/index-backend/solr/. An additional benefit is to 
decouple hbase and solr while running embedded mode, so that solr can be run in 
embedded mode with external hbase.

Proposed solution

 * call the solr V1 API while creating/updating request handlers in standalone solr
 * update the atlas start script to enable standalone embedded solr


Diffs
-

  distro/src/bin/atlas_config.py f09026ff9 
  
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphIndexClient.java
 ba65f3d00 
  graphdb/janus/src/main/java/org/janusgraph/diskstorage/solr/Solr6Index.java 
484c161f0 


Diff: https://reviews.apache.org/r/72441/diff/1/


Testing
---

Patch was applied and verified on our dev env with embedded solr and external 
hbase.


Thanks,

Damian Warszawski



[jira] [Updated] (ATLAS-3760) Optimize FreeTextSearchProcessor to apply exclude deleted entity filter on solr side.

2020-04-28 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3760:
-
Description: 
*Problem description*

Current implementation of FreeTextSearchProcessor applies filtering in memory 
to exclude deleted entities.

This introduces significant performance overhead by generating redundant calls 
to solr index. 

*Goals*

Improve performance of FreeTextSearchProcessor by applying filter in solr query.

*Proposed solution*
 * replace in-memory filtering with filter in solr query.

> Optimize FreeTextSearchProcessor to apply exclude deleted entity  filter on 
> solr side.
> --
>
> Key: ATLAS-3760
> URL: https://issues.apache.org/jira/browse/ATLAS-3760
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>    Reporter: Damian Warszawski
>Priority: Minor
> Fix For: 3.0.0
>
>
> *Problem description*
> The current implementation of FreeTextSearchProcessor excludes deleted 
> entities by filtering results in memory.
> This adds significant performance overhead, because it generates redundant 
> calls to the Solr index. 
> *Goals*
> Improve the performance of FreeTextSearchProcessor by applying the filter in 
> the Solr query.
> *Proposed solution*
>  * replace in-memory filtering with a filter in the Solr query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-3760) Optimize FreeTextSearchProcessor to apply exclude deleted entity filter on solr side.

2020-04-28 Thread Damian Warszawski (Jira)
Damian Warszawski created ATLAS-3760:


 Summary: Optimize FreeTextSearchProcessor to apply exclude deleted 
entity  filter on solr side.
 Key: ATLAS-3760
 URL: https://issues.apache.org/jira/browse/ATLAS-3760
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Reporter: Damian Warszawski
 Fix For: 3.0.0








[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-04-27 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: ATLAS-3654.patch

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
> Standalone mode is especially useful for testing, where setup should be as 
> simple as possible and ZooKeeper can be avoided. It also enables full 
> integration with JanusGraph, which supports both Solr modes, `cloud` and 
> `http` [https://docs.janusgraph.org/index-backend/solr/]. An additional 
> benefit is decoupling HBase and Solr in embedded mode, so that Solr can run 
> embedded alongside an external HBase.
> *Proposed solution*
>  * call solr V1 API  while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-04-27 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: (was: ATLAS-3654.patch)

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
> Standalone mode is especially useful for testing, where setup should be as 
> simple as possible and ZooKeeper can be avoided. It also enables full 
> integration with JanusGraph, which supports both Solr modes, `cloud` and 
> `http` [https://docs.janusgraph.org/index-backend/solr/]. An additional 
> benefit is decoupling HBase and Solr in embedded mode, so that Solr can run 
> embedded alongside an external HBase.
> *Proposed solution*
>  * call solr V1 API  while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





[jira] [Updated] (ATLAS-3758) Support sort params for FreeTextSearchProcessor

2020-04-27 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3758:
-
Description: 
*Problem description*

There is no way to sort results by a specified attribute when free-text search 
is enabled.

*Goals*

Our team is working on using Atlas as the metadata store for 
[https://github.com/lyft/amundsen]. We need to sort results by any given 
attribute, e.g. a custom attribute that represents a popularity score, to 
provide basic search relevancy for end users.

*Proposed solution*
 * add the required parameters to the indexed query if they are specified

  was:
*Problem description*

There is no way to sort results by a specified attribute when free-text search 
is enabled.

*Goals*

Our team is working on using Atlas as the metadata store for 
[https://github.com/lyft/amundsen]. We need to sort results by a particular 
attribute, e.g. popularityScore, to provide basic search relevancy for end 
users.

*Proposed solution*
 * add the required parameters to the indexed query if they are specified


> Support sort params for FreeTextSearchProcessor
> ---
>
> Key: ATLAS-3758
> URL: https://issues.apache.org/jira/browse/ATLAS-3758
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3758.patch
>
>
> *Problem description*
> There is no way to sort results by a specified attribute when free-text 
> search is enabled.
> *Goals*
> Our team is working on using Atlas as the metadata store for 
> [https://github.com/lyft/amundsen]. We need to sort results by any given 
> attribute, e.g. a custom attribute that represents a popularity score, to 
> provide basic search relevancy for end users.
> *Proposed solution*
>  * add the required parameters to the indexed query if they are specified





[jira] [Updated] (ATLAS-3758) Support sort params for FreeTextSearchProcessor

2020-04-27 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3758:
-
Attachment: ATLAS-3758.patch

> Support sort params for FreeTextSearchProcessor
> ---
>
> Key: ATLAS-3758
> URL: https://issues.apache.org/jira/browse/ATLAS-3758
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3758.patch
>
>
> *Problem description*
> There is no way to sort results by a specified attribute when free-text 
> search is enabled.
> *Goals*
> Our team is working on using Atlas as the metadata store for 
> [https://github.com/lyft/amundsen]. We need to sort results by a particular 
> attribute, e.g. popularityScore, to provide basic search relevancy for end 
> users.
> *Proposed solution*
>  * add the required parameters to the indexed query if they are specified





[jira] [Created] (ATLAS-3758) Support sort params for FreeTextSearchProcessor

2020-04-27 Thread Damian Warszawski (Jira)
Damian Warszawski created ATLAS-3758:


 Summary: Support sort params for FreeTextSearchProcessor
 Key: ATLAS-3758
 URL: https://issues.apache.org/jira/browse/ATLAS-3758
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Affects Versions: 3.0.0
Reporter: Damian Warszawski


*Problem description*

There is no way to sort results by a specified attribute when free-text search 
is enabled.

*Goals*

Our team is working on using Atlas as the metadata store for 
[https://github.com/lyft/amundsen]. We need to sort results by a particular 
attribute, e.g. popularityScore, to provide basic search relevancy for end 
users.

*Proposed solution*
 * add the required parameters to the indexed query if they are specified
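
A minimal sketch of the idea, assuming the free-text query ends up as a set 
of Solr request parameters. The function and argument names are hypothetical 
and do not come from the patch; q, start, rows and sort are standard Solr 
query parameters. The sort parameter is added only when the caller specifies 
an attribute.

```python
from urllib.parse import urlencode


def build_freetext_params(query, sort_by=None, sort_order="asc", offset=0, limit=25):
    """Build Solr request parameters, adding `sort` only when requested.

    Hypothetical helper for illustration; attribute names are not validated.
    """
    if sort_order not in ("asc", "desc"):
        raise ValueError("sort_order must be 'asc' or 'desc'")
    params = {"q": query, "start": offset, "rows": limit}
    if sort_by:
        # Solr expects "<field> <direction>", e.g. "popularityScore desc".
        params["sort"] = f"{sort_by} {sort_order}"
    return params


# Without a sort attribute the query is unchanged and ranked by score.
print(urlencode(build_freetext_params("sales")))
# With one, results are ordered by that field instead.
print(urlencode(build_freetext_params("sales", "popularityScore", "desc")))
```

In a real index the attribute would have to be mapped to its indexed field 
name before being used in the sort clause.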





Enable support for solr in http mode.

2020-03-21 Thread Damian Warszawski
Hi everyone,

For testing purposes in our Atlas dev deployments at ING WBAA, we needed to
run embedded Solr with an external HBase. We enabled this by changing the
start scripts and the way the Solr client is configured, using Solr in http
mode to avoid setting up ZooKeeper. An additional benefit is full integration
with the JanusGraph configuration.

Here is the proposed solution to enable support for Solr in http (standalone)
mode:

https://issues.apache.org/jira/browse/ATLAS-3654
https://github.com/apache/atlas/pull/90

As this is my first PR to Atlas, please let me know if I missed anything here.

Damian


[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
External issue URL: https://github.com/apache/atlas/pull/90

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
> Standalone mode is especially useful for testing, where setup should be as 
> simple as possible and ZooKeeper can be avoided. It also enables full 
> integration with JanusGraph, which supports both Solr modes, `cloud` and 
> `http` [https://docs.janusgraph.org/index-backend/solr/]. An additional 
> benefit is decoupling HBase and Solr in embedded mode, so that Solr can run 
> embedded alongside an external HBase.
> *Proposed solution*
>  * call solr V1 API  while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: ATLAS-3654.patch

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
> Standalone mode is especially useful for testing, where setup should be as 
> simple as possible and ZooKeeper can be avoided. It also enables full 
> integration with JanusGraph, which supports both Solr modes, `cloud` and 
> `http` [https://docs.janusgraph.org/index-backend/solr/]. An additional 
> benefit is decoupling HBase and Solr in embedded mode, so that Solr can run 
> embedded alongside an external HBase.
> *Proposed solution*
>  * call solr V1 API  while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: (was: ATLAS-3654.patch)

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
> Standalone mode is especially useful for testing, where setup should be as 
> simple as possible and ZooKeeper can be avoided. It also enables full 
> integration with JanusGraph, which supports both Solr modes, `cloud` and 
> `http` [https://docs.janusgraph.org/index-backend/solr/]. An additional 
> benefit is decoupling HBase and Solr in embedded mode, so that Solr can run 
> embedded alongside an external HBase.
> *Proposed solution*
>  * call solr V1 API  while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





[jira] [Updated] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Warszawski updated ATLAS-3654:
-
Attachment: ATLAS-3654.patch

> Support solr in standalone (http) mode
> --
>
> Key: ATLAS-3654
> URL: https://issues.apache.org/jira/browse/ATLAS-3654
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Affects Versions: 3.0.0
>    Reporter: Damian Warszawski
>Priority: Minor
> Attachments: ATLAS-3654.patch
>
>
> *Problem description*
> Atlas does not support running Solr in standalone (http) mode.
> *Goals*
> Standalone mode is especially useful for testing, where setup should be as 
> simple as possible and ZooKeeper can be avoided. It also enables full 
> integration with JanusGraph, which supports both Solr modes, `cloud` and 
> `http` [https://docs.janusgraph.org/index-backend/solr/]. An additional 
> benefit is decoupling HBase and Solr in embedded mode, so that Solr can run 
> embedded alongside an external HBase.
> *Proposed solution*
>  * call solr V1 API  while creating/updating request handlers in standalone 
> solr
>  * update atlas start script to enable standalone embedded solr
>  





[jira] [Created] (ATLAS-3654) Support solr in standalone (http) mode

2020-03-05 Thread Damian Warszawski (Jira)
Damian Warszawski created ATLAS-3654:


 Summary: Support solr in standalone (http) mode
 Key: ATLAS-3654
 URL: https://issues.apache.org/jira/browse/ATLAS-3654
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Affects Versions: 3.0.0
Reporter: Damian Warszawski


*Problem description*

Atlas does not support running Solr in standalone (http) mode.

*Goals*

Standalone mode is especially useful for testing, where setup should be as 
simple as possible and ZooKeeper can be avoided. It also enables full 
integration with JanusGraph, which supports both Solr modes, `cloud` and `http` 
[https://docs.janusgraph.org/index-backend/solr/]. An additional benefit is 
decoupling HBase and Solr in embedded mode, so that Solr can run embedded 
alongside an external HBase.

*Proposed solution*
 * call solr V1 API  while creating/updating request handlers in standalone solr
 * update atlas start script to enable standalone embedded solr
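
For orientation, the sketch below prints the index settings that select 
Solr's http mode in a JanusGraph-style configuration. The option names 
(index.search.backend, index.search.solr.mode, index.search.solr.http-urls) 
follow the JanusGraph documentation linked above; the "atlas.graph." prefix, 
the helper name and the URL are assumptions for the example and should be 
checked against the target Atlas version.

```python
def solr_http_properties(solr_urls, prefix="atlas.graph."):
    """Return index properties that point the graph index at Solr in http mode.

    Hypothetical helper for illustration; the prefix is an assumption.
    """
    if not solr_urls:
        raise ValueError("at least one Solr URL is required")
    return {
        f"{prefix}index.search.backend": "solr",
        # "http" talks to Solr directly; "cloud" would require a ZooKeeper URL.
        f"{prefix}index.search.solr.mode": "http",
        f"{prefix}index.search.solr.http-urls": ",".join(solr_urls),
    }


# Example URL only; adjust host and port to the actual Solr instance.
for key, value in solr_http_properties(["http://localhost:8983/solr"]).items():
    print(f"{key}={value}")
```

Because no ZooKeeper URL is involved, a single embedded or standalone Solr 
process is enough for a test setup.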

 


