[ https://issues.apache.org/jira/browse/ATLAS-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016897#comment-16016897 ]
Hemanth Yamijala commented on ATLAS-1818:
-----------------------------------------

[~ashutoshm], thanks for taking up this really useful improvement! One question regarding your last point: is the intention to provide faceted search in searchUsingBasicQuery that might follow a Lucene syntax or some such?

> Performance of Basic Search that Uses indexQuery Takes Long Time to Fetch Results
> ---------------------------------------------------------------------------------
>
>                 Key: ATLAS-1818
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1818
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core, atlas-webui
>    Affects Versions: trunk, 0.8-incubating
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>             Fix For: trunk, 0.8-incubating
>
>         Attachments: ATLAS-1818.patch
>
>   Original Estimate: 120h
>          Time Spent: 96h
>  Remaining Estimate: 24h
>
> h3. Background
> An environment is set up with 100K hive_tables, each with 84 columns.
> A basic search with the query parameter specified is executed. Results take
> 75 secs to appear.
> h3. Analysis & Findings
> A similar test with a smaller data set (200 hive_tables, each with 81
> columns) also resulted in less than ideal performance.
> Atlas Basic Search API uses _graph.indexQuery_ for performing the search,
> which in turn uses _Solr_.
> There are 2 aspects that affect performance:
> * Solr's default maximum result set size, when no limit is specified, is
> 100K. In the test scenario, this returns the entire result set.
> * Once the result set is returned,
> _EntityDiscoveryService.searchUsingBasicQuery_ does a sequential scan to
> pick out the pertinent data. The cost of this operation is proportional to
> the size of the result set.
> h3. Solution
> The following changes will improve performance:
> * Solr's max result set size is governed by the
> _atlas.graph.index.search.max-result-set-size_ property. It makes sense to
> set this to a lower number.
> * Modify Solr's configuration _solrconfig.xml_ to use _FastLRUCache_.
> * Modify _EntityDiscoveryService.searchUsingBasicQuery_ to form a query that
> takes additional parameters.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
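
For readers following the quoted analysis, here is a minimal sketch of why the sequential scan dominates, and of what moving the filter and the limit into the query buys. It is a toy model under stated assumptions, not the Atlas or Solr API: `Hit`, `build_hits`, `index_query`, `search_filter_after` and `search_filter_in_query` are hypothetical names, and the data set mirrors the smaller test in the description (200 tables with 81 columns each).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Hit:
    """A hypothetical index hit: an entity name plus its type name."""
    name: str
    type_name: str


def build_hits(tables: int, columns: int) -> List[Hit]:
    """Build a toy index: each table hit is followed by its column hits."""
    hits = []
    for t in range(tables):
        hits.append(Hit(f"sales_{t}", "hive_table"))
        for c in range(columns):
            hits.append(Hit(f"sales_{t}.col_{c}", "hive_column"))
    return hits


def index_query(hits: List[Hit], text: str, max_results: int,
                type_name: Optional[str] = None) -> List[Hit]:
    """Stand-in for an index lookup (assumption: NOT the real Atlas/Solr call).

    Returns at most max_results hits whose name contains text. When
    type_name is given, the type filter is applied inside the query.
    """
    out = []
    for h in hits:
        if len(out) >= max_results:
            break
        if text in h.name and (type_name is None or h.type_name == type_name):
            out.append(h)
    return out


def search_filter_after(hits: List[Hit], text: str, type_name: str,
                        limit: int, max_results: int) -> Tuple[List[Hit], int]:
    """Fetch up to max_results hits, then scan them sequentially."""
    fetched = index_query(hits, text, max_results)
    matched = [h for h in fetched if h.type_name == type_name][:limit]
    return matched, len(fetched)  # second value: hits fetched and scanned


def search_filter_in_query(hits: List[Hit], text: str, type_name: str,
                           limit: int) -> Tuple[List[Hit], int]:
    """Pass the type filter and the page size to the index query itself."""
    fetched = index_query(hits, text, limit, type_name)
    return fetched, len(fetched)


if __name__ == "__main__":
    hits = build_hits(200, 81)
    _, scanned = search_filter_after(hits, "sales", "hive_table", 25, 100_000)
    print("filter after fetch, hits scanned:", scanned)   # 16400
    _, scanned = search_filter_in_query(hits, "sales", "hive_table", 25)
    print("filter in query, hits scanned:", scanned)      # 25
```

In this toy model, lowering the cap alone is not a complete fix: with a cap of 100 the filter-after variant scans only 100 hits but finds just 2 of the 25 requested tables, because most of the capped result set is columns. That is one reason the third change, sending the extra parameters with the query, matters alongside the lower cap.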