[jira] [Commented] (DRILL-5089) Skip initializing all enabled storage plugins for every query

ASF GitHub Bot (JIRA) Sat, 25 Mar 2017 22:47:42 -0700

    [ https://issues.apache.org/jira/browse/DRILL-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942158#comment-15942158 ]


ASF GitHub Bot commented on DRILL-5089:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/795#discussion_r108051857
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/SchemaTreeProvider.java ---
    @@ -119,6 +127,74 @@ public SchemaPlus createRootSchema(SchemaConfig schemaConfig) {
         }
       }
     
    +
    +  public SchemaPlus createPartialRootSchema(final String userName, final SchemaConfigInfoProvider provider,
    +                                            final String storage) {
    +    final String schemaUser = isImpersonationEnabled ? userName : ImpersonationUtil.getProcessUserName();
    +    final SchemaConfig schemaConfig = SchemaConfig.newBuilder(schemaUser, provider).build();
    +    final SchemaPlus rootSchema = SimpleCalciteSchema.createRootSchema(false);
    +    Set<String> storageSet = Sets.newHashSet();
    +    storageSet.add(storage);
    +    addNewStoragesToRootSchema(schemaConfig, rootSchema, storageSet);
    +    schemaTreesToClose.add(rootSchema);
    +    return rootSchema;
    +  }
    +
    +  public SchemaPlus addPartialRootSchema(final String userName, final SchemaConfigInfoProvider provider,
    +                                            Set<String> storages, SchemaPlus rootSchema) {
    +    final String schemaUser = isImpersonationEnabled ? userName : ImpersonationUtil.getProcessUserName();
    +    final SchemaConfig schemaConfig = SchemaConfig.newBuilder(schemaUser, provider).build();
    +    addNewStoragesToRootSchema(schemaConfig, rootSchema, storages);
    +    schemaTreesToClose.add(rootSchema);
    +    return rootSchema;
    +  }
    +
    +  private void expandSecondLevelSchema(SchemaPlus parent) {
    --- End diff --
    
    Maybe explain this a bit? Why are we expanding second-level schemas for 
*all* top-level schemas? Can't we do the expansion on the fly as we resolve? 
That is, if a query has a path "a.b.c.d", can't we just resolve a, then within 
a, resolve b, and so on until we get to d? Else, we are still open to a 
performance hit if, say, a is a directory of a million files, or a database 
with 10K tables.
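
    The on-the-fly resolution suggested above could look roughly like the
    sketch below. It is an illustration only: LazySchemaNode, childLoader and
    resolve are hypothetical names, not part of Drill's SchemaTreeProvider or
    Calcite's SchemaPlus. Each node asks its loader only for the one child
    named in the path, so siblings (the other million files or 10K tables) are
    never touched.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical schema node whose children are loaded only when first requested. */
final class LazySchemaNode {
  private final String name;
  // Hypothetical loader: given a child name, returns that child, or null if it does not exist.
  private final Function<String, LazySchemaNode> childLoader;
  // Only children that were actually requested end up in this cache.
  private final Map<String, LazySchemaNode> loaded = new HashMap<>();

  LazySchemaNode(String name, Function<String, LazySchemaNode> childLoader) {
    this.name = name;
    this.childLoader = childLoader;
  }

  String name() {
    return name;
  }

  /** Returns the named child, loading it on first access; null if it does not exist. */
  LazySchemaNode child(String childName) {
    LazySchemaNode cached = loaded.get(childName);
    if (cached != null) {
      return cached;
    }
    LazySchemaNode node = childLoader.apply(childName);
    if (node != null) {
      loaded.put(childName, node);
    }
    return node;
  }

  /** Number of children loaded so far (useful for checking that nothing was expanded eagerly). */
  int loadedCount() {
    return loaded.size();
  }

  /** Resolves a path such as ["a", "b", "c", "d"] one segment at a time; null if any segment is missing. */
  static LazySchemaNode resolve(LazySchemaNode root, List<String> path) {
    LazySchemaNode current = root;
    for (String segment : path) {
      current = current.child(segment);
      if (current == null) {
        return null;
      }
    }
    return current;
  }
}
```

    With this shape, resolving "a.b" costs one lookup at the root and one
    inside a, regardless of how many other schemas each level contains.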


> Skip initializing all enabled storage plugins for every query
> -------------------------------------------------------------
>
>                 Key: DRILL-5089
>                 URL: https://issues.apache.org/jira/browse/DRILL-5089
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>    Affects Versions: 1.9.0
>            Reporter: Abhishek Girish
>            Assignee: Chunhui Shi
>            Priority: Critical
>
> In a query's lifecycle, an attempt is made to initialize each enabled storage 
> plugin while building the schema tree. This is done regardless of the actual 
> plugins involved in the query. 
> Sometimes, when one or more of the enabled storage plugins have issues, 
> either due to misconfiguration or because the underlying data source is slow 
> or down, the overall query time increases drastically, most likely due to 
> the attempt to register schemas from a faulty plugin.
> For example, when a jdbc plugin is configured with SQL Server and the 
> underlying SQL Server db goes down, any Drill query that starts executing 
> from that point on slows down drastically. 
> We must skip registering unrelated schemas (& workspaces) for a query. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
