[jira] [Comment Edited] (TAJO-1383) Improve broadcast table cache

Jinho Kim (JIRA) Fri, 13 Mar 2015 00:33:18 -0700

    [ 
https://issues.apache.org/jira/browse/TAJO-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360042#comment-14360042
 ]


Jinho Kim edited comment on TAJO-1383 at 3/13/15 7:32 AM:
----------------------------------------------------------

Github user jinossy commented on the pull request:

    https://github.com/apache/tajo/pull/404#issuecomment-78846800
  
    I ran a simple benchmark.
    
    * Cluster: 1 master + 4 workers
    * TPC-H 100 GB, part of Q16
    
    {code:sql}
    select 
      p_brand, p_type, p_size, ps_suppkey
    from 
      partsupp ps join part p 
      on 
        p.p_partkey = ps.ps_partkey and p.p_brand <> 'Brand#45' 
        and not p.p_type like 'MEDIUM POLISHED%'
      join supplier_tmp s 
      on 
        ps.ps_suppkey = s.s_suppkey;
    {code}
    
    * This benchmark measures only the join (the broadcast table is "supplier_tmp")
    
    ||Broadcast||Execution time||
    |false | 45 sec|
    |true | 33 sec|
    |improved | 22 sec|



was (Author: githubbot):
Github user jinossy commented on the pull request:

    https://github.com/apache/tajo/pull/404#issuecomment-78846800
  
    I ran a simple benchmark.
    
    * Cluster: 1 master + 4 workers
    * TPC-H 100 GB, part of Q16
    
    ```
    select 
      p_brand, p_type, p_size, ps_suppkey
    from 
      partsupp ps join part p 
      on 
        p.p_partkey = ps.ps_partkey and p.p_brand <> 'Brand#45' 
        and not p.p_type like 'MEDIUM POLISHED%'
      join supplier_tmp s 
      on 
        ps.ps_suppkey = s.s_suppkey;
        
    ```
    
    * This benchmark measures only the join (the broadcast table is "supplier_tmp")
    
    Broadcast | execution time
    ------------ | -------------
    false | 45 sec
    true | 33 sec
    improved | 22 sec


> Improve broadcast table cache
> -----------------------------
>
>                 Key: TAJO-1383
>                 URL: https://issues.apache.org/jira/browse/TAJO-1383
>             Project: Tajo
>          Issue Type: Improvement
>          Components: physical operator
>    Affects Versions: 0.8.0, 0.9.0, 0.10.0
>            Reporter: Jinho Kim
>            Assignee: Jinho Kim
>              Labels: performance
>         Attachments: TAJO-1383.patch
>
>
> Currently, the broadcast implementation keeps tuples on the scan operator, and it 
> creates a duplicated table cache in memory.
> We should improve it.
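
To illustrate the duplication described above, here is a minimal sketch of one way a worker could hold a single in-memory copy of each broadcast table and hand it to every scan that needs it. This is an assumption for illustration only: the class `SharedBroadcastCache`, its methods, and the `Object[]` tuple representation are hypothetical, and none of it is taken from Tajo's code or from TAJO-1383.patch.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch; not Tajo's actual classes or API.
class SharedBroadcastCache {
    // One immutable tuple list per broadcast table, shared by all scans in the worker.
    private final Map<String, List<Object[]>> tables = new ConcurrentHashMap<>();

    // Returns the cached tuples; the loader runs only the first time a table is requested.
    List<Object[]> getOrLoad(String tableName, Supplier<List<Object[]>> loader) {
        return tables.computeIfAbsent(tableName, name -> List.copyOf(loader.get()));
    }

    // Drops the cached copy once no running query needs the table.
    void release(String tableName) {
        tables.remove(tableName);
    }
}
```

With a structure like this, two scans of "supplier_tmp" on the same worker would receive the same list instance instead of each materializing its own copy. Whether the actual patch works this way is not stated in this message.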



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
