[GitHub] [spark] wangyum commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint

GitBox Thu, 16 Jan 2020 23:00:47 -0800

wangyum commented on issue #27252: [SPARK-29231][SQL] Constraints should be inferred from cast equality constraint
URL: https://github.com/apache/spark/pull/27252#issuecomment-575501130
 
 
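   PostgreSQL and Hive both support this feature. Each plan below filters both `t1` and `t2` on `c1 = 1` (Hive wraps `t2.c1` in `UDFToLong`), although the query states that predicate only for `t1`: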
   ```sql
   postgres=# EXPLAIN  select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1);
                                     QUERY PLAN
   ------------------------------------------------------------------------------
    Nested Loop  (cost=0.00..69.77 rows=90 width=16)
      ->  Seq Scan on spark_29231_2 t2  (cost=0.00..35.50 rows=10 width=4)
            Filter: (c1 = 1)
      ->  Materialize  (cost=0.00..33.17 rows=9 width=16)
            ->  Seq Scan on spark_29231_1 t1  (cost=0.00..33.12 rows=9 width=16)
                  Filter: (c1 = 1)
   (6 rows)
   ```
   
   ```sql
   hive> explain select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1);
   Warning: Map Join MAPJOIN[11][bigTable=?] in task 'Stage-3:MAPRED' is a cross product
   OK
   STAGE DEPENDENCIES:
     Stage-4 is a root stage
     Stage-3 depends on stages: Stage-4
     Stage-0 depends on stages: Stage-3
   
   STAGE PLANS:
     Stage: Stage-4
       Map Reduce Local Work
         Alias -> Map Local Tables:
           $hdt$_0:t1
             Fetch Operator
               limit: -1
         Alias -> Map Local Operator Tree:
           $hdt$_0:t1
             TableScan
               alias: t1
               Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
               Filter Operator
                 predicate: (c1 = 1L) (type: boolean)
                 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                 Select Operator
                   expressions: c2 (type: bigint)
                   outputColumnNames: _col1
                   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                   HashTable Sink Operator
                     keys:
                       0
                       1
   
     Stage: Stage-3
       Map Reduce
         Map Operator Tree:
             TableScan
               alias: t2
               Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
               Filter Operator
                 predicate: (UDFToLong(c1) = 1) (type: boolean)
                 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                 Select Operator
                   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                   Map Join Operator
                     condition map:
                          Inner Join 0 to 1
                     keys:
                       0
                       1
                     outputColumnNames: _col1
                     Statistics: Num rows: 1 Data size: 1 Basic stats: PARTIAL Column stats: NONE
                     Select Operator
                       expressions: 1L (type: bigint), _col1 (type: bigint)
                       outputColumnNames: _col0, _col1
                       Statistics: Num rows: 1 Data size: 1 Basic stats: PARTIAL Column stats: NONE
                       File Output Operator
                         compressed: false
                         Statistics: Num rows: 1 Data size: 1 Basic stats: PARTIAL Column stats: NONE
                         table:
                             input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                             output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                             serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
         Execution mode: vectorized
         Local Work:
           Map Reduce Local Work
   
     Stage: Stage-0
       Fetch Operator
         limit: -1
         Processor Tree:
           ListSink
   
   Time taken: 0.2 seconds, Fetched: 69 row(s)
   ```
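
The inference both plans exhibit can be illustrated with a toy model. The sketch below is not Spark's, PostgreSQL's, or Hive's implementation. The helper `infer_filters` and the string encoding of expressions are hypothetical, and the `BIGINT` cast is an assumption read off the Hive plan (`1L` literal on `t1`, `UDFToLong` on `t2`). It only shows the idea: if `t1.c1 = CAST(t2.c1 AS BIGINT)` and `t1.c1 = 1` both hold, then `CAST(t2.c1 AS BIGINT) = 1` can be added as a filter on `t2`.

```python
from itertools import product


def infer_filters(equalities, filters):
    """Toy constraint inference (hypothetical helper, not a real engine API).

    equalities: set of (left, right) expression strings known to be equal.
    filters:    set of (expr, literal) pairs meaning ``expr = literal``.
    Returns the filters plus every filter obtained by substituting one side
    of an equality for the other, repeated until nothing new is derived.
    """
    inferred = set(filters)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (expr, lit) in product(equalities, list(inferred)):
            for src, dst in ((a, b), (b, a)):
                if expr == src and (dst, lit) not in inferred:
                    inferred.add((dst, lit))
                    changed = True
    return inferred


# Assumed shape of the join condition once the narrower column is cast.
join_condition = {("t1.c1", "CAST(t2.c1 AS BIGINT)")}
literal_filter = {("t1.c1", 1)}

result = infer_filters(join_condition, literal_filter)
```

Here `result` contains the original filter on `t1.c1` plus the derived filter on the cast of `t2.c1`, which corresponds to the extra `Filter` on `t2` in both plans.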


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
