turboFei removed a comment on issue #24992: [SPARK-28194][SQL] Judge whether to reorder joinKeys to prevent None.get in EnsureRequirements
URL: https://github.com/apache/spark/pull/24992#issuecomment-507584403
 
 
   @maropu 
   A brief query:
   ```
   SELECT
         TO_DATE(goodsid2buname.day) day,
         add_cart.goods_id,
         goodsid2buname.bu_name,
         add_cart.add_cart_uv
     FROM (
         SELECT
             DISTINCT
             day,
             CAST(goods_id AS STRING) goods_id,
             bu_name
         FROM db.tbla
     ) goodsid2buname
     JOIN (
         SELECT
             day,
             TO_DATE(cart_datetime) add_cart_day, 
             goods_id,
             COUNT(DISTINCT account_id) add_cart_uv
         FROM db.tblb  
         WHERE day = DATE_ADD(CURRENT_DATE(), -1)
         GROUP BY day, goods_id, TO_DATE(cart_datetime)
     ) add_cart
     ON goodsid2buname.day = add_cart.day
         AND goodsid2buname.goods_id = add_cart.goods_id
         AND goodsid2buname.day = add_cart.add_cart_day;
   ```
   The relevant tables, `db.tbla` and `db.tblb`, are both Parquet tables
   created with `STORED AS PARQUET`.
   Their column information is shown below.
   ```
    > describe  db.tbla;
   19/07/02 16:45:29 INFO CodeGenerator: Code generated in 207.305477 ms
   load_time    bigint  NULL
   product_id   bigint  NULL
   goods_id     bigint  NULL
   bu_id        bigint  NULL
   bu_name      string  NULL
   is_virtual_bu        bigint  NULL
   day  string  NULL
   bu_type      bigint  NULL
   # Partition Information
   # col_name   data_type       comment
   day  string  NULL
   bu_type      bigint  NULL
   Time taken: 0.593 seconds, Fetched 12 row(s)
   19/07/02 16:45:29 INFO SparkSQLCLIDriver: Time taken: 0.593 seconds, Fetched 12 row(s)
   
   > describe  db.tblb;
   account_id   string  NULL
   sku_id       string  NULL
   goods_id     string  NULL
   backend_sku_id       string  NULL
   backend_goods_id     string  NULL
   bu_cat_id    string  NULL      
   goods_num    int     NULL
   backend_goods_num    int     NULL
   price        double  NULL
   backend_price        double  NULL
   cart_alg_price       double  NULL
   cart_datetime        timestamp       NULL
   id   string  NULL
   day  string  NULL
   # Partition Information
   # col_name   data_type       comment
   day  string  NULL
   Time taken: 0.246 seconds, Fetched 17 row(s)
   ```
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
