ecpbi opened a new issue #15470:
URL: https://github.com/apache/superset/issues/15470


   Hello,
   
   I implemented a relational schema on a 10.5.11 MariaDB database to store the 
data of my data warehouse and I installed the version 0.999.0dev of Superset 
(this is what tells me the "About" item of the "Settings" menu).
   
   I just tried the area chart and I noticed the following strange behavior of 
the query generator. It creates a non needed self join. Here is what I did and 
what was generated.
   First, I created avirtual SQL dataset from a query that works perfectly well 
on my DB.
   
   _SELECT
        va.application_host
        , va.application_pid
        , va.application_port
        , usr.appuserlogin
        , apps.EventDate
        , apps.EventTime
        , vs.service_host
        , vs.service_pid
        , vs.service_port
        , ori.origine_host
        , ori.origine_pid
        , ori.origine_port
        , oc.operation_categorie
        , oc.operation
        , oc.WHATSUCCESS
        , oc.objet_categorie
        , apps.objet
        , apps.PERFDATA
   FROM
        application_supervision AS apps NATURAL
   JOIN appuser AS usr NATURAL
   JOIN v_application AS va NATURAL
   JOIN v_service AS vs NATURAL
   JOIN v_origine AS ori NATURAL
   JOIN objet_categorie AS oc_
   
   Then I used this dataset to create a chart drawing an area chart.
   
![image](https://user-images.githubusercontent.com/84923750/123934062-3787d200-d993-11eb-916f-32a92ae2d2a0.png)
   
   Finaly, I ran the query to get the chart on my screen, that works perfectly, 
maybe because my tables count a very low number of rows.
   After the rendering of the chart, I look at the query that is generated by 
the tool and I get the following.
   
   _SELECT DATE(`EventDate`) AS __timestamp,
          application_host AS application_host,
          application_port AS application_port,
          SUM(PERFDATA) AS `nb transactions`
   FROM
     (SELECT va.application_host ,
             va.application_pid ,
             va.application_port ,
             usr.appuserlogin ,
             apps.EventDate ,
             apps.EventTime ,
             vs.service_host ,
             vs.service_pid ,
             vs.service_port ,
             ori.origine_host ,
             ori.origine_pid ,
             ori.origine_port ,
             oc.operation_categorie ,
             oc.operation ,
             oc.WHATSUCCESS ,
             oc.objet_categorie ,
             apps.objet ,
             apps.PERFDATA
      FROM application_supervision AS apps
      NATURAL
   JOIN appuser AS usr
      NATURAL
   JOIN v_application AS va
      NATURAL
   JOIN v_service AS vs
      NATURAL
   JOIN v_origine AS ori
      NATURAL
   JOIN objet_categorie AS oc) AS virtual_table
   INNER JOIN
     (SELECT application_host AS application_host__,
             application_port AS application_port__,
             SUM(PERFDATA) AS mme_inner__
      FROM
        (SELECT va.application_host ,
                va.application_pid ,
                va.application_port ,
                usr.appuserlogin ,
                apps.EventDate ,
                apps.EventTime ,
                vs.service_host ,
                vs.service_pid ,
                vs.service_port ,
                ori.origine_host ,
                ori.origine_pid ,
                ori.origine_port ,
                oc.operation_categorie ,
                oc.operation ,
                oc.WHATSUCCESS ,
                oc.objet_categorie ,
                apps.objet ,
                apps.PERFDATA
         FROM application_supervision AS apps
         NATURAL
   JOIN appuser AS usr
         NATURAL
   JOIN v_application AS va
         NATURAL
   JOIN v_service AS vs
         NATURAL
   JOIN v_origine AS ori
         NATURAL
   JOIN objet_categorie AS oc) AS virtual_table
      WHERE ((objet_categorie ='Transactions'
              AND application_host like 'fr1btp260%'))
      GROUP BY application_host,
               application_port
      ORDER BY mme_inner__ ASC
      LIMIT 100) AS anon_1 ON application_host = application_host__
   AND application_port = application_port__
   WHERE ((objet_categorie ='Transactions'
           AND application_host like 'fr1btp260%'))
   GROUP BY application_host,
            application_port,
            DATE(`EventDate`)
   LIMIT 10000;
   
   I understand tue definition of the virtual_table derived table, as only a 
few columns of the dataset are used, but what is the utility of the anon_1 
derived table ? This derived table is just useless as I get exactly the same 
result by simply dropping it from the query. Not only is this self join not 
usefull, but it has a strong probability to bring performance issuses when the 
data grow up.
   Maybe this has a purpose, but, from my point of view, this is only a bug.
   
   Regards,


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to