rumbin opened a new issue #17042: URL: https://github.com/apache/superset/issues/17042
Explore's row count pill shows the wrong number for Box Plots. Instead of the number of rows returned by the DB query, it displays the number of _aggregated_ rows. As a result, users cannot tell whether the row limit kicked in and the box plot is based on incomplete data.

Furthermore, the row limit is:

* not adjustable for this plot
* not communicated to the user
* applied without imposing any sort order, thus resulting in a potentially arbitrary sample of rows being evaluated by the box plot

#### How to reproduce the bug

1. Use a dataset that has more rows than the configured ROW_LIMIT.
2. Create a box plot on a numeric column and _distribute across_ a column with high cardinality, e.g., the primary key of the dataset.
3. Run.
4. Observe the number of rows displayed in the indicator pill.

### Expected results

Minimal:

* The actual number of rows returned by the DB query is shown in the indicator.
* If the ROW_LIMIT is hit, the indicator turns red, like it does with all the other visualizations.

Ideal, additionally:

* The row limit is configurable, and can maybe even be disabled, like for the histogram, iirc.
* If a row limit is applied, some sensible ORDER BY should be applied in order to at least yield deterministic, if incomplete, results.

### Actual results

1. Only the number of aggregated rows is shown, equalling the number of series displayed in the chart.
2. This number is consistent with the _Data_ table below the chart, which also only shows one row per series (box).
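To make the mismatch between the two counts concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module. It is not Superset's actual code path: the `measurements` table, its columns, the `box_plot_counts` helper, and the `ROW_LIMIT` value of 1000 are all made up for illustration. It fetches rows with a `LIMIT` but no `ORDER BY`, aggregates them into one row per series, and reports both counts along with a simple check for whether the limit was probably hit.

```python
import sqlite3
from collections import defaultdict
from statistics import median

# Hypothetical value standing in for Superset's configured ROW_LIMIT.
ROW_LIMIT = 1000


def box_plot_counts(conn, row_limit):
    """Return (raw_row_count, aggregated_row_count, limit_hit)."""
    # Raw query as described in this issue: a LIMIT but no ORDER BY,
    # so which rows come back is up to the database.
    rows = conn.execute(
        "SELECT category, value FROM measurements LIMIT ?", (row_limit,)
    ).fetchall()

    # Group the raw rows into one series per category (one box each).
    series = defaultdict(list)
    for category, value in rows:
        series[category].append(value)

    # One aggregated row per series; this is the count the pill shows today.
    aggregated = [
        (category, min(values), median(values), max(values))
        for category, values in series.items()
    ]

    # Simple heuristic: if the query returned as many rows as the limit
    # allows, the result was probably truncated.
    limit_hit = len(rows) >= row_limit
    return len(rows), len(aggregated), limit_hit


# Illustrative data: 5000 rows spread over 7 series.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements ("
    "id INTEGER PRIMARY KEY, category TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO measurements (category, value) VALUES (?, ?)",
    [(f"series_{i % 7}", float(i)) for i in range(5000)],
)

raw_count, aggregated_count, limit_hit = box_plot_counts(conn, ROW_LIMIT)
print(raw_count, aggregated_count, limit_hit)
```

With this sample data the raw count equals the limit (1000) while the aggregated count is only 7, so an indicator based on the aggregated count hides the truncation. Adding something like `ORDER BY id` before the `LIMIT` would at least make the truncated sample deterministic.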
#### Screenshots/Screencasts

The row count pill shows only 7 result rows:

The _Data_ table lists these aggregated rows of the 7 distinct series:

The query applies the configured ROW_LIMIT of 100000:

A Big Number that calculates the distinct count of entities that the box plot was distributed across proves that the cardinality is much higher than the ROW_LIMIT and that the box plot was therefore based on incomplete data, without the row count pill turning red:

### Environment

- browser type and version: Chrome 93.0.4577.63
- superset version: 1.3.1, installed via pip
- python version: 3.8.11
- node.js version: v4.6.1
- feature flags active:

```
"THUMBNAILS": True,
"ALERT_REPORTS": True,
"ALERTS_ATTACH_REPORTS": True,
"SQLLAB_BACKEND_PERSISTENCE": True,
"ENABLE_TEMPLATE_PROCESSING": True,
"DASHBOARD_NATIVE_FILTERS": True,
"DASHBOARD_CROSS_FILTERS": True,
"DASHBOARD_NATIVE_FILTERS_SET": True,
"ENABLE_EXPLORE_DRAG_AND_DROP": True,
"DASHBOARD_CACHE": True
```

### Checklist

Make sure to follow these steps before submitting your issue - thank you!

- [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [x] I have reproduced the issue with at least the latest released version of superset.
- [x] I have checked the issue tracker for the same issue and I haven't found one similar.
### Additional context

`pip freeze`

```
aiohttp==3.7.4.post0
alembic==1.7.3
amqp==2.6.1
apache-superset==1.3.1
apispec==3.3.2
asn1crypto==1.4.0
async-timeout==3.0.1
attrs==21.2.0
azure-common==1.1.27
azure-core==1.18.0
azure-storage-blob==12.9.0
Babel==2.9.1
backoff==1.11.1
billiard==3.6.4.0
bleach==3.3.1
boto3==1.18.51
botocore==1.21.51
Brotli==1.0.9
cachelib==0.1.1
cachetools==4.2.4
celery==4.4.7
certifi==2021.5.30
cffi==1.14.6
chardet==4.0.0
charset-normalizer==2.0.6
click==7.1.2
cmdstanpy==0.9.68
colorama==0.4.4
convertdate==2.3.2
cron-descriptor==1.2.24
croniter==1.0.15
cryptography==3.4.8
cx-Oracle==8.2.1
cycler==0.10.0
Cython==0.29.24
defusedxml==0.7.1
deprecation==2.1.0
dnspython==2.1.0
elasticsearch==7.13.4
elasticsearch-dbapi==0.2.6
email-validator==1.1.3
ephem==4.1
et-xmlfile==1.1.0
Flask==1.1.4
Flask-AppBuilder==3.3.3
Flask-Babel==1.0.0
Flask-Caching==1.10.1
Flask-Compress==1.10.1
Flask-JWT-Extended==3.25.1
Flask-Login==0.4.1
Flask-Migrate==3.1.0
Flask-OpenID==1.3.0
Flask-SQLAlchemy==2.5.1
flask-talisman==0.8.1
Flask-WTF==0.14.3
future==0.18.2
geographiclib==1.52
geopy==2.2.0
gevent==21.8.0
google-api-core==2.0.1
google-auth==2.2.1
google-cloud-bigquery==2.27.1
google-cloud-core==2.0.0
google-crc32c==1.2.0
google-resumable-media==2.0.3
googleapis-common-protos==1.53.0
graphlib-backport==1.0.3
greenlet==1.1.2
grpcio==1.41.0
gunicorn==20.0.4
hdbcli==2.10.13
holidays==0.10.3
humanize==3.11.0
idna==3.2
importlib-resources==5.2.2
isodate==0.6.0
itsdangerous==1.1.0
Jinja2==2.11.3
jmespath==0.10.0
jsonschema==3.2.0
kiwisolver==1.3.2
kombu==4.6.11
korean-lunar-calendar==0.2.1
LunarCalendar==0.0.9
Mako==1.1.5
Markdown==3.3.4
MarkupSafe==2.0.1
marshmallow==3.13.0
marshmallow-enum==1.5.1
marshmallow-sqlalchemy==0.23.1
matplotlib==3.4.3
msgpack==1.0.2
msrest==0.6.21
multidict==5.1.0
numpy==1.21.2
oauthlib==3.1.1
openpyxl==3.0.9
oscrypto==1.2.1
packaging==21.0
pandas==1.2.5
parsedatetime==2.6
pgsanity==0.2.9
Pillow==8.3.2
polyline==1.4.0
prison==0.2.1
prophet==1.0
proto-plus==1.19.2
protobuf==3.18.0
psycopg2-binary==2.8.6
pyarrow==4.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pybigquery==0.10.2
pycparser==2.20
pycryptodomex==3.10.4
pyhdb==0.3.4
PyJWT==1.7.1
PyMeeus==0.5.11
pyOpenSSL==20.0.1
pyparsing==2.4.7
pyrsistent==0.18.0
pystan==2.18.0.0
python-dateutil==2.8.2
python-dotenv==0.19.0
python-geohash==0.8.5
python-ldap==3.3.1
python3-openid==3.2.0
pytz==2021.1
PyYAML==5.4.1
redis==3.5.3
requests==2.26.0
requests-oauthlib==1.3.0
rsa==4.7.2
s3transfer==0.5.0
selenium==3.141.0
setuptools-git==1.2
simplejson==3.17.5
six==1.16.0
slackclient==2.5.0
snowflake-connector-python==2.6.2
snowflake-sqlalchemy==1.2.4
SQLAlchemy==1.3.24
sqlalchemy-hana==0.5.0
SQLAlchemy-Utils==0.36.8
sqlparse==0.3.0
tabulate==0.8.9
tqdm==4.62.3
typing-extensions==3.10.0.2
ujson==4.2.0
urllib3==1.26.7
vine==1.3.0
webencodings==0.5.1
Werkzeug==1.0.1
WTForms==2.3.3
WTForms-JSON==0.3.3
xlrd==2.0.1
yarl==1.6.3
zipp==3.6.0
zope.event==4.5.0
zope.interface==5.4.0
```
