For reference, someone wrote a `pandas` connector (https://github.com/apache/incubator-superset/pull/3492) in the past that we never merged. The main reason it wasn't merged is that it was a fair amount of code to maintain, coming from a non-committer, at a time when the connector interface wasn't well-defined or "settled". Evolving the interface would have meant carrying the pandas connector along for the ride.
There is also the problem of where to persist the dataframe. Since our web servers are stateless, the pandas dataframe has to be loaded into memory over the network before any aggregations or filters can be performed. With something like Arrow that becomes somewhat reasonable, but it feels like there should be a dedicated service (one that resembles a database quite a bit) loading, caching, and computing on those files.

[ Full content available at: https://github.com/apache/incubator-superset/pull/6041 ]
