Hi, I’ve been playing a bit with v1.0.0 and stumbled upon a few questions/issues:
1. For query cost estimation one usually needs some additional information about a table, such as the number of rows. Is cost estimation implemented for fs sources as well? If so, how is the metadata extracted and cached? As I understand it, some formats like Parquet store it in the file footer, but what about JSON or CSV files? Can the user query or retrieve this information somehow?

2. I’ve been working with the following query:

    $q = select * from region join nation on region.R_REGIONKEY = nation.N_REGIONKEY;

where region and nation are the sample data files imported into a dfs.tmp schema. Running a query like

    select R_REGIONKEY from ($q);

results in the error "Column 'R_REGIONKEY' is ambiguous". However, queries like

    select R_REGIONKEY from (SELECT * FROM region);

work fine, as does saving the result of the join with CREATE TABLE and then replacing $q with the saved table’s name. Why is that, and what are the rules for renaming columns in join queries?

3. I’ve been trying to execute a logical plan using the web interface. It works fine with a simple scan-project query, but when I use the output of EXPLAIN … FOR $q (with resultMode changed to "EXEC") it throws the following error:

    SYSTEM ERROR: java.lang.IllegalArgumentException: Conflicting property-based creators: already had [constructor for org.apache.drill.common.logical.data.Join, ...

The whole logical query and the full error message are at https://gist.github.com/pyetras/bf625b6697de62284996

4. What are the supported conditions for joins? The SQL interface seems to support only (e1 == e2 [AND])*, but the logical operator reference at https://docs.google.com/document/d/1QTL8warUYS2KjldQrGUse7zp8eA72VKtLOHwfXy6c7I/mobilebasic?pli=1#cmnt7 mentions other relations and also cartesian joins. Are those simply not implemented in the SQL parser, or are they not supported in Drill at all?

Sorry for the long read and thanks for your assistance,

-- 
Piotr Sokólski
