Hi Dave,
The issue is not in the join itself: Drill can join against an empty
schemaless table (for example, an empty JSON file or an empty directory).
DRILL-4517 describes exactly this issue. You can add your test case with
data to that Jira ticket.
Regarding workarounds, I am not aware of any.
Kind regards
Vitalii
On Thu, May 24, 2018 at 5:19 AM Dave Challis wrote:
> We've got some processes that dump some reporting data as a bunch of
> Parquet files, then run queries involving joins with those tables (i.e. we
> have a main table which is always non-empty, plus a number of link tables
> which it joins against and which can be empty).
>
> The Parquet files contain schema metadata, but some contain no row data.
>
> Trying to join against them in Drill using e.g.
>
> SELECT *
> FROM dfs.`a.parquet` AS A
> JOIN dfs.`b.parquet` AS B ON (A.id=B.id)
> JOIN dfs.`c.parquet` AS C ON (A.id=C.id);
>
> Fails with: "SYSTEM ERROR: IllegalArgumentException: MinorFragmentId 0 has
> no read entries assigned" if either b.parquet or c.parquet contains no rows.
>
> It looks like it might have been reported as an issue here
> https://issues.apache.org/jira/browse/DRILL-4517 , but as it hasn't been
> fixed since 2016, I'm wondering if there are any suggested workarounds for
> the above, rather than waiting for a fix.
>
> In MySQL/Postgres etc., joining against empty tables is fine, so this
> behaviour was a bit unexpected, and is a major blocker for a project I'm
> using Drill for.
>
> Thanks,
> Dave
>
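
For illustration only: the quoted query uses plain inner joins, and an inner
join against a table with no rows returns no rows. A client could therefore
skip the Drill call whenever a link table is known to be empty. The sketch
below is not a Drill feature or a fix for DRILL-4517. It assumes the row
counts are obtained separately (for instance from the Parquet footer
metadata), that every table is joined on a column named id, and that
`execute` is whatever function submits SQL to Drill. The names
`build_join_sql`, `run_inner_join`, `row_counts` and `execute` are all
hypothetical. It does not apply to outer joins, where an empty link table
should still yield rows from the main table.

```python
from typing import Callable, List, Mapping, Sequence


def build_join_sql(main_table: str, link_tables: Sequence[str], key: str = "id") -> str:
    """Build an inner-join query shaped like the one in the quoted message.

    Assumption: every link table joins to the main table on the same column.
    """
    lines = ["SELECT *", f"FROM dfs.`{main_table}` AS T0"]
    for i, table in enumerate(link_tables, start=1):
        lines.append(f"JOIN dfs.`{table}` AS T{i} ON (T0.{key} = T{i}.{key})")
    return "\n".join(lines)


def run_inner_join(
    main_table: str,
    link_tables: Sequence[str],
    row_counts: Mapping[str, int],
    execute: Callable[[str], List[dict]],
) -> List[dict]:
    """Run the join, or return no rows if any link table is empty.

    row_counts is a hypothetical mapping from file name to row count,
    gathered outside Drill. execute is a hypothetical callable that sends
    SQL to Drill and returns the result rows.
    """
    # An inner join against an empty table cannot produce rows, so the
    # query that triggers the error is never sent.
    if any(row_counts[table] == 0 for table in link_tables):
        return []
    return execute(build_join_sql(main_table, link_tables))
```

The short-circuit returns an empty list rather than an empty result with
column names, so code that needs the schema of the result would have to get
it elsewhere.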