I think you meant to ask how likely it is that a nested loop join plan will
be chosen if planner.enable_nljoin_for_*scalar_only *is set to FASLE, and
the query does not contain a cartesian join?

If so, I believe it's unlikely due to how the optimizer computes the cost
of nested loop join plans.  But as I noted in my original reply, I don't
know if this has been extensively tested by other Drill users.

Your suggestion of a new option *planner.enable_nljoin_for_crossjoin *should
effectively be the same as setting planner.enable_nljoin_for_*scalar_only*
to false because the way the optimizer estimates costs, it should never
consider cross join orderings unless that's the only available join
ordering option.  For those cartesian joins, the only option would be a
nested loop join.

-- Zelaine

On Tue, Dec 27, 2016 at 10:26 AM, [email protected] <[email protected]>
wrote:

> Zelaine,
>
> Thanks for the clarity.  I can understand the performance concerns behind
> the nested loop.
>
> Can a nested loop be selected for "other" query plans when "
> planner.enable_nljoin_for_*scalar_only" *is set to true?
>
> I am trying to determine if I should turn it system wide, or ONLY turn it
> on when I need a Cartesian product result.
>
> Any ideas?  Suggestions?
>
> Outside of the Performance Concern..   There are two valid nested loop
> usecase.
> 1) Scalar subqueries
> 2) ANSI SQL Cross Joins
>
> Do you know if there are plans to incorporate this as a default setting?
>
>
> *planner.enable_nljoin_for_scalar_only *should probably be
> planner.enable_nljoin_for_*scalar *
>
> Similarly, there should be *planner.enable_nljoin_for_crossjoin* and the
> default setting should be "true"
>
> Do you agree?   Am I off base someplace?
>
> If there are no plans, and you agree with my assessment.   I will probably
> file this as a feature request
>
>
>
> Regards,
> CLN
>
>
>
>
>
>
>
>
>
>
>
> On Mon, Dec 26, 2016 at 7:58 PM, Zelaine Fong <[email protected]> wrote:
>
>> I'm not sure how widely nested loop joins outside of scalar subqueries
>> have been exercised by Drill users, since that setting is not the default.
>> Note that nested loop joins can only be processed using broadcast joins
>> [1].  So you will incur a lot of network transfer overhead unless the
>> smaller of the tables you're joining is kept to a minimum.
>>
>> [1] https://drill.apache.org/docs/join-planning-guidelines/
>>
>> -- Zelaine
>>
>> On Mon, Dec 26, 2016 at 7:05 AM, [email protected] <[email protected]>
>> wrote:
>>
>>> Zelaine,
>>>
>>> I appreciate it...   That worked.
>>>
>>> I am thinking of turning on this feature system wide.
>>>
>>> Is there any foreseeable issue with using nested joins outside of scalar
>>> subqueries?  Performance or otherwise?
>>>
>>> Regards,
>>> CLN
>>>
>>>
>>> On Sun, Dec 25, 2016 at 7:22 PM, Zelaine Fong <[email protected]>
>>> wrote:
>>>
>>>> Alternatively, you can set the following configuration to false:
>>>>
>>>> alter session set `planner.enable_nljoin_for_scalar_only` = false;
>>>>
>>>> Cartesian joins need to be processed as a nested loop join, and by
>>>> default, Drill only considers nested joins in the case where at least one
>>>> side of the join is a scalar subquery.
>>>>
>>>> -- Zelaine
>>>>
>>>> On Sun, Dec 25, 2016 at 2:46 PM, Ted Dunning <[email protected]>
>>>> wrote:
>>>>
>>>>> You can fake the limitation by adding a constant column to both
>>>>> tables, I
>>>>> think, and then joining on the constant.
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Dec 25, 2016 at 2:04 PM, [email protected] <
>>>>> [email protected]>
>>>>> wrote:
>>>>>
>>>>> >
>>>>> > I am trying to do a cross join to get a cartesian products.
>>>>> >
>>>>> > Per the error message (attached) and the JIRA ticket I see it isn't
>>>>> > supported.
>>>>> > https://issues.apache.org/jira/browse/DRILL-3807
>>>>> >
>>>>> > I wrote the query against using dfs on csv file types.
>>>>> >
>>>>> > Can I execute a cross join in Apache Drill just by moving my data to
>>>>> a
>>>>> > different file type or Storage Plugin. Such as Parquet,JSON, or RDBMS
>>>>> > Plugin.
>>>>> >
>>>>> > Regards,
>>>>> > CLN
>>>>> >
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>

Reply via email to