This "CannotPlanException" is definitely a bug in the query planner. I
thought we had added code to show that extremely long error message only
in debug mode. It looks like that's not the case.

Could you please open a JIRA and post your query, if possible? Thanks.

On Thu, Jun 23, 2016 at 10:45 AM, John Omernik <[email protected]> wrote:
> Jinfeng -
>
> I wrote my item prior to reading yours. Just an FYI, when I ran with that
> setting, I got a "CannotPlanException" with an error that is easily the
> longest "non-verbose" error I've ever seen (heck, it beats all the verbose
> errors I've had). I'd post it here, but I am not sure my Google has
> enough storage to handle this message....
>
> (kidding... sorta)
>
> John
>
>
>
> On Thu, Jun 23, 2016 at 12:37 PM, Jinfeng Ni <[email protected]> wrote:
>
>> Do you partition by day in your CTAS? If that's the case, CTAS will
>> produce at least one parquet file for each value of "day".  If you
>> have 100 days, then you will end up with at least 100 files. However,
>> if the query is executed in distributed mode, there could be more
>> than one file per value.
>>
>> In order to get one and only one parquet file for each partition
>> value, turn on this option:
>>
>> alter session set `store.partition.hash_distribute` = true;
>>
>>
>>
>> On Thu, Jun 23, 2016 at 10:26 AM, Jason Altekruse <[email protected]>
>> wrote:
>> > Apply a sort in your CTAS; this will force the data down to a single
>> > stream before writing.
>> >
>> > Jason Altekruse
>> > Software Engineer at Dremio
>> > Apache Drill Committer
>> >
>> > On Thu, Jun 23, 2016 at 10:23 AM, John Omernik <[email protected]> wrote:
>> >
>> >> When I have a small query writing smaller data (like aggregate tables for
>> >> faster aggregates for dashboards, etc.), it appears to write a ton of
>> >> small files. Not sure why; maybe it's just how the join worked out. I
>> >> have a "day" that is 1.5 MB in total size, but 400 files total. This
>> >> seems excessive.
>> >>
>> >> While I don't have the "small files" issue because I run MapR-FS,
>> >> having 400 files that make up 1.5 MB of total data kills me in the
>> >> planning phase. How can I get Drill, when doing a CTAS, to go through a
>> >> round of consolidation on the Parquet files?
>> >>
>> >> Thanks
>> >>
>> >> John
>> >>
>>

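For reference, the two suggestions in this thread could be sketched roughly
as below. This is only an illustration: the workspace (dfs.tmp), the source
path, and the table and column names are hypothetical, and only the
`store.partition.hash_distribute` option comes from the thread itself. Note
that John reports a "CannotPlanException" with that option on his query, so
the first variant may not plan for every query.

```sql
-- Hypothetical names throughout: dfs.tmp, /data/raw_events, daily_agg, category, events.

-- Variant 1 (Jinfeng's suggestion): hash-distribute rows on the partition
-- key so that each "day" value should be written as a single Parquet file.
ALTER SESSION SET `store.partition.hash_distribute` = true;

CREATE TABLE dfs.tmp.`daily_agg`
PARTITION BY (`day`)
AS
SELECT `day`, `category`, COUNT(*) AS `events`
FROM dfs.`/data/raw_events`
GROUP BY `day`, `category`;

-- Variant 2 (Jason's suggestion): add a sort to the CTAS so the data is
-- merged down to a single stream before it is written.
CREATE TABLE dfs.tmp.`daily_agg_sorted`
AS
SELECT `day`, `category`, COUNT(*) AS `events`
FROM dfs.`/data/raw_events`
GROUP BY `day`, `category`
ORDER BY `day`;
```

Which variant fits depends on whether per-day partition directories are
wanted (variant 1) or a small, unpartitioned result is enough (variant 2).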