This "CannotPlanException" is definitely a bug in the query planner. I thought we had added code to show that extremely long error message "only" in debug mode. Looks like that's not the case.
Could you please open a JIRA and post your query, if possible? thx.

On Thu, Jun 23, 2016 at 10:45 AM, John Omernik <[email protected]> wrote:
> Jinfeng -
>
> I wrote my item prior to reading yours. Just an FYI, when I ran with that
> setting, I got a "CannotPlanException" (with an error that is easily the
> longest "non-verbose" error (heck, this beats all the verbose errors I've
> had) I've ever seen). I'd post it here, but I am not sure if my Google has
> enough storage to handle this message....
>
> (kidding... sorta)
>
> John
>
> On Thu, Jun 23, 2016 at 12:37 PM, Jinfeng Ni <[email protected]> wrote:
>
>> Do you partition by day in your CTAS? If that's the case, CTAS will
>> produce at least one parquet file for each value of "day". If you
>> have 100 days, then you will end up with at least 100 files. However, in
>> case the query is executed in distributed mode, there could be more
>> than one file per value.
>>
>> In order to get one and only one parquet file for each partition
>> value, turn on this option:
>>
>> alter session set `store.partition.hash_distribute` = true;
>>
>> On Thu, Jun 23, 2016 at 10:26 AM, Jason Altekruse <[email protected]>
>> wrote:
>> > Apply a sort in your CTAS, this will force the data down to a single
>> > stream before writing.
>> >
>> > Jason Altekruse
>> > Software Engineer at Dremio
>> > Apache Drill Committer
>> >
>> > On Thu, Jun 23, 2016 at 10:23 AM, John Omernik <[email protected]> wrote:
>> >
>> >> When I have a small query writing smaller data (like aggregate tables
>> >> for faster aggregates for dashboards etc.), it appears to write a ton
>> >> of small files. Not sure why, maybe it's just how the join worked out
>> >> etc. I have a "day" that is 1.5M in total size, but 400 files total.
>> >> This seems excessive.
>> >>
>> >> While I don't have the "small files" issues because I run MapR-FS,
>> >> having 400 files that make 1.5 MB of total data kills me on the
>> >> planning phase.
>> >>
>> >> How can I get Drill, when doing a CTAS, to go through a round of
>> >> consolidation on the parquet files?
>> >>
>> >> Thanks
>> >>
>> >> John
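
For reference, the two suggestions in the quoted thread could be tried roughly as follows. This is only a sketch: the table names (dfs.tmp.`agg_by_day`, dfs.tmp.`agg_by_day_sorted`), the source path (dfs.`/data/events`), and the `category` column are made up for illustration, and the single-stream effect of the sort is as Jason described it, not something verified here.

```sql
-- Option 1 (Jinfeng's suggestion): hash-distribute rows on the partition
-- column so that each "day" value should end up in a single parquet file.
ALTER SESSION SET `store.partition.hash_distribute` = true;

-- Hypothetical target table and source path; `day` is backticked because
-- DAY is a reserved word.
CREATE TABLE dfs.tmp.`agg_by_day`
PARTITION BY (`day`)
AS
SELECT `day`, `category`, COUNT(*) AS `cnt`
FROM dfs.`/data/events`
GROUP BY `day`, `category`;

-- Option 2 (Jason's suggestion): add a sort to the CTAS so the data is
-- merged into one stream before the writer runs.
CREATE TABLE dfs.tmp.`agg_by_day_sorted`
AS
SELECT `day`, `category`, COUNT(*) AS `cnt`
FROM dfs.`/data/events`
GROUP BY `day`, `category`
ORDER BY `day`;
```

Option 1 is the setting that triggered the "CannotPlanException" for John, so whether it plans successfully likely depends on the query and the Drill version.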
