@John, I agree that this should work. While I am not certain, I don't think the issue is specific to a particular plugin, but the way in a query's lifecycle, the foreman attempts to initialize every enabled storage plugin before proceeding to execute the query. So when a particular plugin isn't configured correctly or the underlying datasource is not up, this could drastically slow down the query execution time.
I'll look up to see if we have a JIRA for this already - if not will file one. On Wed, Nov 30, 2016 at 8:12 AM, John Omernik <[email protected]> wrote: > So just my opinion in reading this thread. (sorry for swooping in an > opining) > > If a CTAS is done from any data source into Parquet files.... there should > be NO dependency on the original data source to query the resultant Parquet > files. As a Drill user, as a Drill admin, this breaks the concept of > least surprise. If I take data from one source, and create Parquet files > in a distributed file system, it should just work. If there are "issues" > with JDBC plugins or the HBase/Hive plugins in a similar manner, these > needs to be hunted down by a large group of villages with pitchforks and > torches. I just can't see how this could be acceptable at any level. The > whole idea of Parquet files is they are self describing, schema included > files.... thus a read of a directory of Parquet files should have NO > dependancies on anything but the parquet files... even the Parquet > "additions" (such as the METADATA Cache) should be a fail open thing... if > it exists great, use it, speed things up, but if it doesn't read the > parquet files as normal (Which I believe is how it operates) > > John > > On Wed, Nov 30, 2016 at 12:12 AM, Abhishek Girish < > [email protected] > > wrote: > > > Can you attempt to disable to jdbc plugin (configured with SQLServer) and > > try the query (on parquet) when SQL Server is offline? > > > > I've seen a similar issue previously when the HBase / Hive plugin was > > enabled but either the plugin configuration was wrong or the underlying > > data source was down. > > > > On Fri, Nov 25, 2016 at 3:21 AM, Rahul Raj <rahul.raj@option3consulting. > > com> > > wrote: > > > > > I have created a parquet file using CTAS from a MS SQL Server. The > query > > on > > > parquet is getting stuck in STARTING state for a long time before > > returning > > > the results. > > > > > > We could see that drill was trying to connect to the MS SQL server from > > > which the data was imported. The MSSQL server was down, drill threw an > > > exception "Failure while attempting to load JDBC schema", and then > > returned > > > the results. While SQL server is running, the query executes without > > > issues. > > > > > > Why is drill querying the DB metadata externally and not the imported > > > parquets? > > > > > > Rahul. > > > > > > -- > > > **** This email and any files transmitted with it are confidential and > > > intended solely for the use of the individual or entity to whom it is > > > addressed. If you are not the named addressee then you should not > > > disseminate, distribute or copy this e-mail. Please notify the sender > > > immediately and delete this e-mail from your system.**** > > > > > >
