You are mostly correct. The checks are:

- Verify that the resources you are referencing (files, tables, views, etc.) exist in a readable format that you have permission to access.
- If the assets are considered strong-schema, verify that the references you are using exist and have compatible data types.
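To make the two checks concrete, here is a minimal Python sketch of the logic. It is not Drill's code: the `Source` class, the `validate_view` function, the kind names, and the error strings are all hypothetical, and "compatible data types" is simplified to plain equality.

```python
from dataclasses import dataclass, field

# Hypothetical kind names for the sources this message calls strong-schemaed.
# Anything not listed here is treated as weak-schemaed in this sketch.
STRONG_SCHEMA = {"view", "hive", "hbase_column_family", "text"}


@dataclass
class Source:
    """Hypothetical stand-in for a resource referenced by a view."""
    kind: str
    exists: bool = True
    readable: bool = True
    # Column name -> type name; only meaningful for strong-schema sources.
    columns: dict = field(default_factory=dict)


def validate_view(source, referenced):
    """Return a list of problems; an empty list means the view is accepted."""
    # Check 1: the resource must exist and be accessible.
    if not (source.exists and source.readable):
        return ["source missing or not accessible"]

    problems = []
    # Check 2: only strong-schema sources get their references verified.
    if source.kind in STRONG_SCHEMA:
        for name, wanted in referenced.items():
            actual = source.columns.get(name)
            if actual is None:
                problems.append(f"unknown column: {name}")
            elif actual != wanted:  # simplification of "compatible types"
                problems.append(f"incompatible type for {name}")
    return problems
```

In this sketch a reference to a map that is absent from a JSON source passes, because the second check is skipped for weak-schema sources, while the same mistake against a Hive table is reported.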
Right now, schemaness falls into these two main categories:

strong-schemaed
- views
- hive tables
- hbase column families
- text

weak-schemaed
- json
- mongodb
- hbase column qualifiers
- parquet

Note that we really need to move Parquet from the weak-schemaed to the strong-schemaed list, since the format itself is relatively strong-schemaed. (I say relatively because Parquet doesn't require an application to record logical data types, and many systems that generate Parquet today don't generate logical type information.) This has caused us to initially treat it as weakly-schemaed, since this allows more liberal casting than SQL normally allows and thus a better user experience with Parquet data that doesn't have logical type information.

On Thu, Jan 29, 2015 at 9:14 AM, Andries Engelbrecht <[email protected]> wrote:

> Which steps and checks does Drill perform when creating a view?
>
> When creating a view on a directory structure with a large number of
> directories and JSON files in each directory, the view creation takes 5-7
> seconds on small cluster.
>
> From a few tests it seems that Drill will verify Hive tables and columns
> being used in a view.
>
> For the JSON docs in the DFS it does verify the storage plugin and the
> directory it is being pointed at.
> If the directory is empty the view creation does fail.
> Drill does not seem to verify if the maps (specified in the view) in JSON
> files are present, likely due to the convention to assign null to non
> existent maps (still need to dig deeper on this topic on the conventions
> being used for complex data types)
>
> Thx
>
> —Andries
