A few questions/comments inline.

On Thu, Jul 30, 2015 at 2:53 PM, mehant baid <[email protected]> wrote:
> Based on the discussion in the hangout I wanted to start a thread around
> Drop table support.
>
> A couple of high-level points about what is planned to be supported:
>
> 1. In the first iteration Drop table will only support dropping tables in
> the file system and not dropping tables in Hive/HBase or other storage
> plugins.
> 2. Since Drop table is potentially "risky" we want to be pessimistic about
> dropping tables.
>
> There are two broad scenarios while dealing with Drop table: security
> enabled and security disabled. In both cases we would like to follow the
> workflow below.
>
> 1. Check if the table being dropped can be consumed by Drill.

[Neeraja] I am assuming that if security is enabled, this is done with the impersonated user identity. Is this accurate?

>    * Meaning, do all the files in the directories conform to a format that
> Drill can read (parquet, json, csv etc.)? Jacques pointed out that there
> is a bug in this logic: if one of the files in the directory conforms
> to a format that Drill can read, we create a DrillTable and error out only
> when we encounter other files we cannot read.

[Neeraja] What does it mean to create a DrillTable here?

>    * The above point can in the worst case entail reading the entire file
> system, if a user issues a drop table command on the root of the file
> system. But it's more likely that we will soon encounter a file that Drill
> cannot read and abort the Drop with an error.
>    * Another minor clarification: we consider a directory to be consumable
> by Drill only if it contains file formats that are homogeneous and
> can be read by Drill. E.g., we should fail if a user is trying to delete
> a directory that contains both JSON and Parquet files.
>
> 2. Once we have confirmed that the table requested to be dropped contains
> homogeneous files which can be read by Drill, we delve into the file
> permissions.
>    * If security is enabled, we impersonate the user issuing the command
> and drop the directory (succeeds if the FS allows it and the user has the
> correct permissions).
>    * If security is not enabled, we only drop the directory if all the
> files are owned by the user the Drillbit is running as (being pessimistic
> about drop). We should collect this information when checking for
> homogeneous files.

[Neeraja] Why do we need this check? How is this different from the impersonated user scenario?

> Open Questions:
>
> Views: How do we handle views that were created on top of the dropped
> table? Following are a couple of scenarios we might want to explore:
>    * Views are treated as a different entity, and it's useful for the user
> to have the view definition still in place, as the dropped table will be
> replaced with a new set of files with the exact schema and the existing
> view definition suffices. AFAIK, Oracle and SQL Server have this model and
> don't drop the views if the base table is dropped.
>    * Once the table is dropped, the view definition is no longer needed
> and hence should be dropped automatically. We can probably punt on this
> till we have dotdrill files. With dotdrill files we can maintain some
> information to indicate the views on this table and can drop the views
> implicitly. But given that some of the popular databases don't do this, we
> might want to conform to the standard behavior.

[Neeraja] Agree with the recommendation here. It seems we can go with the simpler approach, i.e., treat views as a different entity. Also, will there be any mechanism to recover after an accidental drop?

> Thanks
> Mehant
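
For reference, the checks proposed in steps 1 and 2 of the quoted workflow could be sketched roughly as follows. This is only an illustration in Python against a local file system, not Drill code. The names `check_droppable`, `DropRefused` and `READABLE_EXTENSIONS` are hypothetical, format detection by file extension is an assumption made for brevity, and refusing an empty directory is also an assumption that the thread does not discuss. The sketch only validates; it does not delete anything, and in the security-enabled case the actual drop would still be attempted as the impersonated user and left to the file system's permission checks.

```python
import os
from pathlib import Path

# Assumption: the thread names parquet, json and csv as examples only.
READABLE_EXTENSIONS = {".parquet", ".json", ".csv"}


class DropRefused(Exception):
    """Raised when the pessimistic checks reject the drop (hypothetical)."""


def check_droppable(table_dir, security_enabled, process_uid=None):
    """Return the single file format found under table_dir, or raise DropRefused.

    One walk over the directory verifies that every file has the same
    readable format and, when security is disabled, that every file is
    owned by process_uid (standing in for the user the Drillbit runs as).
    """
    if process_uid is None:
        process_uid = os.geteuid()
    seen_format = None
    for root, _dirs, files in os.walk(table_dir):
        for name in files:
            path = Path(root) / name
            ext = path.suffix.lower()
            # Abort on the first file that is not in a readable format.
            if ext not in READABLE_EXTENSIONS:
                raise DropRefused(f"unreadable file: {path}")
            # Abort if the directory mixes formats, e.g. JSON and Parquet.
            if seen_format is None:
                seen_format = ext
            elif ext != seen_format:
                raise DropRefused(f"mixed formats: {seen_format} and {ext}")
            # Ownership is checked only when security is disabled.
            if not security_enabled and path.stat().st_uid != process_uid:
                raise DropRefused(f"not owned by process user: {path}")
    if seen_format is None:
        # Assumption: an empty directory is refused, in the pessimistic spirit.
        raise DropRefused("no files found")
    return seen_format
```

Collecting ownership in the same walk as the format check follows the suggestion in the quoted text to gather that information while checking for homogeneous files, so the directory tree is read only once.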
