What you are suggesting makes sense when security is enabled. In that
case, when Drill accesses the file system it impersonates the user who
issued the command, and the drop succeeds only if that user has
sufficient permissions.
However, when security isn't enabled, Drill accesses the file system as
the Drill user itself, which is most likely a superuser with permission
to delete most files. To prevent catastrophic drops, checking for
homogeneous file formats makes sure that the directory being dropped is
at least something Drill can read. This will prevent accidental drops
(like dropping the home directory, which is likely to contain file
formats that Drill cannot read). It will not protect against malicious
behavior (for that, security should be enabled).
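
To make the proposed check concrete, here is a rough sketch in Python
(not Drill code). The function name and the extension list are
hypothetical, and comparing file extensions is only a stand-in for
whatever format detection Drill would actually use. The scan stops at
the first offending file, so a directory that is clearly not a table is
rejected without walking everything in it.

```python
import os

# Hypothetical set of extensions treated as "readable"; not Drill's actual list.
READABLE_EXTENSIONS = {".parquet", ".json", ".csv"}


def is_safe_to_drop(directory):
    """Return True only if every file under `directory` shares one
    extension and that extension is in READABLE_EXTENSIONS."""
    seen = None
    for _root, _dirs, files in os.walk(directory):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            if ext not in READABLE_EXTENSIONS:
                return False  # unreadable format: refuse and stop scanning
            if seen is None:
                seen = ext
            elif ext != seen:
                return False  # mixed formats: not homogeneous
    # An empty directory is not treated as a droppable table in this sketch.
    return seen is not None
```

A home directory would fail on the first dotfile or document it
contains, while a directory holding only Parquet files would pass.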
Thanks
Mehant
On 8/5/15 11:43 AM, Ted Dunning wrote:
Is any check really necessary?
Can't we just say that, for file-like data sources, drop is a
rough synonym for rm? If you have permission to remove files and
directories, you can do it. If you don't, it will fail, possibly half
done. I have never seen a bug filed against rm to add more elaborate
semantics, so why is it so necessary for Drill to have elaborate semantics
here?
On Wed, Aug 5, 2015 at 11:09 AM, Ramana I N <[email protected]> wrote:
The homogeneous check: will it just check that the types are
homogeneous, or that they are actually types that can be read by Drill?
Also, is there a good way to determine if a file can be read by Drill?
And will there be a perf hit if there are a large number of files?
Regards
Ramana
On Wed, Aug 5, 2015 at 11:03 AM, Mehant Baid <[email protected]>
wrote:
I agree, it is definitely restrictive. We can lift the restriction that
a table can be dropped (when security is off) only if the Drill user
owns it. I think the check for homogeneous files should give us enough
confidence that we are not deleting a non-Drill directory.
Thanks
Mehant
On 8/4/15 10:00 PM, Neeraja Rentachintala wrote:
Ted, that's a fair point on the recovery part.
Regarding the other point by Mehant (copied below), there is an
implication that a user can drop only Drill-managed tables (i.e.,
created as the Drill user) when security is not enabled. I think this
check is too restrictive (also unintuitive). Drill doesn't have the
concept of external/managed tables, and a user (the impersonated user if
security is enabled, or the Drillbit service user if no security is
enabled) should be able to drop the table if they have permissions to do
so. The above design proposes a check to verify that the files that need
to be deleted are readable by Drill, and I believe that is a good
validation to have.
/The above check is in the case when security is not enabled. Meaning we
are executing as the Drill user. If we are running as the Drill user
(which might be root or a super user) it's likely that this user has
permissions to delete most files and checking for permissions might not
suffice. So when security isn't enabled the proposal is to delete only
those files that are owned (created) by the Drill user./
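
For illustration only, the ownership rule quoted above could be sketched
in Python roughly as follows. This is not Drill code: the function name
is hypothetical, the user running the process stands in for the Drill
service user, and the ownership lookup is POSIX-specific.

```python
import os


def owned_by_current_user(directory):
    """Return True if `directory` and everything under it is owned by the
    user running this process (standing in for the Drill service user)."""
    uid = os.getuid()  # POSIX only
    if os.lstat(directory).st_uid != uid:
        return False
    for root, dirs, files in os.walk(directory):
        for name in dirs + files:
            # lstat so that a symlink is judged by its own owner
            if os.lstat(os.path.join(root, name)).st_uid != uid:
                return False
    return True
```

Under this rule a directory written by any other user would be refused,
even when the process has permission to delete it.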
On Fri, Jul 31, 2015 at 12:09 AM, Ted Dunning <[email protected]>
wrote:
On Thu, Jul 30, 2015 at 4:56 PM, Neeraja Rentachintala <
[email protected]> wrote:
Also, will there be any mechanism to recover once you accidentally drop?
Yes. Snapshots <https://www.mapr.com/resources/videos/mapr-snapshots>.
Seriously, recovery of data due to user error is a platform thing. How
can we recover from turning off the cluster? From removing a disk on an
Oracle node?
I don't think that this is Drill's business.