[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-06-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359902#comment-17359902
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}delete(path) MUST be a no-op if the path isn't there. The way to view 
the semantics of the call is that delete(path) == true implies the path is no 
longer present.
{quote}
[~ste...@apache.org] It seems that we don't follow this everywhere.

The DFS client (NameNode -> FSNamesystem#delete) doesn't follow this. I also 
quickly tested HttpFS with WebHdfs and LocalFS, and neither follows this 
semantic: for a non-existent file, FS#delete returns false.
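
To make the two readings of the return value concrete, here is a minimal sketch. It is a toy in-memory model, not the Hadoop FileSystem API; the class and method names are hypothetical and only illustrate the difference discussed above.

```java
import java.util.HashSet;
import java.util.Set;

// Toy in-memory model of a namespace; NOT the Hadoop FileSystem API.
class DeleteContracts {

    // Reading A (what the quick tests above observed): the result says
    // whether this call actually removed something, so a missing path
    // yields false.
    static boolean deleteReportingRemoval(Set<String> paths, String path) {
        return paths.remove(path);
    }

    // Reading B (the quoted semantics): deleting a missing path is a no-op,
    // and true means the path is absent once the call returns.
    static boolean deleteEnsuringAbsent(Set<String> paths, String path) {
        paths.remove(path);
        return !paths.contains(path);
    }

    public static void main(String[] args) {
        Set<String> paths = new HashSet<>();
        paths.add("/xyz");
        System.out.println(deleteReportingRemoval(paths, "/missing"));
        System.out.println(deleteEnsuringAbsent(paths, "/missing"));
    }
}
```

A caller that treats false as a failure behaves differently under the two readings, which is why the mismatch matters for compatibility.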

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, hdfs-client, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> If we delete data from the Web UI, it should first be moved to the 
> configured/default Trash directory and removed only after the trash interval 
> has elapsed. Currently, data is removed from the system directly. [This 
> behavior should be the same as the CLI command.]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly, we should provide a "Skip Trash" option in the HTTP API as well, 
> which should be accessible through the Web UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-06-02 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17355690#comment-17355690
 ] 

Steve Loughran commented on HDFS-15982:
---

I've only just seen this by way of the revert entry in the trunk commit log.


bq. If some behaviour doesn't look good or feel is important to change, Let us 
change in Hadoop-Common first, Let all filesystem adapt and we can happily 
change.

delete(path) MUST be a no-op if the path isn't there. The way to view the 
semantics of the call is that delete(path) == true implies the path is no 
longer present.


bq.  Hive relies on rename call to see if the target exists, if it returns 
false, means target already exist, it appends then counter and then rename 
again. If someone starts throwing exception in that code will break, Which 
isn't a good thing. 

rename() failure reporting is a PITA as the "What does false mean?" is so 
vague. But we are stuck with it, even as filesystems tighten their own failure 
reporting (HADOOP-16271). In the absence of a switch to FileContext, my goal 
there is to make rename/3 public: HADOOP-11452. 



[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339535#comment-17339535
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}Regarding the UI. If the trash interval isn't set and If I select 
NO(move to trash), It still deletes with success? Check if the behaviour is 
like that, The client may be in wrong impression that things moved to trash, 
but it actually didn't. We should have bugged him back, Trash isn't enabled.
{quote}
If the trash interval isn't set and we select NO, the delete still succeeds 
(as per the logic 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L1560]:
 the file moves to trash only if skiptrash is false and trashInterval > 0).
{code:java}
case DELETE: {
  Configuration conf =
      (Configuration) context.getAttribute(JspHelper.CURRENT_CONF);
  long trashInterval =
      conf.getLong(FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT);
  if (trashInterval > 0 && !skipTrash.getValue()) {
    LOG.info("{} is {} , trying to archive {} instead of removing",
        FS_TRASH_INTERVAL_KEY, trashInterval, fullpath);
    org.apache.hadoop.fs.Path path =
        new org.apache.hadoop.fs.Path(fullpath);
    Configuration clonedConf = new Configuration(conf);
    // To avoid caching FS objects and prevent OOM issues
    clonedConf.set("fs.hdfs.impl.disable.cache", "true");
    FileSystem fs = FileSystem.get(clonedConf);
    boolean movedToTrash = Trash.moveToAppropriateTrash(fs, path,
        clonedConf);
    if (movedToTrash) {
      final String js = JsonUtil.toJsonString("boolean", true);
      return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
    }
    // Same is the behavior with Delete shell command.
    // If moveToAppropriateTrash() returns false, file deletion
    // is attempted rather than throwing Error.
    LOG.debug("Could not move {} to Trash, attempting removal", fullpath);
  }
  final boolean b = cp.delete(fullpath, recursive.getValue());
  final String js = JsonUtil.toJsonString("boolean", b);
  return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
}
{code}
I think in the UI we can provide additional info in the same modal: "These 
buttons are useful only if fs.trash.interval has been configured. Without 
setting the interval, files will be hard deleted anyway."
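
The branch condition in the snippet above boils down to a two-input decision. The following is a standalone sketch of just that decision, with hypothetical class and enum names; it deliberately leaves out the fallback where a failed move to trash still ends in a plain delete.

```java
// Standalone sketch of the DELETE branch condition; names are hypothetical.
class TrashDecision {

    enum Action { MOVE_TO_TRASH, DELETE }

    // Mirrors "trashInterval > 0 && !skipTrash" from the snippet above.
    static Action decide(long trashInterval, boolean skipTrash) {
        return (trashInterval > 0 && !skipTrash)
            ? Action.MOVE_TO_TRASH
            : Action.DELETE;
    }

    public static void main(String[] args) {
        // Trash disabled (interval 0): the skiptrash choice makes no difference.
        System.out.println(decide(0, false));
        System.out.println(decide(0, true));
        // Trash enabled: only skiptrash=false archives the file.
        System.out.println(decide(60, false));
        System.out.println(decide(60, true));
    }
}
```

Three of the four combinations end in a hard delete, which is the case the extra UI hint is meant to make visible.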


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339518#comment-17339518
 ] 

Ayush Saxena commented on HDFS-15982:
-

DefaultFs is used only if you don't specify the fs in the path; if the path 
has it, DefaultFs isn't used. So it isn't a mandatory conf in itself. In the 
present case, WebHdfs uses the HTTP URI to reach the Namenode, whereas I 
think the DefaultFs points to the RPC address. So the URI changes: it isn't 
the same FileSystem URI on which the call was made. (Checked by just tweaking 
a test.)
{code:java}
2021-05-05 13:04:26,933 [Listener at localhost/50984] ERROR web.TestWebHDFS 
(TestWebHDFS.java:testWebHdfsNoRedirect(1613)) - URI of the FileSystem during 
call webhdfs://localhost:50970

2021-05-05 13:04:34,513 [IPC Server handler 8 on default port 50971] ERROR 
resources.NamenodeWebHdfsMethods (NamenodeWebHdfsMethods.java:delete(1574)) - 
URI of the FileSystem created for Trash: hdfs://localhost:50971
{code}
In the Namenode it would be there, but there are cases that need checking.
We need to check how things behave if the underlying DefaultFs is 
{{ViewDistributedFileSystem}} or is using {{ViewFsOverloadScheme}}. I checked 
the latter: the FS for trash was initialized with {{ViewFsOverloadScheme}} in 
the delete method. So what happens if that resolves to some other place 
through its mount table also needs to be checked. Will the Namenode call the 
other NS, or something like that?

The defaultFs concern was there in the previous jira itself, so that is worth a 
thought.
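
As a rough illustration of the mismatch in the two log lines above, the sketch below compares scheme and authority of the two URIs using only java.net.URI. The helper name is hypothetical, and this is not how Hadoop itself decides FileSystem identity.

```java
import java.net.URI;

// Hypothetical helper: compares only scheme and authority of two URIs.
class FsUriCheck {

    static boolean sameFileSystem(URI a, URI b) {
        return equalsIgnoreCase(a.getScheme(), b.getScheme())
            && equalsIgnoreCase(a.getAuthority(), b.getAuthority());
    }

    private static boolean equalsIgnoreCase(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }

    public static void main(String[] args) {
        // URIs taken from the two log lines in this comment.
        URI callUri = URI.create("webhdfs://localhost:50970");
        URI trashUri = URI.create("hdfs://localhost:50971");
        System.out.println(sameFileSystem(callUri, trashUri));
    }
}
```

Both scheme and port differ here, so the FileSystem created for trash is not the one the client addressed.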

Router would also have issues, if the trash path resolves to different NS, or 
to some path which isn't in the Mount Table, when Default Namespace isn't 
configured. 

Regarding the UI: if the trash interval isn't set and I select NO (move to 
trash), does it still delete with success? Check whether the behaviour is like 
that. The client may be under the wrong impression that things moved to trash 
when they actually didn't. We should have reported back that trash isn't 
enabled.

The API behaviour has changed. Exception/Return value. That needs to be sorted.

Make sure ALL FileSystems behave similar to WebHdfs, unless there is a strong 
reason. (WebHdfs!=FsShell)

And finally, find out what the danger was that Daryn talked about. I remember 
seeing a Jira where it was quoted that "Namenode should never make an RPC to 
itself": the Namenode can hang or something like that; I'm not sure what the 
context was then. I couldn't find the Jira, so I'm not sure whether Daryn's 
concern was related to that. We can't take that risk, given that we are 
thinking of putting this in a stable release.

Compatibility is to be restored.

How to proceed:
Let's wait for [~weichiu]; he is chasing the 3.3.1 release AFAIK, so he might 
have an answer on how long we can wait for this, and answers to the questions 
above. If we get everything sorted, or he already has answers, things should 
be good. 


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339444#comment-17339444
 ] 

Viraj Jasani commented on HDFS-15982:
-

FYI [~smeng] as we discussed over similar case on PR.


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339438#comment-17339438
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}I was following the older jira.

[~daryn] had a comment:
{quote}You cannot or should not create a default fs and parse a path in the NN. 
It's very dangerous. Give me some time (that I don't have) and I'd likely come 
up with a nasty exploit.
{quote}{quote}
This seems interesting. Although yes, this is default fs, but it is 
instantiated from WebHdfs config object only (which is used by all endpoints in 
NamenodeWebHdfsMethods). Is WebHdfs server implementation used for any other 
FileSystem (from current and future viewpoints)?
{quote}Router part needs to be checked again and confirmed, Trash itself has 
issues with Router(There are Jiras). So, if Trash becomes true by default, I 
doubt delete through Router don't break or moves to some weird place.
{quote}
Oops, I was not aware of the existing concerns about Router path resolution 
with Trash. I think this is a fair reason to make this Jira a compatible 
change w.r.t. DELETE REST API calls. Let me provide an addendum for trunk to 
change the default value of skiptrash to true. The branch-3.3 backport PR#2925 
is pending. Shall we get it in first, so that I can then provide the addendum 
for both trunk and branch-3.3 for a clean history?

Btw [~ayushtkn] you might also want to check screenshots attached on this Jira 
to take a look at how skiptrash is handled from Web UI.


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-04 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339287#comment-17339287
 ] 

Ayush Saxena commented on HDFS-15982:
-

Yes, this means anybody coming to HDFS via REST will now move data to trash. 
That is a change in behaviour, so it shouldn't be the default anyway. 
Regarding the shell: FsShell is a different concept and cannot be treated as a 
synonym for RPC calls. So yes, we need to go back to the default behaviour. So 
the consensus is lost...

I was following the older jira.

[~daryn] had a comment:
{quote}You cannot or should not create a default fs and parse a path in the NN. 
It's very dangerous. Give me some time (that I don't have) and I'd likely come 
up with a nasty exploit.
{quote}
Do we have an answer to this and what was the danger actually?

There was another concern raised: the FileSystem is created using the 
defaultFs URI, not that of the Path. How did that get sorted? What if the two 
are different?

The Router part needs to be checked again and confirmed. Trash itself has 
issues with the Router (there are Jiras). So, if trash becomes true by default, 
I doubt that delete through the Router won't break or move data to some weird 
place.

 

 


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339268#comment-17339268
 ] 

Viraj Jasani commented on HDFS-15982:
-

Perhaps there was some misunderstanding w.r.t FileSystem behaviour that was 
mentioned above, my bad.

If you are talking about FileSystem#delete API, there is no change done in 
FileSystem as part of this Jira (i.e WebHdfsFileSystem is not updated). As part 
of this Jira, we have introduced change in WebHdfs NameNode implementation 
(DELETE API endpoint in NamenodeWebHdfsMethods) by introducing an additional 
query param: skiptrash (similar to skipTrash param in DELETE shell command). 
This Jira is marked incompatible because of the default value of skiptrash 
query param, which is false as of now. If we change the default value to true, 
this Jira would not need to be marked incompatible. However, if you take a 
closer look, DELETE shell command (rm) has default value of skipTrash param as 
false, and hence by replicating the same behaviour in WebHdfs NameNode 
implementation of DELETE API, we are trying to be compatible with shell command 
behaviour (but at the same time incompatible with previous behaviour of 
NamenodeWebHdfsMethods DELETE API, which never had moving to trash feature).

Please take a look at Wei-Chiu's comments above:
{quote}This is a big incompatible change. If we think this should be part of 
3.4.0, risking our compatibility guarantee (which I think makes sense, given 
how many times I was involved in accidental data deletion), I think it can be 
part of 3.3.1. We traditionally regard 3.3.0 as non-production ready, so making 
an incompat change in 3.3.1 probably is justifiable.
{quote}
I am of a similar opinion, and hence I tried to keep the default value of the 
skiptrash query param as false; this was the consensus so far. However, if the 
consensus changes to make this Jira a compatible change (by changing the 
default value of skiptrash to true), I will be happy to provide an addendum 
patch for the same.

 
{quote}It is never like for improvements we just mark incompatible and then 
side-line the compatibility because someone feels it would be good. For Bugs, 
yes we do so if it is the last option, for which he have a flag telling sorry 
we had no choice.
{quote}
I agree. However, if we look at this feature from a different viewpoint, we 
might feel this was a bug in NamenodeWebHdfsMethods implementation, which did 
not allow moving files to trash similar to how Delete shell command (rm) always 
did. But as I mentioned, if consensus changes to make this Jira compatible, I 
can provide addendum to change default value of skiptrash as true.

 
{quote}Regarding the router part, Is the call going back to the router, since 
the path needs to be resolved with respect to the mount table, or is it 
resolving with respect to the Namenode itself?
{quote}
NamenodeWebHdfsMethods DELETE implementation, before using FileSystem#delete, 
uses Trash#moveToAppropriateTrash utility which internally takes care of 
resolving symlinks, mount points etc using FileSystem#rename.

As per the doc:
{code:java}
/**
 * In case of the symlinks or mount points, one has to move the appropriate
 * trashbin in the actual volume of the path p being deleted.
 *
 * Hence we get the file system of the fully-qualified resolved-path and
 * then move the path p to the trashbin in that volume,
 * @param fs - the filesystem of path p
 * @param p - the path being deleted - to be moved to trash
 * @param conf - configuration
 * @return false if the item is already in the trash or trash is disabled
 * @throws IOException on error
 */
public static boolean moveToAppropriateTrash(FileSystem fs, Path p,
    Configuration conf) throws IOException {
  Path fullyResolvedPath = fs.resolvePath(p);
  ...
{code}
Unless I am mistaken, fs.resolvePath(p) should resolve path through 
DistributedFileSystem -> DFSClient -> NameNode.


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-04 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339097#comment-17339097
 ] 

Ayush Saxena commented on HDFS-15982:
-

We can fix things as an Addendum PR, if all agree.

The basic ask is that behaviour should stay the same for all {{FileSystems}}. 
Behaviour shouldn't change if I move from {{WebHdfs}} to another FS: if other 
FSs throw an exception, we should throw too; if they return a value for a 
certain case, {{WebHdfs}} should return the same. If other FSs move to trash 
with the value set, then WebHdfs should too; if nobody does that, WebHdfs 
shouldn't either.

If some behaviour doesn't look good, or it feels important to change, let us 
change it in Hadoop-Common first, let all filesystems adapt, and then we can 
happily change.

Regarding compatibility: an incompatible change is something that isn't 
supposed to be done, because applications tend to rely on behaviour not 
changing, and Hadoop in general used to (not now, I guess) have very high 
standards. E.g. Hive relies on the rename call to see if the target exists: if 
it returns false, that means the target already exists, so it appends a 
counter and then renames again. If someone starts throwing an exception there, 
that code will break, which isn't a good thing. HDFS-13732 got reverted for 
just a CLI change which in all cases is good.
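
The caller pattern described above can be sketched roughly as follows. This is a toy in-memory stand-in, not Hive's code and not the Hadoop FileSystem API; the rename helper, the "_copy_" suffix and the attempt limit are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Toy illustration of a caller that relies on rename() returning false.
class RenameWithCounter {

    // Stand-in rename: false if the source is missing or the target exists.
    static boolean rename(Set<String> files, String src, String dst) {
        if (!files.contains(src) || files.contains(dst)) {
            return false;
        }
        files.remove(src);
        files.add(dst);
        return true;
    }

    // Caller logic: treat false as "target exists" and retry with a counter.
    // Returns the final name, or null if every attempt returned false.
    static String renameAvoidingCollision(Set<String> files, String src,
            String dst, int maxAttempts) {
        String candidate = dst;
        for (int i = 1; i <= maxAttempts; i++) {
            if (rename(files, src, candidate)) {
                return candidate;
            }
            candidate = dst + "_copy_" + i; // hypothetical suffix scheme
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> files = new HashSet<>();
        files.add("/tmp/a");
        files.add("/out/a");
        System.out.println(renameAvoidingCollision(files, "/tmp/a", "/out/a", 5));
    }
}
```

If rename started throwing where it used to return false, the retry loop would never reach its second attempt, which is the kind of breakage the compatibility concern is about.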

It is never the case that for improvements we just mark them incompatible and 
then side-line compatibility because someone feels it would be good. For bugs, 
yes, we do so if it is the last option, for which we have a flag telling sorry 
we had no choice.

Regarding the router part, Is the call going back to the router, since the path 
needs to be resolved with respect to the mount table, or is it resolving with 
respect to the Namenode itself?


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338945#comment-17338945
 ] 

Viraj Jasani commented on HDFS-15982:
-

Thanks for the questions [~ayushtkn]

All of the examples below use a positive value for the fs.trash.interval 
config (to enable trash through the core-site config).
{quote} * If my FileSystem is WebHdfs, If I call delete with recursive false on 
a non empty directory, will it delete now or throw me an exception? Should be a 
NO{quote}
{code:java}
$ curl -X DELETE "http://localhost:9870/webhdfs/v1/xyz?op=DELETE&recursive=false"
{"boolean":true}
{code}
{code:java}
$ curl -X DELETE "http://localhost:9870/webhdfs/v1/xyz?op=DELETE&recursive=false&skiptrash=true"
{"RemoteException":{"exception":"PathIsNotEmptyDirectoryException","javaClassName":"org.apache.hadoop.fs.PathIsNotEmptyDirectoryException","message":"`/xyz
 is non empty': Directory is not empty"}}
{code}
I believe for skiptrash true case, we should just return \{"boolean":false} ?

 
{quote} * if my FileSystem is WebHdfs, If I call delete on a non-existing file, 
the response will be false? or Now an exception.{quote}
{code:java}
$ curl -X DELETE "http://localhost:9870/webhdfs/v1/xyz1?op=DELETE&recursive=false"
{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
 does not exist: /xyz1"}}
{code}
{code:java}
$ curl -X DELETE "http://localhost:9870/webhdfs/v1/xyz1?op=DELETE&recursive=false&skiptrash=true"
 {"boolean":false}
{code}
Similarly here, for moving to trash case, we should return \{"boolean":false} ?
{quote} * How does this trash path resolution behaves when the client is coming 
through Router?{quote}
I think the resolution should be taken care of by FileSystem 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java#L1573-L1575].

For the above cases where Exceptions are being thrown instead of returning 
\{"boolean":false}, let me file a follow-up subtask soon once you confirm the 
expected behaviour [~ayushtkn].
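
For anyone reproducing these calls from code, a minimal sketch of assembling the DELETE URL is below. It assumes the recursive parameter and the skiptrash parameter proposed in this Jira; the class and method names are hypothetical, and the path is not URL-encoded here.

```java
// Hypothetical helper that assembles a WebHDFS DELETE URL as a string.
class WebHdfsDeleteUrl {

    // skipTrash may be null to leave the parameter out and take the server default.
    static String build(String nameNodeHttp, String path, boolean recursive,
            Boolean skipTrash) {
        StringBuilder sb = new StringBuilder(nameNodeHttp)
            .append("/webhdfs/v1")
            .append(path) // assumed to be already URL-safe
            .append("?op=DELETE&recursive=")
            .append(recursive);
        if (skipTrash != null) {
            sb.append("&skiptrash=").append(skipTrash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:9870", "/xyz", false, null));
        System.out.println(build("http://localhost:9870", "/xyz", false, true));
    }
}
```

Leaving skiptrash out exercises whatever default the server applies, which is exactly the value under discussion in this thread.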

Thanks


[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-05-04 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338912#comment-17338912
 ] 

Ayush Saxena commented on HDFS-15982:
-

Out of curiosity:
 * If my FileSystem is WebHdfs and I call delete with recursive false on a 
non-empty directory, will it delete now or throw an exception? Should be a NO
 * If my FileSystem is WebHdfs and I call delete on a non-existing file, will 
the response be false, or now an exception?
 * How does this trash path resolution behave when the client is coming 
through Router?

If any behaviour is changed with respect to WebHdfs only, it should be pulled 
up to FileSystem and extended to all FileSystems.

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, hdfs-client, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-27 Thread Bhavik Patel (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17333176#comment-17333176
 ] 

Bhavik Patel commented on HDFS-15982:
-

[~brahma] [~chaosun] [~aceric] [~daryn] can you please review this PR? 

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-27 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332992#comment-17332992
 ] 

Viraj Jasani commented on HDFS-15982:
-

[~aajisaka] [~ayushtkn] [~liuml07] [~tasanuma] Sorry for the wider ping. Since 
the 3.3.1 RC cut is going to happen very soon, could you please help review 
the PRs at your convenience:
 # trunk [PR|https://github.com/apache/hadoop/pull/2927]
 # branch-3.3 backport [PR|https://github.com/apache/hadoop/pull/2925] (trunk 
PR is cleanly applied to branch-3.3)

Thanks

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, httpfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-26 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331801#comment-17331801
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}[~vjasani] I think [~weichiu] is trying to tell like along with webhdfs 
API we have to also consider the behavior of httpfs(server) delete API.
{quote}
Oh I see, my bad. Yes, this is also taken care of 
[here|https://github.com/apache/hadoop/pull/2927/files#diff-b8d1575f2afb5b04a56f4a43eb8ed4387fbef8ebe51c4503052413a2887cf96bR747].

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-25 Thread Bhavik Patel (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331728#comment-17331728
 ] 

Bhavik Patel commented on HDFS-15982:
-


{quote}You will want to make sure to add a release note. Also, we should 
consider making the delete behavior of DistributedFileSystem consistent with 
webhdfs/httpfs (i.e. do not skip trash by default)
{quote}

[~vjasani] I think [~weichiu] is trying to say that, along with the webhdfs 
API, we also have to consider the behavior of the httpfs (server) delete API.

[~weichiu] As we are updating the current delete API with the skipTrash param, 
it will not break any existing functionality, but of course we have to mention 
it in the release note.
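
For illustration, a minimal sketch of how a client might build the WebHDFS 
DELETE request URL with the extra parameter. The /webhdfs/v1 prefix and the 
op=DELETE and recursive parameters are the standard WebHDFS REST form; the 
parameter spelling "skiptrash" is an assumption taken from how it is spelled 
elsewhere in this thread, and the class name, host and port are hypothetical. 
The path is assumed to be absolute and already URL-safe.

```java
import java.net.URI;

// Hypothetical helper; not part of Hadoop.
final class WebHdfsDeleteUrl {
    // skipTrash == null means "do not send the parameter, use the server default".
    static URI build(String host, int port, String path, boolean recursive, Boolean skipTrash) {
        StringBuilder query = new StringBuilder("op=DELETE&recursive=").append(recursive);
        if (skipTrash != null) {
            // Assumed parameter name, as spelled in this thread.
            query.append("&skiptrash=").append(skipTrash);
        }
        return URI.create("http://" + host + ":" + port + "/webhdfs/v1" + path + "?" + query);
    }
}
```

Leaving the parameter out keeps the request identical to what existing 
clients already send, so only the server-side default decides whether trash 
is used.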

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-25 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331561#comment-17331561
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}So – essentially we change the default behavior of webhdfs delete 
operation.
{quote}
Although this is correct, we are actually aligning webhdfs delete behaviour 
with the delete command. As far as the API endpoint is concerned, we are 
introducing a new query param (skiptrash), similar to the skipTrash argument 
of the delete command.
{quote}You will want to make sure to add a release note.
{quote}
Sure thing, since this is an incompatible change w.r.t. the DELETE API's 
default behaviour.
{quote}Also, we should consider making the delete behavior of 
DistributedFileSystem consistent with webhdfs/httpfs (i.e. do not skip trash by 
default)
{quote}
With this Jira, if fs.trash.interval > 0, the *default* behavior of both the 
*delete command* and the *delete http API* will be to move deleted data to 
trash until the trash interval expires (so far, only the delete command did 
this). Hence, the proposed changes are at the NamenodeWebHdfsMethods API 
endpoint level, and I am not sure whether we should make any changes at the 
server side, i.e. at the DistributedFileSystem#delete(Path f, boolean 
recursive) level. Sorry if I misunderstood your concern. But if you meant that 
we should not skip trash by default (whether we use the delete command or the 
delete http API endpoint), then yes, the changes are done appropriately.
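
The rule described in this comment can be sketched as a small decision 
function. This is only an illustration of the stated behaviour, not the code 
in the PR; the class, enum and method names are hypothetical. The one 
Hadoop-specific fact relied on is that fs.trash.interval is a number of 
minutes and that 0 disables trash.

```java
// Hypothetical sketch of the rule "move to trash unless trash is disabled or skipped".
final class TrashDecisionSketch {
    enum Action { MOVE_TO_TRASH, DELETE_PERMANENTLY }

    // trashIntervalMinutes mirrors fs.trash.interval; skipTrash mirrors the request flag.
    static Action decide(long trashIntervalMinutes, boolean skipTrash) {
        if (trashIntervalMinutes > 0 && !skipTrash) {
            return Action.MOVE_TO_TRASH;
        }
        return Action.DELETE_PERMANENTLY;
    }
}
```

With trash enabled and no flag given, the outcome is MOVE_TO_TRASH, which is 
the default change being discussed; every other combination deletes 
permanently, as the HTTP API did before.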

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-25 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331521#comment-17331521
 ] 

Wei-Chiu Chuang commented on HDFS-15982:


So – essentially we change the default behavior of webhdfs delete operation.

 

This is a big incompatible change. If we think this should be part of 3.4.0, 
risking our compatibility guarantee (which I think makes sense, given how many 
times I was involved in accidental data deletion), I think it can be part of 
3.3.1. We traditionally regard 3.3.0 as non-production ready, so making an 
incompat change in 3.3.1 probably is justifiable. 

 

You will want to make sure to add a release note. Also, we should consider 
making the delete behavior of DistributedFileSystem consistent with 
webhdfs/httpfs (i.e. do not skip trash by default)

 

Thoughts?

 

(BTW, thanks for involving me. I suspect it's going to break some of our 
applications/integration tests, so having a little more preparedness is good)

 

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs, webhdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-25 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331426#comment-17331426
 ] 

Viraj Jasani commented on HDFS-15982:
-

Thanks for the PR review [~liuml07]. If this is not too much of a change 
(specifically, not a brittle one), I was wondering whether we could include it 
in the upcoming 3.3.1 release (before [~weichiu] announces code freeze on 
branch-3.3, which is likely in the coming week after the hadoop-thirdparty 
release).

Thanks

FYI [~kpalanisamy] [~daryn] 

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-23 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330913#comment-17330913
 ] 

Mingliang Liu commented on HDFS-15982:
--

With the support of optional skipTrash, I think this makes more sense. As 
there are closely related reviews on [HDFS-14320], I will defer to 
[~kpalanisamy] [~weichiu] and [~daryn] for review/comments on how we can move 
forward.

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.






[jira] [Commented] (HDFS-15982) Deleted data using HTTP API should be saved to the trash

2021-04-23 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330398#comment-17330398
 ] 

Viraj Jasani commented on HDFS-15982:
-

{quote}Let's rename the JIRA subject and also PR by replacing "Web UI" with 
"HTTP API". HDFS "Web UI" is usually about the web portal that one can browse 
for information purpose. This JIRA is to change the "RESTful HTTP API", not 
about the Web UI.
{quote}
Thanks [~liuml07], yes the core changes are at the API level and hence I have 
updated the Jira as well as the PR titles. However, in the recent revision, I 
have added a "skipTrash" option to the API, which can be accessed through the 
UI. Please find attached screenshots.
{quote}My only concern about this is that, the "Trash" concept is not a part of 
the FileSystem DELETE API. Changing this behavior may break existing 
applications that assumes storage will be released.
{quote}
I understand, and hence there is no attempt to change the FileSystem DELETE 
API itself; rather, the changes are limited to the NamenodeWebHdfsMethods 
endpoint only.
{quote}It seems counter-intuitive that one can skipTrash from command line but 
can not using WebHDFS. Since keeping data in Trash for a while is usually a 
good idea, I think I'm fine with this feature proposal. Ideally we can expose 
-skipTrash parameter so users can choose.
{quote}
Thanks for this advice. Updated PR revision has this change.
{quote}When I explore I found [HDFS-14320] is all about the same idea and 
similar implementation. Do you guys want to post there and try with a 
collaboration to get this in? I did not look into that closely.
{quote}
Apologies, I was not aware of the existing Jira. Thanks for pointing this out. 
I have commented on that Jira as well. Since that Jira's last patch revision 
is more than 2 years old, I am not sure whether the patch is up to date or 
whether we will get an active response there.

 

!Screenshot 2021-04-23 at 4.19.42 PM.png!

> Deleted data using HTTP API should be saved to the trash
> 
>
> Key: HDFS-15982
> URL: https://issues.apache.org/jira/browse/HDFS-15982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Reporter: Bhavik Patel
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2021-04-23 at 4.19.42 PM.png, Screenshot 
> 2021-04-23 at 4.36.57 PM.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If we delete the data from the Web UI then it should be first moved to 
> configured/default Trash directory and after the trash interval time, it 
> should be removed. currently, data directly removed from the system[This 
> behavior should be the same as CLI cmd]
> This can be helpful when the user accidentally deletes data from the Web UI.
> Similarly we should provide "Skip Trash" option in HTTP API as well which 
> should be accessible through Web UI.


