[ https://issues.apache.org/jira/browse/HADOOP-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704648#comment-17704648 ]

Steve Loughran commented on HADOOP-18679:
-----------------------------------------




h2. Possible API: delete queue

the app creates a "DeleteOperation" from a filesystem which implements the 
DeleteOperationFactory interface; something like:

{code}
// obtain a builder from the filesystem (a DeleteOperationFactory)
DeleteOperationBuilder builder = fs.createDeleteOperation(basePath);
builder.opt("option.key", "value");   // set any options, FSBuilder-style
builder.progress(progressable);       // progress callback

// build() returns a queue you can submit files to delete to
DeleteOperation deleter = builder.build();

Future<DeleteOutcome> oneOutcome = deleter.deleteFile(path);
List<Future<DeleteOutcome>> outcomes = deleter.deleteFiles(paths);
{code}

The fs would build up pages of deletions and submit them in batches; once a 
page comes back, each of its outcomes would be completed.
If a store only supports single-file delete (third-party stores need this), 
you'd get a few at a time across a thread pool.
A normal store would do it in batches of 200, again across a thread pool, but 
maybe with some rate limiting: it's way too easy to overload S3 with big 
delete requests.
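
A minimal sketch of that paging logic, purely illustrative: PagedDeleter, 
submitBatch() and DeleteOutcome are assumed names here, not existing Hadoop 
APIs.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

import org.apache.hadoop.fs.Path;

/** Placeholder for the proposed per-file outcome type. */
class DeleteOutcome { }

/** Sketch of store-side paging: buffer paths, ship full pages to a pool. */
class PagedDeleter {
  private final int pageSize;          // e.g. 200 for S3, 1 for single-delete stores
  private final ExecutorService pool;  // bounded pool; rate limiting would live here too
  private final List<Path> page = new ArrayList<>();
  private final List<CompletableFuture<DeleteOutcome>> pageFutures = new ArrayList<>();

  PagedDeleter(int pageSize, ExecutorService pool) {
    this.pageSize = pageSize;
    this.pool = pool;
  }

  synchronized CompletableFuture<DeleteOutcome> deleteFile(Path path) {
    CompletableFuture<DeleteOutcome> outcome = new CompletableFuture<>();
    page.add(path);
    pageFutures.add(outcome);
    if (page.size() >= pageSize) {
      // hand the full page to the pool as one bulk request
      final List<Path> batch = new ArrayList<>(page);
      final List<CompletableFuture<DeleteOutcome>> outcomes =
          new ArrayList<>(pageFutures);
      page.clear();
      pageFutures.clear();
      pool.submit(() -> submitBatch(batch, outcomes));
    }
    return outcome;
  }

  /** Issue one bulk DELETE for the batch, then complete each matching future. */
  private void submitBatch(List<Path> batch,
      List<CompletableFuture<DeleteOutcome>> outcomes) {
    // store-specific: build a DeleteRequest from the batch, POST it,
    // then outcomes.get(i).complete(...) for each entry
  }
}
{code}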

The caller would get to wait for the outcome of every request; a failure of a 
single delete wouldn't halt the remaining deletions, though there'd be some 
other methods on the DeleteOperation:

{code}
class DeleteOperation implements Closeable, IOStatisticsSource

void flush()              // wait for everything queued to complete
boolean cancel(Path path) // remove a path from the queue if not already active
void abort()              // cancel all not-yet-submitted deletions
void close()              // flush() then stop, unless abort() was called first
int size()                // current queue size
int pageSize()            // size of a page before a POST; 1 -> single DELETE mode
{code}
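
A sketch of the intended caller lifecycle, assuming the builder above; with 
try-with-resources, close() gives the flush-then-stop behaviour for free:

{code}
// sketch: caller-side lifecycle for the proposed API; exception handling elided
try (DeleteOperation deleter = fs.createDeleteOperation(basePath).build()) {
  List<Future<DeleteOutcome>> outcomes = deleter.deleteFiles(paths);
  deleter.flush();                    // wait for everything queued to complete
  for (Future<DeleteOutcome> f : outcomes) {
    DeleteOutcome outcome = f.get();  // failures surface here, not mid-queue
  }
}  // close(): flush() then stop, as abort() was never called
{code}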



s3a would handle retries; permission failures would be reported in the outcome.
The IOStatistics API would have stats on the IO performed (requests made, 
duration), and in close() it'd update the thread context iostats.
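
For example (IOStatisticsLogging is the existing hadoop-common helper; 
getIOStatistics() comes from the proposed IOStatisticsSource implementation):

{code}
import static org.apache.hadoop.fs.statistics.IOStatisticsLogging.ioStatisticsToPrettyString;

// after close(), dump the statistics the operation gathered
LOG.info("bulk delete statistics:\n{}",
    ioStatisticsToPrettyString(deleter.getIOStatistics()));
{code}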

Options would include:
* whether to abort on first failure

This is fairly close to what we do in directory delete, though there we also 
queue tombstone markers (paths which end in "/") and abort the delete as soon 
as one page of deletes fails. We could make fail-fast an opt() option, perhaps.
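
That would follow the existing FSBuilder opt() pattern; the option key below 
is made up for illustration:

{code}
// hypothetical option key, for illustration only
DeleteOperation deleter = fs.createDeleteOperation(basePath)
    .opt("fs.option.delete.fail.fast", true)
    .build();
{code}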

Note, this doesn't take RemoteIterator<>, which is a pity; you'd want that for 
wiring up incremental listings.

For that we'd not be able to return a list of Future<>s, as the list length 
isn't known at submission time; that implies a different way of reporting 
outcomes, where the key question is probably "did this fail?". Providing a 
predicate/callback to invoke per outcome would be one strategy.

It'd let you do good things like:

{code}
// delete all zero-byte files under the table path, logging any failures
deleter.deleteFiles(
    filteringRemoteIterator(
        fs.listFiles(table, true),
        st -> st.getLen() == 0),
    outcome -> {
      if (outcome.failed()) {
        LOG.info("failed to delete {}", outcome.getPath());
      }
    });
{code}
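
The iterator form might then look something like this; the signature is 
speculative:

{code}
// speculative: iterator-based submission with a per-outcome callback,
// since a list of futures can't be returned for an unbounded iterator
void deleteFiles(
    RemoteIterator<FileStatus> files,
    Consumer<DeleteOutcome> outcomeHandler) throws IOException;
{code}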

  


> Add API for bulk/paged object deletion
> --------------------------------------
>
>                 Key: HADOOP-18679
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18679
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.5
>            Reporter: Steve Loughran
>            Priority: Major
>
> iceberg and hbase could benefit from being able to give a list of individual 
> files to delete - files which may be scattered around the bucket - for better 
> read performance. 
> Add some new optional interface for an object store which allows a caller to 
> submit a list of paths to files to delete, where
> the expectation is
> * if a path is a file: delete
> * if a path is a dir, outcome undefined
> For s3 that'd let us build these into DeleteRequest objects, and submit, 
> without any probes first.


