Hi Pifta,
The problems with renames and deletes of multiple files will be fixed
via HDDS-2939.
There is a design doc attached to the jira which describes the problem.
Another follow-up question: was there any significant performance
difference between o3fs and ofs?
Thanks,
Mukul
On 28/05/20 7:15 pm, István Fajth wrote:
Hello everyone,
recently I have been testing the o3fs/ofs implementation with Hive, among
some other things as well.
I have run into a few surprisingly slow operations and some interesting
file system states during data preparation, all of which seem to be real
problems in at least some of the data loading scenarios. Let's go through
them one by one:
1. When you load data into Hive by copying a data source to Ozone and using
it as an external table to load another table, with either a create table
as select or an insert based on a select, Hive writes temporary files
first, and then renames the temporary data folder 3-4 times on its way to
the final location (which I think is a problem in itself). If the table
is partitioned into a lot of files, this can get extreme... (In one run,
it took 4 hours to get through this stage for some tables.)
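The cost of point 1 can be illustrated with the flat key layout: a minimal, self-contained sketch, assuming (as in the pre-FSO OM layout) that a "directory" is only a shared key prefix, so renaming it means rewriting every key underneath. The class name and paths here are made up for illustration, not real Ozone code:

```java
import java.util.*;

// Toy model of a flat key-space namespace: a "directory" exists only as a
// shared key prefix, so a directory rename is O(n) key rewrites, not one
// atomic metadata swap.
public class PrefixRename {
    private final TreeMap<String, String> keys = new TreeMap<>();

    void put(String key) { keys.put(key, ""); }

    // Rename a "directory" by rewriting each key under it; returns how many
    // keys this single rename had to touch.
    int rename(String oldDir, String newDir) {
        String op = oldDir + "/", np = newDir + "/";
        int moved = 0;
        for (String k : new ArrayList<>(keys.keySet())) {
            if (k.startsWith(op)) {
                keys.put(np + k.substring(op.length()), keys.remove(k));
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        PrefixRename fs = new PrefixRename();
        int parts = 10_000;
        for (int i = 0; i < parts; i++) fs.put("/table/.hive-staging/part-" + i);

        // Hive's repeated staging renames each touch every key again.
        int ops = 0;
        ops += fs.rename("/table/.hive-staging", "/table/stage-1");
        ops += fs.rename("/table/stage-1", "/table/stage-2");
        ops += fs.rename("/table/stage-2", "/table/final");
        System.out.println("key rewrites for 3 renames of " + parts
                + " files: " + ops); // 30000
    }
}
```

So each rename pass over a heavily partitioned table multiplies into tens of thousands of metadata operations, which would be consistent with the hours-long stage above.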
2. When you have a folder with a lot of files in it, deleting the folder
(by dropping the table from beeline, or deleting it via rm) blocks the
client, and the request does not get a response until the blocks are
actually being deleted (or perhaps until the last batch is processed by
SCM), as far as I can tell. A folder that took 4 hours of Hive renames to
create was deleted in about 30 minutes.
3. During the rename of a folder that contains a large number of files, the
filesystem is in an interesting state for other clients. An ls -R running
on the parent of the folder being renamed throws a FileNotFoundException
for the path. So listStatus seems to include the path being renamed,
while getting the status of, or accessing, that same path throws the
FNFException.
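The FNFException in point 3 can be modelled as a snapshot race: the recursive listing enumerates children before the rename finishes, and by the time it stats a child, all the keys have moved. This is a toy sketch under the same flat-prefix assumption as above, not the real OM code; the names are illustrative:

```java
import java.io.FileNotFoundException;
import java.util.*;
import java.util.stream.Collectors;

// Toy flat key-space namespace where directory entries are derived from key
// prefixes. A listing snapshot taken before a long, non-atomic rename
// completes can name a child whose stat() then fails, mimicking ls -R.
public class RenameRace {
    private final TreeMap<String, byte[]> keys = new TreeMap<>();

    void put(String key) { keys.put(key, new byte[0]); }

    // Child paths directly under dir, derived from key prefixes.
    Set<String> listChildren(String dir) {
        String p = dir + "/";
        return keys.keySet().stream()
                .filter(k -> k.startsWith(p))
                .map(k -> p + k.substring(p.length()).split("/", 2)[0])
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // A path "exists" only while some key still carries its prefix.
    void stat(String path) throws FileNotFoundException {
        String p = path + "/";
        boolean exists = keys.keySet().stream()
                .anyMatch(k -> k.startsWith(p) || k.equals(path));
        if (!exists) throw new FileNotFoundException(path);
    }

    // Non-atomic rename: rewrite each key under oldDir one by one.
    int rename(String oldDir, String newDir) {
        String op = oldDir + "/", np = newDir + "/";
        int moved = 0;
        for (String k : new ArrayList<>(keys.keySet())) {
            if (k.startsWith(op)) {
                keys.put(np + k.substring(op.length()), keys.remove(k));
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        RenameRace fs = new RenameRace();
        for (int i = 0; i < 1000; i++) fs.put("/warehouse/tmp/part-" + i);

        // A recursive listing first snapshots the children of /warehouse ...
        Set<String> snapshot = fs.listChildren("/warehouse");

        // ... then the rename rewrites all the keys before the lister recurses.
        int moved = fs.rename("/warehouse/tmp", "/warehouse/final");
        System.out.println("keys rewritten by one rename: " + moved); // 1000

        // stat() on the listed child now fails, as observed with ls -R.
        for (String child : snapshot) {
            try {
                fs.stat(child);
            } catch (FileNotFoundException e) {
                System.out.println("FileNotFoundException: " + e.getMessage());
            }
        }
    }
}
```

In the real system the rename runs concurrently rather than between two client calls, but the visible effect is the same: the listing and the per-path status checks see different points in the key-by-key rewrite.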
For 1 and 2, there is HDDS-1301, which aimed to optimise these APIs in
OM, though the patch never got committed, as it is not clear what load
this would put on OM, and how long it would lock things there. I don't
think there is an easy solution to the problem with the current
architecture, but I would like to kick off a discussion in the community.
Also, if documents about possible solutions already exist, I would be
very happy to look into them.
For 3, at first sight one might think that we should at least change the
listStatus API to not include folders being renamed in the response, as
they cannot be accessed at that moment. This is problematic though: if
someone tried to create the same folder during the rename, the create
would fail because the path already exists, yet the folder would not be
usable after such a failure, and the path would not exist once the
rename finishes... This is a problem today as well, but I haven't seen it
cause trouble so far. So... there might not be a good solution for this
unless renames somehow become atomic in our filesystem implementations,
which goes back to the previous point.
I would like to hear your opinions on this, as I am hesitant and cannot
decide whether we should do anything about this problem. This may have
been discussed earlier, with a definite answer I am unaware of; that is
fine with me as well if someone can share it, or point me to a design doc
that covers our approach to this behaviour.
---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org