Do you have security enabled? We did some preliminary benchmarks around webhdfs (I really want to revisit them), and with security enabled, a lot of the overhead is between the client and the KDC (SPNEGO). Running webhdfs with delegation tokens should help remove that bottleneck.
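For reference, here is a rough sketch of what the delegation-token flow could look like against the WebHDFS REST API. The idea is to do one SPNEGO-authenticated GETDELEGATIONTOKEN call, then pass the returned token in the `delegation` query parameter on every later request so the client does not negotiate per file. The host name, port, renewer, and function names below are placeholders I made up for illustration; the sketch only builds URLs and parses the token response, and the HTTP calls themselves (for example `curl --negotiate -u :` for the first request) are left to whatever client you already use.

```python
import json
from urllib.parse import quote, urlencode


def token_request_url(base_url, renewer):
    """Build the URL for the one-time, SPNEGO-authenticated token request."""
    query = urlencode({"op": "GETDELEGATIONTOKEN", "renewer": renewer})
    return f"{base_url}/webhdfs/v1/?{query}"


def parse_token(response_body):
    """Extract the token string from the GETDELEGATIONTOKEN JSON response."""
    return json.loads(response_body)["Token"]["urlString"]


def open_url(base_url, hdfs_path, token):
    """Build an OPEN URL that authenticates with the token instead of SPNEGO."""
    query = urlencode({"op": "OPEN", "delegation": token})
    return f"{base_url}/webhdfs/v1{quote(hdfs_path)}?{query}"


if __name__ == "__main__":
    # Placeholder NameNode address; adjust to your cluster.
    base = "http://namenode.example.com:9870"
    print(token_request_url(base, "backup"))
    # Stand-in response body with the shape WebHDFS returns.
    token = parse_token('{"Token": {"urlString": "EXAMPLE_TOKEN"}}')
    print(open_url(base, "/data/part-00000", token))
```

The token is fetched once and reused across all of the small-file reads, which is where the savings would come from if the KDC round trips are the bottleneck.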

On Sat, Oct 8, 2022 at 8:26 PM Abhishek <ahk12...@gmail.com> wrote:
> Hi,
> We want to backup large no of hadoop small files (~1mn) with webhdfs API
> We are getting a performance bottleneck here and it's taking days to back
> it up.
> Anyone know any solution where performance could be improved using any xml
> settings?
> This would really help us.
> v 3.1.1
>
> Appreciate your help !!
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> *Abhishek...*