Re: job history server
2020-02-18 09:44:45,227 ERROR org.apache.flink.runtime.webmonitor.history.HistoryServerArchiveFetcher - Failure while fetching/processing job archive for job eaf0639027aca1624adaa100bdf1332e.
java.nio.file.FileSystemException: /dev/shm/flink-history-server/jobs/eaf0639027aca1624adaa100bdf1332e/vertices/062e4d80ed1d4bdafd24e462245c5926/subtasks/86/attempts/0.json: No space left on device

and there it is:

42103b5b-5410-d2d8-6a0b-21757e4a0fbc ~ 0 % df -iH
Filesystem                 Inodes IUsed IFree IUse% Mounted on
/dev/mapper/vg00-rootlv00    132k   13k  119k   10% /
tmpfs                        508k  465k   43k   92% /dev/shm

Thanks for the tip.
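Since every file and directory under /dev/shm/flink-history-server takes one inode, counting entries per subdirectory shows where the inodes go. The following is a minimal sketch, assuming Python 3 on the History Server host. The function names `count_entries` and `entries_per_child` are made up for illustration and are not part of Flink. Hard links would be counted more than once, so treat the result as an approximation.

```python
import os


def count_entries(root):
    """Count files and directories below root, including root itself.

    Each entry normally corresponds to one inode. Symlinked directories
    are counted once and not descended into (os.walk default).
    """
    total = 1  # root itself
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def entries_per_child(root):
    """Map each direct child of root to the number of entries it holds."""
    counts = {}
    with os.scandir(root) as it:
        for entry in it:
            if entry.is_dir(follow_symlinks=False):
                counts[entry.name] = count_entries(entry.path)
            else:
                counts[entry.name] = 1
    return counts


if __name__ == "__main__":
    # Path taken from the log message above; adjust to your setup.
    jobs_dir = "/dev/shm/flink-history-server/jobs"
    if os.path.isdir(jobs_dir):
        per_job = entries_per_child(jobs_dir)
        # Print the ten largest job directories by entry count.
        for name, n in sorted(per_job.items(), key=lambda kv: -kv[1])[:10]:
            print(f"{n:>8}  {name}")
```

If a few job directories account for most of the entries, removing those archives frees inodes without wiping the whole directory.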
Re: job history server
I did not know that.

I have since wiped the directory. I will post when I see this error again.
Re: job history server
`df -H` only reports sizes, not inode usage. Could you also show us the output of `df -iH`?

--
Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn
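A filesystem can report "No space left on device" when it runs out of inodes even though plenty of bytes are free, which is why the two `df` views can disagree. The same two numbers can be read programmatically. The following is a minimal sketch, assuming Python 3 on a POSIX system. `fraction_used` and `usage` are illustrative names, not part of Flink or `df`, and the percentages may differ slightly from `df` because `df` rounds up and accounts for reserved blocks.

```python
import os


def fraction_used(total, free):
    """Return the used fraction of a resource, or None if total is 0."""
    if total == 0:
        return None  # some filesystems do not report a fixed total
    return (total - free) / total


def usage(path):
    """Return (block_fraction, inode_fraction) for the filesystem at path."""
    st = os.statvfs(path)
    blocks = fraction_used(st.f_blocks, st.f_bfree)  # what `df -H` shows
    inodes = fraction_used(st.f_files, st.f_ffree)   # what `df -iH` shows
    return blocks, inodes


if __name__ == "__main__":
    for mount in ("/", "/dev/shm"):
        if os.path.exists(mount):
            blocks, inodes = usage(mount)
            print(mount, "blocks:", blocks, "inodes:", inodes)
```

A low block fraction combined with an inode fraction close to 1.0 points at many small files rather than a lack of bytes.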
Re: job history server
Yes, I did. I mentioned it last but I should have been clearer:

22526:~/ $ df -H                                                   [18:15:20]
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/vg00-rootlv00  2.1G  777M  1.2G  41% /
tmpfs                      2.1G  753M  1.4G  37% /dev/shm
Re: job history server
Hi Richard,

Have you checked whether the disk partition has run out of inodes?

--
Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn
job history server
I see the following exception often:

2020-02-17 18:13:26,796 ERROR org.apache.flink.runtime.webmonitor.history.HistoryServerArchiveFetcher - Failure while fetching/processing job archive for job eaf0639027aca1624adaa100bdf1332e.
java.nio.file.FileSystemException: /dev/shm/flink-history-server/jobs/eaf0639027aca1624adaa100bdf1332e/vertices/6abf3ed37d1a5e48f2786b832033f074/subtasks/86/attempts: No space left on device
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
    at java.nio.file.Files.createDirectory(Files.java:674)
    at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
    at java.nio.file.Files.createDirectories(Files.java:767)
    at org.apache.flink.runtime.webmonitor.history.HistoryServerArchiveFetcher$JobArchiveFetcherTask.run(HistoryServerArchiveFetcher.java:186)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Unfortunately the partition listed does not appear to be full or anywhere near full?

Is there a workaround to this?