Bug#522726: kernel problem after a simple 'rm' command: RESERVE_SPACE(805) failed in function encode_lookup
Hi Ben,

No, I haven't had a chance to check whether the bug exists in a newer version. We changed our NFS server from Linux to OpenSolaris.

But it was a major problem. It recurred every time a user attempted a filesystem operation where the filename was very long (e.g. 500 characters). Any fs write operation (rm, create new file) would cause the kernel panic.

The crash happened several times a year. In all cases it was when someone accidentally passed data instead of a filename to a piece of code that expects filenames.

Alex

On Mon, Apr 5, 2010 at 3:45 PM, Ben Hutchings wrote:
> On Sun, 2009-04-05 at 23:08 -0700, Aleksandr Levchuk wrote:
>> Package: nfs-kernel-server
>> Version: 1:1.0.10-6+etch.1
>> Severity: important
>>
>> My very stable server crashed as a result of a 'rm' command in an
>> NFS-mounted home directory. The 'rm' command was a file name (with
>> newlines) but that file did not exist.
> [...]
>
> Sorry for the delay in replying to this. The nfs-kernel-server package
> only contains supporting scripts, but the bug is clearly in the kernel
> itself (linux-image-* packages).
>
> The system you reported this bug from was apparently running Linux
> 2.6.22. I assume that is the same version in which you saw this bug.
> Have you seen the bug reoccur in any more recent kernel version?
>
> Ben.
>
> --
> Ben Hutchings
> Once a job is fouled up, anything done to improve it makes it worse.
>

--
Aleksandr Levchuk
Administrator of Bioinformatic Systems and Databases
Homepage: http://biocluster.ucr.edu/~alevchuk/
Cell Phone: (951) 368-0004
Lab Phone: (951) 905-5232
Institute for Integrative Genome Biology
University of California, Riverside
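As an illustration of the failure mode described above (data reaching code that expects a filename), a caller could screen arguments before attempting any filesystem operation. The following is a minimal Python sketch and is not part of the original report; the helper name `plausible_filename` and the 255-byte limit (NAME_MAX on common Linux filesystems) are assumptions. It does not fix the kernel bug; it only avoids handing over-long names to the filesystem layer.

```python
import os

# Assumption: 255 bytes is the per-component limit (NAME_MAX) on common
# Linux filesystems; this value is not taken from the bug report.
NAME_MAX_BYTES = 255


def plausible_filename(candidate: str, limit: int = NAME_MAX_BYTES) -> bool:
    """Return True if `candidate` could be a single path component.

    Hypothetical helper for illustration only.
    """
    encoded = os.fsencode(candidate)
    # Reject empty names and names longer than the assumed limit.
    if not encoded or len(encoded) > limit:
        return False
    # NUL and '/' can never appear inside one path component.
    if b"\x00" in encoded or b"/" in encoded:
        return False
    # Newlines are legal in names, but this sketch treats them as a sign
    # that data was passed by mistake (a policy choice, not a rule).
    return b"\n" not in encoded


if __name__ == "__main__":
    # A pasted multi-line listing, similar in shape to the one in the report.
    pasted = "\n".join(["source_fasta_example_downloaded-2009-04-04"] * 14)
    print(plausible_filename(pasted))       # False
    print(plausible_filename("notes.txt"))  # True
```

A check like this belongs in the script or tool that builds the argument list, before calling `rm` or an equivalent unlink operation.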
Bug#522726: kernel problem after a simple 'rm' command: RESERVE_SPACE(805) failed in function encode_lookup
On Sun, 2009-04-05 at 23:08 -0700, Aleksandr Levchuk wrote:
> Package: nfs-kernel-server
> Version: 1:1.0.10-6+etch.1
> Severity: important
>
> My very stable server crashed as a result of a 'rm' command in an
> NFS-mounted home directory. The 'rm' command was a file name (with
> newlines) but that file did not exist.
[...]

Sorry for the delay in replying to this. The nfs-kernel-server package
only contains supporting scripts, but the bug is clearly in the kernel
itself (linux-image-* packages).

The system you reported this bug from was apparently running Linux
2.6.22. I assume that is the same version in which you saw this bug.
Have you seen the bug reoccur in any more recent kernel version?

Ben.

--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
Bug#522726: kernel problem after a simple 'rm' command: RESERVE_SPACE(805) failed in function encode_lookup
Package: nfs-kernel-server
Version: 1:1.0.10-6+etch.1
Severity: important

My very stable server crashed as a result of a 'rm' command in an
NFS-mounted home directory. The 'rm' command was a file name (with
newlines) but that file did not exist.

The NFS client and the NFS server were the same machine.

Surprisingly, this caused a big problem inside the kernel - the stack trace shows a large number of NFS system calls.

Here is what I did and what I got in response:

alevc...@biocluster:~/.html/cellwall$ rm 'source_fasta_tair-v20080412-seq---_downloaded-2009-04-04
> source_fasta_tair-v20080412-pep---_downloaded-2009-04-04
> source_fasta_tair-v20080412-cds---_downloaded-2009-04-04
> source_fasta_tair-v20080412-cdna--_downloaded-2009-04-04
> source_fasta_tair-v20080229-igenic_downloaded-2009-04-04
> source_fasta_tair-v20080228-intron_downloaded-2009-04-04
> source_fasta_tigr-v6-0-all-seq_downloaded-2009-04-04
> source_fasta_tigr-v6-0-all-pep_downloaded-2009-04-04
> source_fasta_jgi-poptr-v1-1_prot--_downloaded-2009-04-04
> source_fasta_jgi-phypa-v1-1_trans-_downloaded-2009-04-04
> source_fasta_jgi-phypa-v1-1_prot--_downloaded-2009-04-04
> source_fasta_uniprot-v14-9-_tremb-_downloaded-2009-04-04
> source_fasta_uniprot-v14-9-_sprot-_downloaded-2009-04-04
> source_fasta_jgi-poptr-v1-1_trans-_downloaded-2009-04-04'
Segmentation fault

Message from sysl...@biocluster at Sat Apr 4 23:06:56 2009 ...
biocluster kernel: [ cut here ]

Message from sysl...@biocluster at Sat Apr 4 23:06:56 2009 ...
biocluster kernel: invalid opcode: [1] SMP

Message from sysl...@biocluster at Sat Apr 4 23:06:56 2009 ...
biocluster kernel: invalid opcode: [1] SMP

Message from sysl...@biocluster at Sat Apr 4 23:06:56 2009 ...
biocluster kernel: [ cut here ]

Here is what /var/log/messages showed immediately after:

Apr 4 22:39:40 biocluster -- MARK --
Apr 4 22:59:40 biocluster -- MARK --
Apr 4 23:06:56 biocluster kernel: RESERVE_SPACE(805) failed in function encode_lookup
Apr 4 23:06:56 biocluster kernel: CPU 15
Apr 4 23:06:56 biocluster kernel: Modules linked in: tcp_diag inet_diag nfsd exportfs button ac battery autofs4 ib_ipoib ipv6 nfs lockd nfs_acl sunrpc quota_v1 ext2 ext3 jbd mbcache dm_snapshot dm_mirror dm_mod qla2xxx mppVhba mppUpper sg rdma_ucm rdma_cm ib_cm iw_cm ib_sa ib_addr ib_umad ib_ipath ib_uverbs mlx4_ib ib_mad ib_core loop psmouse serio_raw i2c_i801 i2c_core shpchp pci_hotplug pcspkr mlx4_core igb evdev xfs ide_cd cdrom ata_generic sd_mod ata_piix libata piix generic ide_core ehci_hcd uhci_hcd firmware_class scsi_transport_fc mptsas mptscsih mptbase e1000 scsi_transport_sas scsi_mod thermal processor fan
Apr 4 23:06:56 biocluster kernel: Pid: 12459, comm: rm Not tainted 2.6.22-3-amd64 #1
Apr 4 23:06:56 biocluster kernel: RIP: 0010:[] [] :nfs:encode_lookup+0x34/0x5c
Apr 4 23:06:56 biocluster kernel: RSP: 0018:81053e8b38d8 EFLAGS: 00010292
Apr 4 23:06:56 biocluster kernel: RAX: 0037 RBX: 031d RCX: 804afd28
Apr 4 23:06:56 biocluster kernel: RDX: 804afd28 RSI: 0092 RDI: 804afd20
Apr 4 23:06:56 biocluster kernel: RBP: 0325 R08: 804afd28 R09:
Apr 4 23:06:56 biocluster kernel: R10: 0046 R11: 8100010ceb40 R12: 81070967edb0
Apr 4 23:06:56 biocluster kernel: R13: 810e2c4343a8 R14: 88408091 R15: 81070967edb0
Apr 4 23:06:56 biocluster kernel: FS: 2b5b8bc496e0() GS:810f0463a6c0() knlGS:
Apr 4 23:06:56 biocluster kernel: CS: 0010 DS: ES: CR0: 8005003b
Apr 4 23:06:56 biocluster kernel: CR2: 00403940 CR3: 000b7e1ee000 CR4: 06e0
Apr 4 23:06:56 biocluster kernel: Process rm (pid: 12459, threadinfo 81053e8b2000, task 810c73dad020)
Apr 4 23:06:56 biocluster kernel: Stack: 810e2c4343a8 81053e8b3a38 81063849b884 884080f3
Apr 4 23:06:56 biocluster kernel: 81063849b8ac 810e2c4343b0 81063849ba38 810e2c4343b0
Apr 4 23:06:56 biocluster kernel: 0004 81063849b884
Apr 4 23:06:56 biocluster kernel: Call Trace:
Apr 4 23:06:56 biocluster kernel: [] :nfs:nfs4_xdr_enc_lookup+0x62/0x85
Apr 4 23:06:56 biocluster kernel: [] :sunrpc:call_transmit+0x1c1/0x22d
Apr 4 23:06:56 biocluster kernel: [] :sunrpc:__rpc_execute+0x7d/0x234
Apr 4 23:06:56 biocluster kernel: [] :sunrpc:rpc_call_sync+0x75/0x9c
Apr 4 23:06:56 biocluster kernel: [] touch_atime+0xbe/0x101
Apr 4 23:06:56 biocluster kernel: [] :nfs:nfs4_proc_lookup+0xe5/0x25c
Apr 4 23:06:56 biocluster kernel: [] get_page_from_freelist+0x363/0x4de
Apr 4 23:06:56 biocluster kernel: [] :nfs:nfs_lookup+0xf6/0x262
Apr 4 23:06:56 biocluster kernel: [] do_lookup+0x63/0x1ae
Apr 4 23:06:56 biocluster kernel: [] dput+0x1c/0x10b
Apr 4 23:06:56 biocluster kernel: [] current_fs_time+0x3b/0x40
Apr 4 23:06:5