Re: Is fdisk broken?
On Fri, 22 Mar 2013 mla_str...@att.net wrote:

I recently bought a 4 TB usb disk drive and discovered that it reported a sector size of 4096 bytes instead of the traditional 512 bytes. This is apparently necessary because there may be a 32 bit sector number field somewhere in the usb mass storage protocols. It turns out that disk drive manufacturers have been producing disks with large sector sizes for some years now. The feature goes by the name Advanced Format and other things. Look it up in Wikipedia. FreeBSD seems to use the sector size information when interpreting MBR partition offsets and sizes. Unfortunately, when I try to use fdisk to print out the partition table on my new disk drive, fdisk just says

fdisk: could not detect sector size

It has the following gratuitous breakage at 2K for its probe of the sector size:

#define MAX_SEC_SIZE 2048 /* maximum section size that is supported */
#define MIN_SEC_SIZE 512 /* the sector size to start sensing at */

I used 64K for the probe maximum limit when I fixed fsck_msdosfs (fsck_msdosfs doesn't have a probe and only supports sector sizes of 512 in -current). Most file systems in FreeBSD have gratuitous limits on the size in their probe for their superblock, but the limit is mostly larger than 4K. Most of them don't need to know the sector size and don't have a probe, but they read a fixed size that is larger than their superblock size, so they fail if this size is smaller than the sector size.

Otherwise the MBR partition table seems to work correctly and newfs seems to have done the right thing. (It made the file system fragment size a multiple of the sector size and I am not getting any weird error messages out of the disk driver.) It would be nice if fdisk also worked. I do have to share the disk with other operating systems that might not understand other partition table schemes. Is my analysis of what is going on essentially correct? Can fdisk be made happy again? (At least for a few more years?)
Changing the above should fix fdisk for FreeBSD. A sector size of 4K gives a limit of 16TB for the partition table data structure, which is enough for a few more years with single disks. After that, double the sector size to 8K to work for another year or two. However, to share the disk you need all the other operating systems and BIOS to agree that _this_ partition table scheme (with units of 4K sectors) is what the partition table records. Bruce ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
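The 16TB figure follows from MBR partition entries recording the start and length as 32-bit sector counts. A minimal sketch of that arithmetic, assuming a 32-bit count; `mbr_limit_bytes` and `MBR_MAX_SECTORS` are hypothetical names for illustration and are not part of fdisk:

```c
#include <stdint.h>

/* Assumption: an MBR entry can count at most 2^32 sectors. */
#define MBR_MAX_SECTORS (UINT64_C(1) << 32)

/* Largest byte range one MBR entry can describe for a given sector size. */
static uint64_t
mbr_limit_bytes(uint32_t sector_size)
{
	return MBR_MAX_SECTORS * sector_size;
}
```

Under that assumption, 512-byte sectors give 2TB, 4K sectors give 16TB, and doubling to 8K sectors gives 32TB.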
Re: kernel profiling: spinlock_exit consumes 36% CPU time.
On Tue, 7 Oct 2008, John Baldwin wrote:

On Tuesday 07 October 2008 07:44:00 am wrote:

Hi, folks, I did kernel profiling when a single thread client sends UDP packets to a single thread server on the same machine. In the output kernel profile, the first few kernel functions that consume the most CPU time are listed below:

granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds

  %   cumulative     self                self    total
 time    seconds  seconds     calls   ms/call  ms/call  name
 42.4      10.88    10.88         0   100.00%           __mcount [1]
 36.1      20.14     9.26  17937541      0.00     0.00  spinlock_exit [4]
  4.2      21.22     1.08   3145728      0.00     0.00  in_cksum_skip [40]
  1.8      21.68     0.45   7351987      0.00     0.00  generic_copyin [43]
  1.1      21.96     0.29   3146028      0.00     0.00  generic_copyout [48]
  1.0      22.21     0.24   2108904      0.00     0.00  Xint0x80_syscall [3]
  0.8      22.42     0.21   6292131      0.00     0.00  uma_zalloc_arg [46]
  0.8      22.62     0.20   1048576      0.00     0.00  soreceive_generic [9]

It is very strange that spinlock_exit consumes over 36% CPU time while it seems a very simple function.

It's because the intr_restore() re-enables interrupts and the resulting time spent executing the handlers for any pending interrupts is attributed to spinlock_exit().

This is one of many defects that are not present in high resolution kernel profiling (kgmon -B instead of kgmon -b; available on amd64 and i386). However, high resolution kernel profiling doesn't work right with SMP, and was completely broken by gcc-4. Ordinary profiling was less completely broken by gcc-4, and you can recover the old behaviour by turning off new optimizations (mainly -funit-at-a-time and/or -finline-functions-called-once and/or all of -O2).

Bruce
Re: Realtek 8111B LAN Chipset
On Sat, 19 Jan 2008, Greg Mars wrote:

I'm buying parts for a computer and want to make sure that the core components are as freebsd friendly as possible. So far, I've decided on a core 2 quad q6600 and I'm choosing the motherboard now.

Me2 (unless I wait for a newer generation of CPUs).

However it seems many of the popular motherboards have Realtek ALC888 as built-in audio and Realtek 8111B as built-in LAN. I read at: http://www.freebsd.org/relnotes/CURRENT/hardware/i386/article.html that the sound should work but I couldn't find any info on the LAN. Does anyone on the list have any experience with it? By the way, I'm going to run FreeBSD 7.

I also want a cheap PCI/e NIC that works well with drivers back to FreeBSD-4 like my plain PCI bge and em NICs do. I doubt that any popular motherboard will have anything better than a cheap PCI/e NIC.

Bruce
Re: (S)ATA performance in FBSD 6.2/7.0
On Fri, 2 Mar 2007, Brooks Davis wrote:

Also, you should time the actual copy and do the math to verify that vmstat is actually producing valid results. It's possible there's a bug in vmstat or the underlying statistics it uses.

There is certainly a bug in the underlying statistics. For ATA disks, at least with the ata driver, the maximum transfer size in DMA mode is 64K, so any reports of a block size of 128K for SATA disks are wrong. The block size of 128K reported by vmstat is actually a virtual size. For most or all types of disks, the GEOM layer virtualizes the physical maximum size MAXPHYS = 128K so that layers above GEOM including statistics gathering and file systems cannot see the physical size. For writing large files, this normally confuses ffs and vfs clustering into producing contiguous writes of 128K. This is good for efficiency, but it is not what the hardware sees or what you want for statistics. The contiguous writes of 128K get split up into 2 sequential writes of 64K.

However, 64K is more than large enough for efficiency, so the bug in the underlying statistics doesn't matter, at least if vmstat reports only 128K blocks. If it reported 64K-blocks then you would have to worry about the contiguous block sizes being a mixture of 128K and much smaller blocks, with the much smaller blocks (actually, more the seeks across gaps to get to the smaller blocks) being very inefficient.

Bruce
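The split of one 128K request into two 64K transfers is plain ceiling division. A sketch, assuming the 128K and 64K figures from the message; `VIRT_MAX`, `DMA_MAX` and `hw_transfers` are made-up names for illustration and are not taken from GEOM or the ata driver:

```c
#include <stddef.h>

#define VIRT_MAX (128 * 1024)	/* request size seen above GEOM, per the message */
#define DMA_MAX  (64 * 1024)	/* ATA DMA transfer limit, per the message */

/* Number of hardware transfers needed to cover one request of len bytes. */
static size_t
hw_transfers(size_t len, size_t hw_max)
{
	return (len + hw_max - 1) / hw_max;
}
```

A statistics layer that counts requests rather than hardware transfers would then report 128K blocks even though the disk only ever sees 64K.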
Re: very big files on cd9660 file system
On Fri, 19 Aug 2005, Mikhail Teterin wrote:

I have a cd9660 image with several files on it. One of the files is very large (above 4Gb). When I mount the image, the size of this file is shown as realsize % 4Gb -- 758876749 bytes instead of 5053844045. What should I blame:

1) The software, that created the image (modified mkisofs)
2) cd9660 part of the FreeBSD kernel
3) ISO-9660 standard

Mostly (2). Sizes are 64 bits in the standard, but FreeBSD has always silently discarded the highest 32 bits and corrupted the next highest bit to a sign bit, so the file size limit is at most 2GB or 4GB (depending on whether the sign bit gets corrupted back to a value bit).

From cd9660_vfsops.c:

% 	ip->i_size = isonum_733(isodir->size);

This reads the size from the directory entry.

From iso.h:

% 	u_char size [ISODCL (11, 18)]; /* 733 */

This says that the size is in bytes 11-18 (option base 1) in the directory entry. All 733 entries are 8 bytes. The others are for other sizes and the extent (the starting block number for a file).

% static __inline int
% isonum_733(p)
% 	u_char *p;
% {
% 	return *p|(p[1] << 8)|(p[2] << 16)|(p[3] << 24);
% }

This says that the highest 32 bits are discarded for all 733 entries and the sign bit in p[3] is corrupted, first by shifting it and then by assigning the result to an int. i_size has type long, unlike in most file systems in FreeBSD where it is uint64_t or uint32_t, so I think the sign bit stays corrupted but doesn't cause further problems by being converted to 33 top unsigned bits, giving a limit of 2GB.

The file size limit is hit before the others. 31-bit block numbers with 2K-blocks work up to 4TB. There are likely to be overflow bugs at 1TB before the 4TB limit is hit. We still have the even closer limit of 4GB on media sizes.
From cd9660_node.c:

% ino_t
% isodirino(isodir, imp)
% 	struct iso_directory_record *isodir;
% 	struct iso_mnt *imp;
% {
% 	ino_t ino;
%
% 	ino = (isonum_733(isodir->extent) + isonum_711(isodir->ext_attr_length))
% 	      << imp->im_bshift;
% 	return (ino);
% }

This fakes the inode number as the byte offset of the directory entry. ino_t is uint32_t, so this fails if the byte offset exceeds 4GB. The eventual 32nd bit overflows to become a sign bit in the shift but then overflows back to a correct bit in the assignment, so offsets between 2GB and 4GB work accidentally.

Since the limit is on the offsets of directory entries, media larger than 4GB can be used for cd9660 under FreeBSD iff all directory entries are below the limit, which happens automatically for the non-multi-session case only.

See revs.1.77 and 1.99 for other bugs caused by isodirino().

Bruce
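The sign-bit problem in the quoted isonum_733() comes from shifting a byte that has been promoted to int. A minimal sketch of the same four-byte read done in unsigned arithmetic, together with the truncation that yields the size reported in the question; `isonum_733_unsigned` and `low32` are hypothetical names for illustration, not functions in the FreeBSD tree:

```c
#include <stdint.h>

/*
 * Read the same four bytes as the quoted isonum_733(), but widen each
 * byte to uint32_t before shifting so that bit 31 stays a value bit.
 */
static uint32_t
isonum_733_unsigned(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* What is left of a byte count when only its low 32 bits are kept. */
static uint32_t
low32(uint64_t size)
{
	return (uint32_t)(size & UINT32_MAX);
}
```

Keeping only the low 32 bits of 5053844045 leaves 758876749, which is the size shown on the mounted image.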
Re: Anyway to extract a large file from EXT2FS filesystem?
On Tue, 17 Feb 2004, Kris Kennaway wrote:

On Tue, Feb 17, 2004 at 11:16:50AM +0100, Stefan Krantz wrote:

On Tue, 17 Feb 2004, Kris Kennaway wrote:

On Tue, Feb 17, 2004 at 10:49:47AM +0100, Stefan Krantz wrote:

Hi! I would like to extract a large (11GB) tar file on an ext3 filesystem. But it shows only to be about 3gb large:

yabba# ls -la pictures.tar
-rw-r--r-- 1 root wheel 3317055488 Feb 15 19:03 pictures.tar

Is there any possible way to extract the file?

It shouldn't be appearing truncated. Are you certain that this size is incorrect, and the file has a different size when viewed from another OS?

Yes. Yesterday I tested the archive with tar tvf (11gb) in Linux and it tested OK. In FBSD it says unexpected EOF. If I could I would just boot linux and split the file. But I can no longer boot linux =/ (migrated to fbsd 5.2 ;).

I'm CC'ing tjr and bde, who might have some idea about the problem.

ext2fs under FreeBSD is missing support for files larger than Linux's old limit of 4GB. Fixing this should be relatively easy (start by using i_size_high when converting the Linux disk inode to a FreeBSDish in-core inode). I don't have any patches for this.

Bruce
Re: Anyway to extract a large file from EXT2FS filesystem?
On Tue, 17 Feb 2004, Tim Robbins wrote:

On Tue, Feb 17, 2004 at 11:16:50AM +0100, Stefan Krantz wrote:

I would like to extract a large (11GB) tar file on an ext3 filesystem. But it shows only to be about 3gb large:

yabba# ls -la pictures.tar
-rw-r--r-- 1 root wheel 3317055488 Feb 15 19:03 pictures.tar

Is there any possible way to extract the file?

Try this patch and let me know how it goes. You'll have to specify the file name of /sys/gnu/ext2fs/ext2_inode_cnv.c to patch(1) manually, then either buildkernel or rebuild only ext2fs.ko. If the file shows up with the correct size in a directory listing, make sure you can actually read data past 4 GB.

//depot/user/tjr/freebsd-tjr/src/sys/gnu/ext2fs/ext2_inode_cnv.c#1 - /p4/tjr/src/sys/gnu/ext2fs/ext2_inode_cnv.c
@@ -77,6 +77,8 @@
 	 */
 	ip->i_mode = ei->i_links_count ? ei->i_mode : 0;
 	ip->i_size = ei->i_size;
+	if (S_ISREG(ip->i_mode))
+		ip->i_size |= ((u_int64_t)ei->i_size_high) << 32;
 	ip->i_atime = ei->i_atime;
 	ip->i_mtime = ei->i_mtime;
 	ip->i_ctime = ei->i_ctime;
@@ -112,6 +114,8 @@
 	 */
 	ei->i_dtime = ei->i_links_count ? 0 : ip->i_mtime;
 	ei->i_size = ip->i_size;
+	if (S_ISREG(ip->i_mode))
+		ei->i_size_high = ip->i_size >> 32;
 	ei->i_atime = ip->i_atime;
 	ei->i_mtime = ip->i_mtime;
 	ei->i_ctime = ip->i_ctime;

The feature stuff needs to be handled for writing.

The feature stuff is slightly broken for reading. Large file support is a read-only compatibility feature (it is indicated by the EXT2_FEATURE_RO_COMPAT_LARGE_FILE flag in the s_feature_ro_compat field in the superblock), but we didn't support it without the first hunk in the above patch so we should have rejected even r/o mounts of file systems that have this flag set. We only reject r/w mounts of such file systems.
I suppose this isn't a problem in Linux implementations of ext2fs because implementations that don't support large files in ext2fs don't support large files anywhere, so files larger than the old limit of 4GB are handled as correctly as possible at read time so their presence need not prevent mounting.

Bruce
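The size reconstruction done by the first hunk of the patch can be checked in isolation. A sketch, assuming the on-disk inode keeps the low and high halves of the size in two 32-bit fields as the patch implies; `ext2_file_size` is a hypothetical helper, and the high word of 2 in the note below is an assumption, not a value reported in the thread:

```c
#include <stdint.h>

/* Join the low and high 32-bit halves of a file size into one 64-bit value. */
static uint64_t
ext2_file_size(uint32_t size_lo, uint32_t size_high)
{
	/* Widen before shifting; shifting a 32-bit value by 32 would not work. */
	return ((uint64_t)size_high << 32) | size_lo;
}
```

With the low word 3317055488 from the ls output and an assumed high word of 2, the result is 11906990080 bytes, which is consistent with the roughly 11GB archive described in the question.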
Re: Anyway to extract a large file from EXT2FS filesystem?
On Wed, 18 Feb 2004, Tim Robbins wrote:

On Wed, Feb 18, 2004 at 11:37:26AM +1100, Bruce Evans wrote:

The feature stuff needs to be handled for writing.

I discovered that a few minutes after posting the patch :-) I decided to take the lazy way out for now and to return EFBIG if we would need to upgrade the filesystem to EXT2_DYNAMIC_REV or set ..._RO_COMPAT_LARGE_FILE. I think what's most important here is being able to read large files from Linux ext2 filesystems, and I don't like the current ext2 code enough to implement superblock updating etc.

The ext2 code seems to do a little more than necessary. Anyway, we shouldn't copy it, to keep the superblock update parts of FreeBSD's ext2fs free of the copyleft :-).

Bruce
Re: problems with filesystems > 1TB
On Wed, 21 Jan 2004, Eric wrote:

i have been trying (for many moons now) to create a filesystem larger than 1TB. I've had a variety of RAID controllers in my boxes, and I have 250GB drives, so it adds up quick. I've also tried doing this with vinum, but that fails too. i've searched for help on this topic, and i've found lots of info, but nothing substantial. I've read everything from it being a sysinstall issue, to needing new versions of the CLI tools (newfs, dd, disklabel), to newfs using the wrong variable type to store fssize, to having to update to fbsd 5.x to use UFS2.

This requires FreeBSD-5.x and either UFS2 or fixing an overflow bug in UFS1 (and possibly other bugs). Only file system sizes much larger than 1TB require UFS2 (UFS1 starts losing at 4TB but can handle 128TB (poorly)).

FreeBSD 4.x has a limit of 2^31 blocks of size 512 for i/o. This gives a limit of 1TB.

UFS1 has a limit of 2^31 blocks of size fs block size. I forget whether the relevant block size is what is called the block size or the fragment size in newfs. Probably the latter; I will assume this in the following examples. This gives the same limit of 1TB if the fragment size is 512. However, with the default fragment size of 2K the limit is 4TB, and UFS1 can reasonably support a few more doublings of the file system size using a few more doublings of the fragment size. However2, UFS1 has an overflow bug converting fs block numbers to i/o block numbers. Overflow occurs at i/o block number 2^31 so there is the same 1TB limit as in systems that have a limit of 2^31 on the i/o block number.

Other reports say it's a softlimit imposed somewhere, some say to make the frag size in newfs to 1024B for a 2TB max volume, it has to be dedicated, it has to be non-dedicated... the list of suggestions goes on and on.

For UFS1, this only works in FreeBSD-5.x with an overflow bug (and possibly other bugs) fixed.
Fix for an overflow bug:

%%%
Index: fs.h
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/fs.h,v
retrieving revision 1.40
diff -u -2 -r1.40 fs.h
--- fs.h	16 Nov 2003 07:08:27 -0000	1.40
+++ fs.h	16 Nov 2003 11:30:26 -0000
@@ -491,5 +491,5 @@
  * This maps filesystem blocks to device size blocks.
  */
-#define fsbtodb(fs, b)	((b) << (fs)->fs_fsbtodb)
+#define	fsbtodb(fs, b)	((daddr_t)(b) << (fs)->fs_fsbtodb)
 #define	dbtofsb(fs, b)	((b) >> (fs)->fs_fsbtodb)
%%%

Bruce
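The effect of the added (daddr_t) cast can be shown with fixed-width types. A sketch, assuming a 32-bit signed fs block number, a 64-bit daddr_t, and a shift of 2 (2K fragments over 512-byte device blocks); `fsbtodb64` and `fits_in_32` are hypothetical names and do not appear in fs.h:

```c
#include <stdint.h>

/* Widen before shifting, as the patched macro does with its cast. */
static int64_t
fsbtodb64(int32_t fsb, int shift)
{
	return (int64_t)fsb << shift;
}

/* Nonzero if the device block number would still fit in 31 value bits. */
static int
fits_in_32(int32_t fsb, int shift)
{
	return fsbtodb64(fsb, shift) <= INT32_MAX;
}
```

Under these assumptions, fragment number 2^29 is the first one whose device block number reaches 2^31, and 2^29 fragments of 2K are exactly the 1TB limit described in the message.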
Re: Followup to fallback to PIO mode on dual processor AMD systems
On Thu, 2 Jan 2003, Bruce Campbell wrote:

At present, I don't suspect bad media because the error message is WRITE command timeout tag=0 serv=0 which doesn't suggest a specific sector/track etc, and running with UDMA33 instead of UDMA100 makes the problem appear to vanish.

The fallback is clearly wrong because it turns isolated media errors into pessimized i/o for the whole disk at best, system hangs during resets next best, and system crashes at worst. I keep a disk with bad media on line for testing some of this, and zap the fallback using the following patch (hope this is complete; it was edited from a larger patch).

%%%
Index: ata-disk.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ata/ata-disk.c,v
retrieving revision 1.139
diff -u -2 -r1.139 ata-disk.c
--- ata-disk.c	17 Dec 2002 16:26:22 -0000	1.139
+++ ata-disk.c	18 Dec 2002 01:03:37 -0000
@@ -597,5 +606,5 @@
 	else {
 	    ata_dmainit(adp->device, ata_pmode(adp->device->param), -1, -1);
-	    printf(" falling back to PIO mode\n");
+	    printf(" NOT falling back to PIO mode\n");
 	}
 	TAILQ_INSERT_HEAD(&adp->device->channel->ata_queue, request, chain);
@@ -603,4 +612,5 @@
     }
 
+#if 0
     /* if using DMA, try once again in PIO mode */
     if (request->flags & ADR_F_DMA_USED) {
@@ -613,4 +623,5 @@
 	return ATA_OP_FINISHED;
     }
+#endif
 
     request->flags |= ADR_F_ERROR;
%%%

Bruce