There is a web site with presentations from Hitachi, Intel, LSI, Microsoft, Seagate, and WD discussing the topic: http://www.bigsector.org.
--
Rob Elliott, [EMAIL PROTECTED]
Hewlett-Packard Industry Standard Server Storage Advanced Technology
https://ecardfile.com/id/RobElliott

> -----Original Message-----
> From: K.G. [mailto:[EMAIL PROTECTED]
> Sent: Friday, November 04, 2005 8:22 AM
> To: Elliott, Robert (Server Storage); [EMAIL PROTECTED];
> Sven Luther
> Cc: [EMAIL PROTECTED]; [email protected]
> Subject: Re: parted - larger logical and physical block sizes
> on GPT disks
>
> Hi,
>
> Only my opinion:
>
> On Wed, 2 Nov 2005 15:18:53 -0600 "Elliott, Robert (Server
> Storage)" <[EMAIL PROTECTED]> wrote:
> > I noticed a few things in parted's source code that might warrant
> > fixing, particularly for the GUID Partition Table (GPT) partition
> > format used by Extensible Firmware Interface (EFI) systems. The
> > Unified EFI Specification will discuss these issues.
> >
> > 1. Logical block sizes are not necessarily 512 bytes; they could be
> > 1024, 2048, or 4096 bytes (at least). Both the ATA and the SCSI
> > block command sets support this. ATA devices typically do not
> > implement it; SCSI logical units sometimes do. The code has "512"
> > sprinkled throughout, which will probably cause problems.
>
> Well, the sector size is a known problem in Parted. Until recently we
> didn't receive many reports about it, because probably about
> 99.9999999% of people are using 512-byte sectors. But now multiples
> of 512 bytes are beginning to be seen sometimes (in raid systems?
> very big disk?) and Parted is mostly unusable when that happens.
> I believe this should be fixed in the whole program, but unfortunately
> this probably would involve a lot of work.
> Also some file system or disk labels are only described for 512-byte
> sectors, so this might be a problem. In the disk_atari.c I've
> recently written, I explicitly discard atari disklabel probing if the
> sector size isn't 512; I guess we should probably add those kinds of
> tests for problematic FS/disklabels.
>
> > 2. Even if the logical block size is 512 bytes, the underlying
> > physical block size may be a multiple of that. The drive performs
> > read-modify-write when a full physical block is not accessed,
> > incurring a performance hit but maintaining compatibility with
> > software that uses 512 byte logical blocks.
> >
> > Serial ATA disks are expected to start doing this soon; their
> > physical block may contain 1, 2, 4, or 8 logical blocks (the ATA
> > IDENTIFY DEVICE command indicates how many). SCSI doesn't have a
> > way to report this type of behavior yet (it has always assumed that
> > software would support a larger logical block size) but it might be
> > added to match ATA.
>
> Interesting. I guess this divides data structures into 2 sets: old
> ones which aren't aware of the logical vs physical disk block size
> issue will only consider logical sizes - and new ones like GPT which
> handle it fine with size fields and backward compatibility with
> systems that don't probe the physical sector size, right?
> (indeed there's a third set: the ones that just assume 512-byte
> sectors)
>
> > In this situation, it is important to align important structures
> > like partition boundaries on the physical block boundaries; if they
> > are unaligned, then accesses that are aligned to the start of the
> > partition will actually result in excessive read-modify-writes by
> > the disk.
> >
> > For the GPT partition format, the first partition naturally starts
> > on LBA 34, which is fine for 512 and 1024 byte physical block sizes
> > but not good for 2048 or 4096 byte physical block sizes. Partition
> > tools like parted should, unless specifically requested otherwise
> > by a knowledgeable user, start aligning their GPT partitions on
> > larger boundaries (e.g. 128KiB would suffice for many years).
>
> I believe this could be easily done with a constraint in
> *_partition_align functions. I think that when we get the logical and
> physical disk block size and handle the logical size cleanly, we
> should put that kind of alignment for most disklabels (even if they
> know nothing about logical/physical sector sizes) but this might be a
> problem if "cylinder" alignment is needed.
>
> > Excerpts from disk_gpt.c that might have problems:
> >
> > typedef struct _GuidPartitionTableHeader_t {
> >         uint64_t Signature;
> >         uint32_t Revision;
> >         uint32_t HeaderSize;
> >         uint32_t HeaderCRC32;
> >         uint32_t Reserved1;
> >         uint64_t MyLBA;
> >         uint64_t AlternateLBA;
> >         uint64_t FirstUsableLBA;
> >         uint64_t LastUsableLBA;
> >         efi_guid_t DiskGUID;
> >         uint64_t PartitionEntryLBA;
> >         uint32_t NumberOfPartitionEntries;
> >         uint32_t SizeOfPartitionEntry;
> >         uint32_t PartitionEntryArrayCRC32;
> >         uint8_t Reserved2[512 - 92];
> > } __attribute__ ((packed)) GuidPartitionTableHeader_t;
> >
> > Comment: The header (and its Reserved2 field) actually fills up the
> > entire logical block, not just 512.
> >
> >         data_start = 2 + GPT_DEFAULT_PARTITION_ENTRY_ARRAY_SIZE / 512;
> >         data_end = dev->length - 2
> >                    - GPT_DEFAULT_PARTITION_ENTRY_ARRAY_SIZE / 512;
> >
> > Comment: The logical block size is not always 512 bytes.
> > Comment: This probably leads to the first partition starting at
> > LBA 34, which is not aligned for 2048 or 4096 byte sectors.
> >
> >         if (!ped_device_read (dev, gpt, sector,
> >                               sizeof (GuidPartitionTableHeader_t) /
> >                               512))
> >
> > Comment: The logical block size is not always 512 bytes.
> >
> >         if ((PedSector) PED_LE64_TO_CPU (gpt.AlternateLBA)
> >                         < disk->dev->length - 1) {
> >                 char zeros[512];
> >
> > #ifndef DISCOVER_ONLY
> >                 if (ped_exception_throw (
> >                         PED_EXCEPTION_ERROR,
> >                         PED_EXCEPTION_FIX | PED_EXCEPTION_CANCEL,
> >         _("The backup GPT table is not at the end of the disk, as it "
> >           "should be. This might mean that another operating system "
> >           "believes the disk is smaller. Fix, by moving the backup "
> >           "to the end (and removing the old backup)?"))
> >                                 == PED_EXCEPTION_CANCEL)
> >                         goto error;
> >
> >                 write_back = 1;
> >                 memset (zeros, 0, 512);
> >                 ped_device_write (disk->dev, zeros,
> >                                   PED_LE64_TO_CPU (gpt.AlternateLBA),
> >                                   1);
> > #endif /* !DISCOVER_ONLY */
> >
> > Comment: The logical block size is not always 512 bytes.
> >
> > ...
> > etc. (search on "512" to find likely problems)
>
> As I said before, disk_gpt.c is only a small part of the problem... :/
>
> Cheers,
> Guillaume Knispel
>
_______________________________________________
Bug-parted mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-parted
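
The arithmetic behind the "LBA 34" and "128KiB" remarks in the quoted
message can be sketched in C as below. This is only an illustration under
stated assumptions, not parted code: the function names and both constants
are hypothetical, the 16384-byte entry array size is inferred from the
quoted "2 + ... / 512" expression yielding LBA 34, and the block size is
passed in as a parameter instead of being hard-coded to 512.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not parted's definitions. */
#define GPT_ENTRY_ARRAY_BYTES 16384u          /* assumed: 34 = 2 + 16384/512 */
#define ALIGN_BYTES           (128u * 1024u)  /* 128 KiB, as suggested above */

/* Number of logical blocks needed to hold `bytes`, rounding up. */
static uint64_t blocks_for_bytes(uint64_t bytes, uint32_t block_size)
{
    assert(block_size > 0);
    return (bytes + block_size - 1) / block_size;
}

/* First LBA after LBA 0, the GPT header at LBA 1, and the entry array. */
static uint64_t gpt_first_usable_lba(uint32_t block_size)
{
    return 2 + blocks_for_bytes(GPT_ENTRY_ARRAY_BYTES, block_size);
}

/* Round the first usable LBA up to the next ALIGN_BYTES boundary. */
static uint64_t gpt_aligned_start_lba(uint32_t block_size)
{
    assert(block_size > 0 && ALIGN_BYTES % block_size == 0);
    uint64_t blocks_per_align = ALIGN_BYTES / block_size;
    uint64_t first = gpt_first_usable_lba(block_size);
    return (first + blocks_per_align - 1) / blocks_per_align
           * blocks_per_align;
}
```

With 512-byte logical blocks the unaligned start comes out at LBA 34 and
the aligned start at LBA 256; with 4096-byte logical blocks they are LBA 6
and LBA 32. In both cases the aligned start sits 128 KiB into the disk, so
it falls on a physical block boundary for any of the physical block sizes
mentioned in the thread.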
