Re: Anonymous vnodes?

2023-06-26 Thread Taylor R Campbell
> Date: Mon, 26 Jun 2023 18:13:17 -0400
> From: Theodore Preduta 
> 
> Is it possible to create a vnode for a regular file in a file system
> without linking the vnode to any directory, so that it disappears when
> all open file descriptors to it are closed?  (As far as I can tell, this
> isn't possible with any of the vn_* or VOP_* functions?)
> 
> If this idea is indeed not possible, should/could I implement something
> like this?  (If so, how?)
> 
> For context, I'm currently working on implementing memfd_create(2), and
> thought this might be a shortcut.  Otherwise, I'll have to implement it
> in terms of uvm operations (which is fine, just more work).

For a syscall, you should implement it in terms of uvm anonymous
objects:

- memfd_create: fd_allocfile, uao_create
- .fo_close: uao_detach
- .fo_read/write: ubc_uiomove
  (tricky bit: if the offset pointer is &fp->f_offset, you must
  acquire and release fp->f_lock around it)
- .fo_mmap: just take a reference and return the object

Should be easy, and similar to what kern_ksyms.c already does.  No
need for vnodes to be involved at all.


Re: Anonymous vnodes?

2023-06-26 Thread RVP

On Mon, 26 Jun 2023, Theodore Preduta wrote:

> Is it possible to create a vnode for a regular file in a file system
> without linking the vnode to any directory, so that it disappears when
> all open file descriptors to it are closed?  (As far as I can tell, this
> isn't possible with any of the vn_* or VOP_* functions?)
>
> If this idea is indeed not possible, should/could I implement something
> like this?  (If so, how?)

You could extend what shm_open() currently does on NetBSD: create a
unique temp. file in /var/shm; immediately unlink it, return the fd.

-RVP
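
[Editor's sketch] The create-then-unlink pattern described above can be illustrated in userland C as follows. This is only a minimal sketch of the general technique, not NetBSD's shm_open() implementation; the helper name open_unlinked() and the template path used in the usage note are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a uniquely named file from a mkstemp(3)
 * template, remove its directory entry at once, and return the open
 * descriptor.  The file's storage is released when the last descriptor
 * referring to it is closed.  Returns -1 on failure.
 */
static int
open_unlinked(char *tmpl)
{
	int fd;

	assert(tmpl != NULL);

	fd = mkstemp(tmpl);	/* fills in the XXXXXX part of tmpl */
	if (fd == -1)
		return -1;

	if (unlink(tmpl) == -1) {
		/* Could not remove the name; do not leak the descriptor. */
		close(fd);
		return -1;
	}

	return fd;
}
```

A caller would pass a writable template such as "/tmp/anonfd.XXXXXX" (an arbitrary path chosen here) and then use the returned descriptor with read(2), write(2) or mmap(2) as usual. Unlike the uvm-based approach suggested elsewhere in the thread, the data still lives in a file system.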



Re: Anonymous vnodes?

2023-06-26 Thread Jason Thorpe


> On Jun 26, 2023, at 3:13 PM, Theodore Preduta  wrote:
> 
> Is it possible to create a vnode for a regular file in a file system
> without linking the vnode to any directory, so that it disappears when
> all open file descriptors to it are closed?  (As far as I can tell, this
> isn't possible with any of the vn_* or VOP_* functions?)

There isn't a general way to do this.

> For context, I'm currently working on implementing memfd_create(2), and
> thought this might be a shortcut.  Otherwise, I'll have to implement it
> in terms of uvm operations (which is fine, just more work).

Seems like these objects should be implemented above the file system ... just 
create a new descriptor type and interface directly with UVM.

-- thorpej



Anonymous vnodes?

2023-06-26 Thread Theodore Preduta
Is it possible to create a vnode for a regular file in a file system
without linking the vnode to any directory, so that it disappears when
all open file descriptors to it are closed?  (As far as I can tell, this
isn't possible with any of the vn_* or VOP_* functions?)

If this idea is indeed not possible, should/could I implement something
like this?  (If so, how?)

For context, I'm currently working on implementing memfd_create(2), and
thought this might be a shortcut.  Otherwise, I'll have to implement it
in terms of uvm operations (which is fine, just more work).

Thanks,

Theo(dore)


Boot on GPT in RAID [PATCH]

2023-06-26 Thread Emmanuel Dreyfus
Hello 

Some time ago, I added code to the x86 secondary bootstrap to allow
booting from GPT partitions inside a RAID 1. It seems I forgot to
commit the support in the primary bootstrap, and in the meantime I
deleted the tree with it. I just rewrote it; the patch is below
for comment.

The boot priority is as described in x86/boot(8):
1) the first partition with the bootme attribute set
2) the partition we booted from, irrelevant here
3) the first bootable partition (FFS, LFS, CCD, CGD)
4) the first partition
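
[Editor's sketch] The priority order above could be expressed roughly as below. This is a simplified illustration, not the code from the patch: struct part and pick_boot_lba() are hypothetical names, the flags stand in for the real GPT attribute and type-UUID checks, and an LBA of 0 is assumed to mean "nothing found". Rule 2 is left out since it does not apply inside the RAID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one GPT entry (illustration only). */
struct part {
	uint64_t start_lba;	/* first sector of the partition */
	int used;		/* entry type is not "unused" */
	int bootme;		/* bootme attribute is set */
	int bootable;		/* type is FFS, LFS, CCD or CGD */
};

/*
 * Return the start LBA of the partition to boot from, or 0 if the
 * table holds no used entry.  Each candidate remembers only the
 * first match, so table order breaks ties.
 */
static uint64_t
pick_boot_lba(const struct part *p, size_t n)
{
	uint64_t bootme_lba = 0, bootable_lba = 0, first_lba = 0;
	size_t i;

	assert(p != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!p[i].used)
			continue;
		if (first_lba == 0)
			first_lba = p[i].start_lba;
		if (bootable_lba == 0 && p[i].bootable)
			bootable_lba = p[i].start_lba;
		if (bootme_lba == 0 && p[i].bootme)
			bootme_lba = p[i].start_lba;
	}

	if (bootme_lba != 0)		/* rule 1 */
		return bootme_lba;
	if (bootable_lba != 0)		/* rule 3 */
		return bootable_lba;
	return first_lba;		/* rule 4, or 0 if none */
}
```

Scanning the whole table once and deciding at the end keeps the loop body small, which matters when bootstrap space is scarce.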

I handled the case where GPT entries are not 128 bytes long,
as the UEFI specification allows that. That makes the code larger
and more painful to read. Is it worth it?

Index: sys/arch/i386/stand/bootxx/boot1.c
===
RCS file: /cvsroot/src/sys/arch/i386/stand/bootxx/boot1.c,v
retrieving revision 1.21
diff -U4 -r1.21 boot1.c
--- sys/arch/i386/stand/bootxx/boot1.c  24 Jun 2021 01:23:16 -  1.21
+++ sys/arch/i386/stand/bootxx/boot1.c  26 Jun 2023 14:31:22 -
@@ -36,10 +36,12 @@
 #include 
 #include 
 
 #include 
+#include 
 #include 
 #include 
+#include 
 #include /* For RF_PROTECTED_SECTORS */
 
 #define XSTR(x) #x
 #define STR(x) XSTR(x)
@@ -48,8 +50,9 @@
 
 static struct biosdisk_ll d;
 
 const char *boot1(uint32_t, uint64_t *);
+static daddr_t gpt_lookup(daddr_t);
 extern void putstr(const char *);
 
 extern struct disklabel ptn_disklabel;
 
@@ -89,8 +92,17 @@
bios_sector += RF_PROTECTED_SECTORS;
fd = ob();
if (fd != -1)
goto done;
+
+   /*
+* Test for a GPT inside the RAID
+*/
+   bios_sector += gpt_lookup(bios_sector);
+   fd = ob();
+   if (fd != -1)
+   goto done;
+
/*
 * Nothing at the start of the MBR partition, fallback on
 * partition 'a' from the disklabel in this MBR partition.
 */
@@ -143,4 +155,145 @@
return EIO;
 
return 0;
 }
+
+static int
+is_unused(struct gpt_ent *ent)
+{
+   const struct uuid unused = GPT_ENT_TYPE_UNUSED;
+
+   return (memcmp(ent->ent_type, &unused, sizeof(unused)) == 0);
+}
+
+static int
+is_bootable(struct gpt_ent *ent)
+{
+   /* GPT_ENT_TYPE_NETBSD_RAID omitted as we are already in a RAID */
+   const struct uuid bootable[] = {
+   GPT_ENT_TYPE_NETBSD_FFS,
+   GPT_ENT_TYPE_NETBSD_LFS,
+   GPT_ENT_TYPE_NETBSD_CCD,
+   GPT_ENT_TYPE_NETBSD_CGD,
+   };
+   int i;
+
+   for (i = 0; i < sizeof(bootable) / sizeof(*bootable); i++) {
+   if (memcmp(ent->ent_type, &bootable[i],
+   sizeof(struct uuid)) == 0)
+   return 1;
+   }
+
+   return 0;
+}
+
+
+static daddr_t
+gpt_lookup(daddr_t sector)
+{
+   char buf[DEV_BSIZE];
+   struct mbr_sector *pmbr;
+   const char gpt_hdr_sig[] = GPT_HDR_SIG;
+   struct gpt_hdr *hdr;
+   struct gpt_ent *ent;
+   uint32_t nents;
+   uint32_t entsz;
+   uint32_t entries_per_sector;
+   uint32_t sectors_per_entry;
+   uint64_t firstpart_lba = 0;
+   uint64_t bootable_lba = 0;
+   uint64_t bootme_lba = 0;
+   int i, j;
+
+   /*
+* Look for a PMBR
+*/
+   if (readsects(&d, sector, 1, buf, 1) != 0)
+   return 0;
+
+   pmbr = (struct mbr_sector *)buf;
+
+   if (pmbr->mbr_magic != htole16(MBR_MAGIC))
+   return 0;
+
+   if (pmbr->mbr_parts[0].mbrp_type != MBR_PTYPE_PMBR)
+   return 0;
+
+   sector++; /* skip PMBR */
+
+   /*
+* Look for a GPT header
+* Space is scarce, we do not check CRC.
+*/
+   if (readsects(&d, sector, 1, buf, 1) != 0)
+   return 0;
+
+   hdr = (struct gpt_hdr *)buf;
+
+   if (memcmp(gpt_hdr_sig, hdr->hdr_sig, sizeof(hdr->hdr_sig)) != 0)
+   return 0;
+
+   if (hdr->hdr_revision != htole32(GPT_HDR_REVISION))
+   return 0;
+
+   if (le32toh(hdr->hdr_size) > DEV_BSIZE)
+   return 0;
+
+   nents = le32toh(hdr->hdr_entries);
+   entsz = le32toh(hdr->hdr_entsz);
+
+   sector++; /* skip GPT header */
+
+   /*
+* Read partition table
+*
+* According to UEFI specification, entries are 128 * (2^n)
+* bytes long. The most common scenario is 128 bytes (n = 0)
+* where there are 4 entries per sector. If n > 2, then entries
+* span multiple sectors, but they remain sector-aligned.
+*/
+   entries_per_sector = DEV_BSIZE / entsz;
+   if (entries_per_sector == 0)
+   entries_per_sector = 1;
+
+   sectors_per_entry = entsz / DEV_BSIZE;
+   if (sectors_per_entry == 0)
+   sectors_per_entry = 1;
+
+   for (i = 0; i < nents; i += entries_per_sector) {
+   if (readsects(&d, sector, 1, buf, 1) != 0)
+   return 0;
+
+   sector += sectors_per_entry;
+