Re: Request for testing: Fedora 37 pre-Beta validation tests
On Mon, Aug 29, 2022, at 8:36 PM, Josh Berkus wrote:
> On 8/29/22 17:22, Adam Williamson wrote:
>> It would be really great if we can get the validation tests run now so
>> we can find any remaining blocker bugs in good time to get them fixed.
>> Right now the blocker list looks short, but there are definitely some
>> tests that have not been run.
>
> Last I checked, flatpak was still broken. Will retest this week.

What's broken with flatpak? I've been using several flatpaks OK since
'dnf system-upgrade' a week ago.

--
Chris Murphy
___
cloud mailing list -- cloud@lists.fedoraproject.org
To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
Re: do we need Plymouth?
On Tue, Aug 9, 2022, at 11:29 AM, Neal Gompa wrote:
> Plymouth is used to provide the interface for decrypting disks and
> presenting information about software/firmware updates, so I'd be
> loath to remove it.

On desktops yes, but I think we can modify systemd-ask-password-plymouth.service so that it omits --plymouth from systemd-tty-ask-password-agent?

--
Chris Murphy
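That modification could be sketched as a drop-in, so the vendor unit stays untouched. This is untested, and the exact agent flags in the shipped unit are an assumption; check the real systemd-ask-password-plymouth.service before copying:

```ini
# /etc/systemd/system/systemd-ask-password-plymouth.service.d/no-plymouth.conf
# Sketch: clear the vendor ExecStart, then re-run the password agent
# without --plymouth (the flag set here is an assumption; verify against
# the installed unit).
[Service]
ExecStart=
ExecStart=/usr/bin/systemd-tty-ask-password-agent --watch --console
```

Followed by a `systemctl daemon-reload` so the drop-in takes effect.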
do we need Plymouth?
cc: cloud@, server@ fpo

Hi,

When troubleshooting early boot issues with a console, e.g. virsh console, or the virt-manager console, or even a server's remote management console providing a kind of virtual serial console... the boot scroll is completely wiped. This is a new behavior in the last, I'm not sure, 6-12 months? Everything before about 3 seconds is cleared as if the console reset command was used, as in it wipes my local scrollback.

I captured this with the script command, and when I cat this 76K file, it even wipes the local console again. So there is some kind of control character that's ordering my local console to do this. The file itself contains the full kernel messages. I just can't cat it. I have to open it in a text editor that ignores this embedded console reset command.

With the help of @glb, we discovered that this is almost certainly Plymouth. When I boot with the parameter plymouth.enable=0 the problem doesn't happen.

And hence the higher level question: do we really even need Plymouth in Server or Cloud editions?

I suppose ideally we'd track down the problem and fix plymouth, so that existing installations get fixed. Whereas if we remove plymouth, we have to ponder whether and how to remove plymouth from existing installations. Unless we flat out aren't using it at all. Any ideas?

Plymouth is in the @core group in fedora-comps, so pretty much everything gets it.
https://pagure.io/fedora-comps/blob/main/f/comps-f37.xml.in#_635

--
Chris Murphy
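If the culprit is the terminal reset sequence (RIS, i.e. ESC c — an assumption, since the exact control character wasn't identified above), the script capture can be checked and sanitized from the shell. A sketch using a stand-in file rather than a real capture:

```shell
# Create a stand-in for a 'script' capture containing the RIS (ESC c)
# terminal-reset sequence that wipes scrollback when the file is cat'ed.
printf 'kernel messages before reset\033cmessages after\n' > typescript

esc=$(printf '\033')

# Count lines containing ESC c to confirm the sequence is present.
grep -c "${esc}c" typescript    # prints 1 for the stand-in above

# Strip the sequence so the capture can be cat'ed without wiping the console.
sed "s/${esc}c//g" typescript > typescript.clean
cat typescript.clean
```

The same grep/sed pair should work on the real 76K capture; 'typescript' is just script(1)'s default output filename.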
Re: Proposing a new PRD
On Tue, May 24, 2022 at 11:28 AM Duncan wrote:
>
> Hi everyone.
>
> I have updated the PRD and I would like to request additions, comments,
> and improvements to this first draft.
>
> 3 Participants
> ==
>
> Currently the following people are involved in the Cloud Working group.
>
> - [David Duncan]
> - [Dusty Mabe]
> - [Major Hayden]
> - [Neal Gompa]
> - [Davide Cavalca]
> - [Michel Salim]
> - [Amy Marrich]
> - [Joe Doss]

Chris Murphy : chrismurphy

--
Chris Murphy
Re: [Fedocal] Reminder meeting : Fedora Cloud Workgroup
On Tue, Nov 23, 2021 at 10:00 AM wrote:
>
> Dear all,
>
> You are kindly invited to the meeting:
>    Fedora Cloud Workgroup on 2021-11-25 from 15:00:00 to 16:00:00 UTC
>    At fedora-meetin...@irc.libera.chat

Since this is Thanksgiving day in the U.S., it's best to assume this meeting isn't going to happen. The next meeting is scheduled for Dec 9.

--
Chris Murphy
Re: Fedora 35 Cloud Base Images for Amazon Public Cloud aarch64 AMIs
On Fri, Nov 12, 2021 at 2:58 PM Dick Marinus wrote:
>
> Hi,
>
> The list for Fedora 35 Cloud Base Images for Amazon Public Cloud aarch64
> AMIs is empty at:
>
> https://alt.fedoraproject.org/cloud/
>
> Is there a problem building the aarch64 images for AWS, can I be of any
> help?

Thanks for the report. I've filed this issue:
https://pagure.io/fedora-web/websites/issue/220

Looks like the images are available in AWS, but just aren't listed at alt.fedoraproject.org/cloud

--
Chris Murphy
Re: Fedora Cloud Meeting Minutes 2021-11-11
On Thu, Nov 11, 2021 at 1:21 PM David Duncan wrote:
>
> Minutes:
> https://meetbot-raw.fedoraproject.org/teams/fedora_cloud_meeting/fedora_cloud_meeting.2021-11-11-15.00.html
> Minutes (text):
> https://meetbot-raw.fedoraproject.org/teams/fedora_cloud_meeting/fedora_cloud_meeting.2021-11-11-15.00.txt
> Log:
> https://meetbot-raw.fedoraproject.org/teams/fedora_cloud_meeting/fedora_cloud_meeting.2021-11-11-15.00.log.html

Sorry I missed the meeting, but yeah, +1 to (re)making Cloud an official edition.

--
Chris Murphy
Re: Fix fallocate issue on cloud-init 19.4 for Fedora 33 cloud images
tl;dr Creating a swapfile with fallocate should work on XFS now.

Found this thread from about 3 years ago:
https://www.spinics.net/lists/linux-mm/msg147100.html

There were a few things that needed work to fix it, and I'm not sure which one finally did it, but this 2019 patch was part of that series:
https://lore.kernel.org/linux-xfs/20191008071527.29304-16-...@lst.de/

That landed in 5.7, and Fedora 33 shipped with a kernel newer than that:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/xfs/xfs_aops.c?h=v5.7

Anyway... even though I can't nail down exactly when it got fixed, today I asked the XFS maintainer about all of this and he said fallocate'd swapfiles should work. And I also tested it with kernels 5.11.16 and 5.12, and it does work.

--
Chris Murphy
Re: Fix fallocate issue on cloud-init 19.4 for Fedora 33 cloud images
On Mon, Apr 26, 2021 at 8:18 AM Federico Ressi wrote:
>
> Hello all,
>
> I am writing to this list because I found out F33 cloud image cloud-init
> support for creating swap files looks to be broken, probably because of a
> known Linux 5.7+ kernel issue [1].
>
> The problem is cloud-init is trying to create a new swap file by using the
> fallocate command, which is not working well (the kernel is complaining the
> file has holes when executing the swapon command just later). The easy
> workaround for this issue is to use the dd command instead of fallocate in
> cloud-init.

fallocate's default mode is zero, and doesn't create holes. If there are holes, it's a kernel bug, and it needs to be fixed and the kernel updated. It's also worth making sure cloud-init is using fallocate's default mode of zero.

The simplest workaround is to just create a swap partition instead of a swapfile when using cloud images that have the buggy kernel. Or alternatively don't create either one, and instead write a config to /etc/systemd/zram-generator.conf so the installation uses swap on a compressed zram device.

> Because I don't know the whole procedure required for submitting a patch I am
> writing to you in the hope you can help me in having the F33 cloud image
> fixed.

I don't think cloud images get reissued after release. But you can reproduce that process and make your own cloud image that's identical to the Fedora one, except for the kernel. That way you can use a newer kernel that doesn't have the bug. Or like Dusty suggests, give the Fedora 34 Cloud images a whirl. Out tomorrow!
--
Chris Murphy
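fallocate's default-mode behavior is easy to demonstrate with ordinary files (a sketch only; an actual swapfile would additionally need mkswap/swapon as root):

```shell
# Compare fallocate (default mode: blocks are really allocated) with
# truncate (sparse file full of holes). stat: %s = size, %b = allocated blocks.
cd "$(mktemp -d)"
fallocate -l 1M allocated.img
truncate  -s 1M sparse.img
stat -c '%s %b' allocated.img   # 1048576 and a nonzero block count
stat -c '%s %b' sparse.img      # 1048576 and 0 blocks: all holes
```

A swapfile made the first way has real blocks backing it, which is why swapon should stop complaining about holes once the kernel bug is out of the way.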
Re: preview of swap on ZRAM feature/change
On Mon, Jun 1, 2020 at 2:55 PM Michael Hall wrote:
>
> I joined this list because I'm interested in learning more about the specific
> requirements and features of cloud images.
>
> So I'm wondering what ZRAM adds or what problem ZRAM solves for a cloud image?

Simplest answer: better utilization of a limited resource, memory. It does this by making it possible for the kernel to evict anonymous pages, thereby avoiding repetitive reclaiming of file pages when under any sort of memory pressure. And it's faster than swap to disk. And since it doesn't require a preallocation, if the workload were to never need swap, it has a nearly zero cost.

If you aren't using some swap, aren't you in a sense overprovisioning memory?

--
Chris Murphy
Re: preview of swap on ZRAM feature/change
On Mon, Jun 1, 2020 at 12:44 PM Simo Sorce wrote:
>
> On Mon, 2020-06-01 at 10:37 -0600, Chris Murphy wrote:
> > Thanks for the early feedback!
> >
> > On Mon, Jun 1, 2020 at 7:58 AM Stephen Gallagher wrote:
> > > * Reading through the Change, you write:
> > > "using a ZRAM to RAM ratio of 1:2, and capped† to 4GiB" and then you
> > > talk about examples which are using 50% of RAM as ZRAM. Which is it? A
> > > ratio of 1:2 implies using 33% of RAM as ZRAM.
> >
> > This ratio is just a fraction, part of whole, where RAM is the whole.
> > This convention is used in the zram (package).
> >
> > Note that /dev/zram0 is a virtual block device, similar to the
> > 'lvcreate -V' option for thin volumes, size is a fantasy. And the ZRAM
> > device size is not a preallocation of memory. If the compression ratio
> > 2:1 (i.e. 200%) holds, then a ZRAM device sized to 50% of RAM will not
> > use more than 25% of RAM.
>
> What happens if you can't compress memory at all?
> Will zram use more memory? Or will it simply become useless (but
> hopefully harmless) churn?

It is not a no-op. There is CPU and memory consumption in this case. It actually reduces available memory for the workload. I haven't yet seen this in practice and haven't come up with a synthetic test - maybe something that just creates a bunch of anonymous pages using /dev/urandom?

Since the device by default is small, it ends up performance-wise being a no-op. You either arrive at the same oom you would have in the no-swap-at-all case, or it'll just start spilling over into a swap-on-disk if you have one still.

I have done quite a lot of testing of the webkitgtk compile case, where it uses ncpus + 2 for the number of jobs by default, and where it eventually needs up to 1.5 GiB per job. Super memory hungry.

8 GiB RAM + no swap = this sometimes triggers the kernel oomkiller quickly, but sometimes just sits there for a really long time before it triggers; it does always eventually trigger. With earlyoom enabled (default on Workstation) the oom happens faster, usually within 5 minutes.

8 GiB RAM + 8 GiB swap-on-disk = this sometimes, but far less often, results in a kernel oomkiller trigger; most often it sits in pageout/pagein for 30+ minutes with a totally frozen GUI. With earlyoom enabled, it is consistently killed inside of 10 minutes.

8 GiB RAM + 8 GiB swap-on-ZRAM = exact reverse: sometimes, but less often, results in 30+ minute hangs with a frozen GUI, and usually results in the kernel oom killer within 5 minutes. With earlyoom enabled it is consistently killed inside of 5 minutes.

8 GiB RAM + 16 GiB swap-on-disk = consistently finishes the compile.

8 GiB RAM + 16 GiB swap-on-ZRAM = my log doesn't have this test. I thought I had done it. But I think it's a risky default configuration, because if you don't get 2:1 compression and the task really needs this much RAM, it's not just IO churn like with a disk-based swap. It's memory and CPU, and if it gets wedged in, it's a forced power off. That's basically where we were at with Workstation edition before earlyoom, which is not good, but not a huge problem like it is with servers, where you have to send someone to go hit a power button. In these cases, sshd often will not respond before timeout. So no sysrq+b unless you have that command pretyped out and ready to hit enter.

The scenario where you just don't have the budget for the correct amount of memory for the workload, and you have to use swap contrary to the "in defense of swap" article referenced in the change? I think it's maybe a better use case for zswap. I don't have tests that conclusively prove that zswap's LRU basis for eviction from the zswap memory pool to the disk swap is better than how the kernel deals with two swaps (the zram and disk case). But in theory the LRU basis is smarter. Making it easier for folks to experiment with this is, I think, maybe undersold in the proposal. But the main idea is to convey that the proposed defaults are safe.

Later in the proposal I suggest they might be too safe, with the 4GiB cap. That might be refused in favor of 50% of RAM across the board. But that could be a future enhancement if this proposal is accepted.

> > I'll try to clear this up somehow; probably avoid using the term ratio
> > and just go with fraction/percentage. And also note the use of
> > 'zramctl' to see the actual compression ratio.
> >
> > > * This Change implies the de facto death of hibernation in Fedora.
> > > Good riddance, IMHO. It never worked safely.
> >
> > UEFI Secure Boot put us on this path. There's still no acceptable
> > authenticated encrypted hibernation image scheme, and the SUSE
> > developer working on it told me a few months ago that the status is
> > the same as last year and there's no ETA for when he gets the time to
> > revisit it.
Re: preview of swap on ZRAM feature/change
Thanks for the early feedback!

On Mon, Jun 1, 2020 at 7:58 AM Stephen Gallagher wrote:
>
> * Reading through the Change, you write:
> "using a ZRAM to RAM ratio of 1:2, and capped† to 4GiB" and then you
> talk about examples which are using 50% of RAM as ZRAM. Which is it? A
> ratio of 1:2 implies using 33% of RAM as ZRAM.

This ratio is just a fraction, part of whole, where RAM is the whole. This convention is used in the zram package.

Note that /dev/zram0 is a virtual block device, similar to the 'lvcreate -V' option for thin volumes: the size is a fantasy. And the ZRAM device size is not a preallocation of memory. If the compression ratio of 2:1 (i.e. 200%) holds, then a ZRAM device sized to 50% of RAM will not use more than 25% of RAM.

I'll try to clear this up somehow; probably avoid using the term ratio and just go with fraction/percentage. And also note the use of 'zramctl' to see the actual compression ratio.

> * This Change implies the de facto death of hibernation in Fedora.
> Good riddance, IMHO. It never worked safely.

UEFI Secure Boot put us on this path. There's still no acceptable authenticated encrypted hibernation image scheme, and the SUSE developer working on it told me a few months ago that the status is the same as last year and there's no ETA for when he gets the time to revisit it.

> * Can the upgrade process be made to detect the lack of existing swap
> and not enable the zswap in that case?

(We're not using zswap at all in this implementation. zram != zswap - easy mistake.)

I expect in the "already has a swap partition" case, no one will complain about getting a 2nd swap device that's on /dev/zram0, because it'll just be faster and "spill over" to the bigger swap-on-disk. It's the use case where "I explicitly did not create a swap device because I hate swap thrashing" that I suspect there may be complaints about. We can't detect that sentiment. All we could do is just decide that no upgrades get the feature.

> Generally, we should probably assume (given existing defaults) that
> anyone who has no swap running chose that explicitly and to change it
> would lead to complaints.

Perhaps. But I assert their decision is based on both bad information (wrong assumptions) and prior bad experience. Even if in the end the decision is to not apply the feature on upgrades, I think it's worth some arguing to counter wrong assumptions and bad experiences.

> * If you're going to do the Supplements:, you need to do `Supplements:
> fedora-release-common` or you won't get everyone. The `fedora-release`
> package is for non-Edition/Spin installs.

Right. Fixed.

I suggest these three lines in the configuration (I've updated the change proposal's how-to-test section to include this):

[zram0]
memory-limit = none
zram-fraction = 0.5

There is no cap functionality yet in the generator.

--
Chris Murphy
preview of swap on ZRAM feature/change
Hi,

This topic has been discussed a couple times on devel@ over the past year - related to resource control, and better interactivity in low memory situations. And now I have a preview of the change proposal ready:

https://fedoraproject.org/wiki/Changes/SwapOnZRAM

The proposal aims for default partitioning, for all Fedora editions and spins, to not create a swap-on-disk partition, and instead create a compression-based RAM disk, called ZRAM, and use that for swap.

Previous conversations with cloud and server folks suggest it's somewhat common to not have swap at all. Hopefully I can change your minds. :D Fast swap is good.

I'm confident a one-size-fits-all size for the ZRAM device is possible, as a fraction of RAM, with a max size (cap). This should be aggressive enough for low memory devices, while also not expending as much overhead for systems with a lot of memory. It might be an option to ship different configurations, if necessary.

There is a test day planned. But I'd like to get solid buy-in from cloud and server folks before then.

Thanks,

--
Chris Murphy
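For anyone who wants to try it ahead of the test day, the opt-in is a small config file. This mirrors the snippet in the change proposal's how-to-test section (option names as understood by zram-generator at the time, so double-check against the generator's README):

```ini
# /etc/systemd/zram-generator.conf
# A zram device sized to half of RAM; no memory-limit on the device.
[zram0]
memory-limit = none
zram-fraction = 0.5
```

After a reboot, `swapon --show` and `zramctl` should list /dev/zram0.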
OOM managers, resource control
Hi,

The Workstation working group continues to evaluate oom managers and seek input from domain experts on the subject. I've come across this video from the All Systems Go conference, on the larger subject of resource control. It's server+cloud+container oriented, so I think it speaks directly to your use cases.

Resource Control (2019)
Dan Schatzberg
https://www.youtube.com/watch?v=30i7SamZxRU

I'm looking into setting up a discussion session with Dan, and doing some Q&A. I'll report back when I know more about that.

--
Chris Murphy
Re: earlyoom by default
On Mon, Jan 13, 2020 at 10:51 AM Dusty Mabe wrote:
>
> On 1/8/20 5:21 PM, Chris Murphy wrote:
> > On Mon, Jan 6, 2020 at 7:56 PM Dusty Mabe wrote:
> >>
> >> For cloud at least it's very common to not have swap. I'd argue for servers
> >> you don't want them swapping either but resources aren't quite as elastic as
> >> in the cloud so you might not be able to burst resources like you can in
> >> the cloud.
> >
> > There's also discussion about making oomd a universal solution for
> > this; but I came across this issue asserting PSI (kernel pressure
> > stall information) does not work well without swap.
> > https://github.com/facebookincubator/oomd/issues/80
> >
> > Ignoring whether+what+when a workaround may be found for that, what do
> > you think about always having swap-on-ZRAM enabled in these same
> > environments? The idea there is a configurable size /dev/zram block
> > device (basically a compressible RAM disk) on which swap is created.
> > Based on discussions with anaconda, IoT, Workstation, and systemd
> > folks - I think there's a potential to converge on systemd-zram
> > generator (rust) to do this.
> > https://github.com/systemd/zram-generator
> >
> > Workstation wg is mulling over the idea of dropping separate swap
> > partitions entirely, and using a swap-on-ZRAM device instead; possibly
> > with a dynamically created swapfile for certain use cases like
> > hibernation. So I'm curious if this might have broader appeal, and get
> > systemd-zram generator production ready.
>
> Seems like an interesting concept. Since it doesn't require any disk setup
> it's easy to turn it off or configure it I assume.
>
> +1

Yes. My suggestion is to install this generator distribution-wide. The on/off switch is the existence of a configuration file. If there's no config, the generator is a no-op. And it won't run in containers regardless.

Next, the discussion is whether the distribution default is with config or without config. Either way it's overridable. I think a reasonable universal default would be something like a zram:RAM ratio of 1:2 or 1:1, capped to somewhere around 2-4G. The rationale:

- Fedora IoT folks have used swap on zram by default out of the box (via the zram package, not this zram-generator) for a long time, maybe since the beginning.
- Upstream zram kernel devs say it's reasonable to go up to 2:1 because compression ratios are about 2:1, but it's pointless to go above that. Therefore 1:1 is quite conservative, and 0.5 is even more conservative but still useful.
- 1:1 is consistent with existing defaults (Anaconda, anyway).
- The cap means systems with a lot of RAM will only use it incidentally. Swap thrashing with traditional swap is IO bound, but becomes CPU bound on a zram device (because of all the compression/decompression hits). So making it small avoids too much of that.
- It considers upgrade behavior, where existing traditional swap on a partition is being used: create the swap-on-zram device with a high priority, so it's used first.

--
Chris Murphy
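The "fraction with a cap" sizing is simple arithmetic; a sketch in shell (illustrative only - the generator itself would do this sizing internally, and as noted elsewhere in the thread it doesn't implement a cap yet):

```shell
# Compute a zram size of half of RAM, capped at 4 GiB, from MemTotal (kiB).
mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
half_kib=$(( mem_kib / 2 ))
cap_kib=$(( 4 * 1024 * 1024 ))
if [ "$half_kib" -lt "$cap_kib" ]; then
    zram_kib=$half_kib
else
    zram_kib=$cap_kib
fi
echo "zram device size: ${zram_kib} KiB"
```

On an 8 GiB machine this yields half of RAM; on a 64 GiB machine it stays pinned at the 4 GiB cap.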
Re: earlyoom by default
On Mon, Jan 6, 2020 at 7:56 PM Dusty Mabe wrote:
>
> For cloud at least it's very common to not have swap. I'd argue for servers
> you don't want them swapping either but resources aren't quite as elastic as
> in the cloud so you might not be able to burst resources like you can in the
> cloud.

There's also discussion about making oomd a universal solution for this; but I came across this issue asserting PSI (kernel pressure stall information) does not work well without swap.
https://github.com/facebookincubator/oomd/issues/80

Ignoring whether+what+when a workaround may be found for that, what do you think about always having swap-on-ZRAM enabled in these same environments? The idea there is a configurable-size /dev/zram block device (basically a compressible RAM disk) on which swap is created. Based on discussions with anaconda, IoT, Workstation, and systemd folks - I think there's a potential to converge on systemd zram-generator (rust) to do this.
https://github.com/systemd/zram-generator

Workstation wg is mulling over the idea of dropping separate swap partitions entirely, and using a swap-on-ZRAM device instead; possibly with a dynamically created swapfile for certain use cases like hibernation. So I'm curious if this might have broader appeal, and get systemd zram-generator production ready.

--
Chris Murphy
earlyoom by default
Hi server@ and cloud@ folks,

There is a system-wide change to enable earlyoom by default on Fedora Workstation. It came up in today's Workstation working group meeting that I should give you folks a heads up about opting into this change.

Proposal:
https://fedoraproject.org/wiki/Changes/EnableEarlyoom

Devel@ discussion:
https://lists.fedoraproject.org/archives/list/de...@lists.fedoraproject.org/message/YXDODS3G4YCS7MT4J2QJMJ7EXCVR7NQ2/

The main issue on a workstation, heavy swap leading to an unresponsive system, is perhaps not as immediately frustrating on a server. But the consequences of an indefinite hang, or the kernel oom-killer triggering (which is a SIGKILL), are perhaps worse.

On the plus side, earlyoom is easy to understand, and its first attempt is a SIGTERM rather than a SIGKILL. It uses oom_score, same as the kernel oom-killer, to determine the victim. The SIGTERM is issued to the process with the highest oom_score only if both memory and swap reach 10% free. And SIGKILL is issued to the process with the highest oom_score once memory and swap reach 5% free. Those percentages can be tweaked, but the KILL percentage is always 1/2 of the TERM percentage, so it's a bit rudimentary.

One small concern I have is: what if there's no swap? That's probably uncommon for servers, but I'm not sure about cloud. In this case, SIGTERM happens at 10% of RAM, which leaves a lot of memory on the table, and for a server with significant resources it's probably too high. What about 4%? Maybe still too high? One option I'm thinking of is a systemd conditional that would not run earlyoom on systems without a swap device, which would leave these systems no worse off than they are right now. [i.e. they eventually recover (?), indefinitely hang (likely), or oom-killer finally kills something (less likely).]
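Since both earlyoom and the kernel pick the victim by oom_score, the current ranking is easy to inspect from /proc (a sketch; no root needed):

```shell
# Print the five processes with the highest oom_score: score, pid, comm.
for d in /proc/[0-9]*; do
    score=$(cat "$d/oom_score" 2>/dev/null) || continue
    comm=$(cat "$d/comm" 2>/dev/null) || continue
    printf '%s\t%s\t%s\n' "$score" "${d#/proc/}" "$comm"
done | sort -rn | head -n 5
```

Whatever tops this list is what the SIGTERM (and later the SIGKILL) would hit first.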
I've been testing earlyoom, nohang, and the kernel oom-killer for > 6 months now, and I think it would be completely sane for Server and Cloud products to enable earlyoom by default for fc32, while evaluating other solutions that can be more server oriented (e.g. nohang, oomd, possibly others) for fc33/fc34.

What is clear: this isn't going to be solved by kernel folks. The kernel oom-killer only cares about keeping the kernel alive; it doesn't care about user space at all. In the cases where this becomes a problem, either the kernel hangs indefinitely or does a SIGKILL on your database or whatever is eating up resources. Whereas at least earlyoom's first attempt is a SIGTERM, so it has a chance of gracefully quitting.

There are some concerns; those are in the devel@ thread, and I expect they'll be adequately addressed or the feature will not pass the FESCo vote. But as a short term solution while evaluating more sophisticated options, I think this is a good call, so I thought I'd just mention it in case you folks want to be included in the change.

--
Chris Murphy
Re: F32, enable fstrim.timer by default
On Wed, Dec 18, 2019 at 5:32 AM Martin Kolman wrote:
> This will also trim thin LVs on thin pools (if any), right?
>
> So not just hardware, it can even make "software" storage layouts faster
> & potentially even avoid pool exhaustion in some cases. :)

Just a reminder, the underlying unit, fstrim.service, uses 'fstrim --fstab', so only fstab file systems are affected. The user would need to change the unit file to use --all instead of --fstab to affect all mounted file systems. I'll include that info in the change wiki. I imagine the best practice is to copy the original unit file, edit it, and use it as a drop-in unit file in /etc?

Unfinished change, still in progress...
https://fedoraproject.org/wiki/Changes/EnableFSTrimTimer

--
Chris Murphy
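Rather than copying the whole unit, a short drop-in can override just ExecStart (a sketch; the fstrim path is assumed from current Fedora packaging, and `systemctl edit fstrim.service` creates this kind of file for you):

```ini
# /etc/systemd/system/fstrim.service.d/override.conf
# Clear the vendor ExecStart, then trim all mounted filesystems that
# support discard, instead of only the ones listed in fstab.
[Service]
ExecStart=
ExecStart=/usr/sbin/fstrim --all
```

Run `systemctl daemon-reload` afterwards so the override is picked up.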
Re: F32, enable fstrim.timer by default
On Wed, Dec 18, 2019 at 11:10 AM Neal Gompa wrote: > > On Wed, Dec 18, 2019 at 12:08 PM Chris Murphy wrote: > > > > On Wed, Dec 18, 2019 at 10:56 AM Chris Murphy > > wrote: > > > > > > One thing I see in the Ubuntu fstrim.service unit file that I'm not > > > seeing in the Fedora fstrim.service unit file, is a conditional for > > > containers (line 4). I'm not sure where to ask about that. Maybe > > > upstream systemd? > > > > Found it. I'm not sure if util-linux would typically be found in a > > container base image? Probably no point in calling fstrim in that > > case, but also doesn't hurt. > > > > $ sudo dnf provides /usr/lib/systemd/system/fstrim.timer > > util-linux-2.34-3.fc31.x86_64 : A collection of basic system utilities > > Repo: @System > > Matched from: > > Filename: /usr/lib/systemd/system/fstrim.timer > > > > Yeah, util-linux is pretty common in some types of containers. It > probably makes sense to send a PR to add that conditional. Appears to be in the upstream version already. -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org
Re: F32, enable fstrim.timer by default
On Wed, Dec 18, 2019 at 10:56 AM Chris Murphy wrote: > > One thing I see in the Ubuntu fstrim.service unit file that I'm not > seeing in the Fedora fstrim.service unit file, is a conditional for > containers (line 4). I'm not sure where to ask about that. Maybe > upstream systemd? Found it. I'm not sure if util-linux would typically be found in a container base image? Probably no point in calling fstrim in that case, but also doesn't hurt. $ sudo dnf provides /usr/lib/systemd/system/fstrim.timer util-linux-2.34-3.fc31.x86_64 : A collection of basic system utilities Repo: @System Matched from: Filename: /usr/lib/systemd/system/fstrim.timer -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org
Re: F32, enable fstrim.timer by default
readd cloud@ list On Wed, Dec 18, 2019 at 5:32 AM Martin Kolman wrote: > > On Wed, 2019-12-18 at 13:11 +0100, Martin Pitt wrote: > > Hello Chris, > > > > Chris Murphy [2019-12-17 22:23 -0700]: > > > This desktop@ thread [1] about a slow device restored by enabling > > > fstrim.service, got me thinking about enabling fstrim.timer [2] by > > > default in Fedora Workstation. But I'm curious if it might be > > > desirable in other Fedora Editions, and making it a system-wide > > > change? > > > > This is a function/property of hardware, so it's IMO not desktop specific at > > all. Servers suffer just as well from hard disks becoming slower. > This will also trim thin LVs on thin pools (if any), right ? Correct. > So not just hardware, it can even make "software" storage layouts faster > & potentially even avoid pool exhaustion in some cases. :) Maybe. I'm not sure either way if someone would actually notice performance improvements, but it wouldn't make them worse. And yes, potentially avoid pool exhaustion, in particular in under provisioned cases. I forgot to mention: with qemu/kvm/libvirt VM's, the trim would not get passed down to the backing storage due to default settings. The discard mode "unmap" is supported with a SCSI disk using virtio SCSI controller; I see some curious works/doesn't work with "plain" virtio disk. But when it works, it does pass down to underlying thinp LV and raw files. If anyone is doing raw file backups, or otherwise paying for storage, it could save them some coins. And when it doesn't work, literally nothing happens: the file doesn't get holes punched out, but there's also no corruption (I test these things with a Btrfs scrub; it will consistently always complain if a single metadata or data checksum mismatches). Anyway, it's an optimization. Pretty well tested elsewhere at this point. And offhand not aware of any liabilities, but thought I'd ask about it before writing up a system wide change proposal. 
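[For reference, the discard mode mentioned above is set per-disk in the libvirt domain XML. A minimal sketch (file path and device names are illustrative):]

```xml
<!-- virtio-scsi disk with discard passdown enabled: a guest fstrim can
     then punch holes in the backing qcow2/raw file, or pass the trim
     down to a thin LV -->
<controller type='scsi' model='virtio-scsi'/>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' discard='unmap'/>
  <source file='/var/lib/libvirt/images/guest.qcow2'/>
  <target dev='sda' bus='scsi'/>
</disk>
```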
One thing I see in the Ubuntu fstrim.service unit file that I'm not seeing in the Fedora fstrim.service unit file, is a conditional for containers (line 4). I'm not sure where to ask about that. Maybe upstream systemd?

$ cat /lib/systemd/system/fstrim.service
[Unit]
Description=Discard unused blocks on filesystems from /etc/fstab
Documentation=man:fstrim(8)
ConditionVirtualization=!container

[Service]
Type=oneshot
ExecStart=/sbin/fstrim --fstab --verbose --quiet
ProtectSystem=strict
ProtectHome=yes
PrivateDevices=no
PrivateNetwork=yes
PrivateUsers=no
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectControlGroups=yes
MemoryDenyWriteExecute=yes
SystemCallFilter=@default @file-system @basic-io @system-service

-- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org
F32, enable fstrim.timer by default
Hi, This desktop@ thread [1] about a slow device restored by enabling fstrim.service, got me thinking about enabling fstrim.timer [2] by default in Fedora Workstation. But I'm curious if it might be desirable in other Fedora Editions, and making it a system-wide change? I've checked recent versions of openSUSE and Ubuntu, and they have it enabled. Therefore I estimate the likelihood of running into the cons (below) is pretty remote. Most people won't notice anything.

The pros:

+ When passed down to flash drives that support trim, it provides a hint to the drive firmware about erase blocks ready for erasure. Some devices will have improved wear leveling and performance as a result, but this is firmware specific. LVM [3] and dm-crypt [4] passdown appears to be enabled by default on Fedora.

+ With LVM thin provisioning, it will cause unused LV extents to be returned to the thin pool for use by other LVs; kind of a nifty workaround for XFS not supporting fs shrink resize.

The gotchas:

+ A few, but highly visible, reports of buggy SSDs that corrupt or lose data soon after trim is issued. By now, most have been blacklisted in the kernel, and/or have manufacturer firmware updates. We shouldn't run into this problem unless someone has older hardware that hasn't been updated and for some reason also hasn't been blacklisted in the kernel.

+ Older SSDs have only non-queued trim support, which can result in a brief hang while the command is processed. This is highly variable based on the device firmware and the workload. But weekly fstrim is preferred for these devices, instead of the discard mount option in /etc/fstab.

+ Possible exposure of the fs locality pattern may be a security risk for some workflows.
[4] [5]

[1] https://lists.fedoraproject.org/archives/list/desk...@lists.fedoraproject.org/message/UHINXYYGEYD727HIUHF3DQ7ZPCZHXWOK/

[2] fstrim.timer, if enabled, runs fstrim.service weekly, specifically Monday at midnight local time; and if the system isn't available at that time, it runs during or very soon after the next startup. The command is: ExecStart=/usr/sbin/fstrim --fstab --verbose --quiet. fstab means only file systems in fstab are included; verbose reports the mount point and bytes potentially discarded, and is recorded in the systemd journal; quiet suppresses errors, which is typical for file systems and devices that don't support fstrim, e.g. the EFI System partition (FAT16/32), USB flash "stick" drives, and hard drives.

[3] /etc/lvm/lvm.conf, if I'm reading it correctly: file system discards are passed down:
# This configuration option has an automatic default value.
# thin_pool_discards = "passdown"

Due to this Fedora 27 feature [4], trim is passed down by dm-crypt as well for LUKS volumes. Curiously, because Fedora neither sets the discard mount option for any file system nor enables fstrim.timer, this feature isn't being taken advantage of.

[4] https://fedoraproject.org/wiki/Changes/EnableTrimOnDmCrypt

[5] Trim on LUKS/dm-crypt note from upstream, section 5.19: https://gitlab.com/cryptsetup/cryptsetup/-/wikis/FrequentlyAskedQuestions#5-security-aspects

-- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org
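[Whether a given device accepts discard at all can be checked without issuing one. The sysfs value below is what tools like lsblk --discard report as DISC-GRAN; this loop is a plain-shell sketch of that check:]

```shell
# For each block device, a nonzero discard_granularity means the device
# accepts TRIM/discard; zero means fstrim on that device is a no-op.
for f in /sys/block/*/queue/discard_granularity; do
    [ -e "$f" ] || continue
    printf '%s: %s\n' "$f" "$(cat "$f")"
done
```

['lsblk --discard' presents the same information in a friendlier table.]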
[atomic-wg] Issue #281: Figure out comprehensive strategy for atomic host container storage
chrismurphy added a new comment to an issue you are following: `` I read a list of problems already with negative arguments, not supplied by me. And I've presented something that obviates literally all of them. I see it as advice, not debate. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/281 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #281: Figure out comprehensive strategy for atomic host container storage
chrismurphy added a new comment to an issue you are following: `` Yeah, I wasn't considering anything we don't have in anaconda, but then also anything not already in the Fedora kernel for some time now. Plus ZFS lacks fs shrink, so you can't remove block devices arbitrarily; it also lacks online replication and seeding. So even if it weren't for licensing, it wouldn't be the direction I'd go in. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/281 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #281: Figure out comprehensive strategy for atomic host container storage
chrismurphy added a new comment to an issue you are following: `` Ahh sorry, I kinda figured realistically there are only three options: ext4, XFS, and Btrfs, and the only one not mentioned so far is Btrfs. There are some hits of people using it in AWS contexts, but I have not yet run across Btrfs + overlayfs. So, I started a thread on linux-bt...@vger.kernel.org to see if anyone's using containers with Btrfs + overlayfs. Insofar as I'm aware it's not a pathological combination; I'd just guess that to the vast majority it seems redundant, but they each bring different things to the table. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/281 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #281: Figure out comprehensive strategy for atomic host container storage
chrismurphy added a new comment to an issue you are following: `` All the partitioning, sizing, and resizing concerns mentioned in this issue vanish with a certain other filesystem, which does all resizes (grow, shrink, add and remove devices) online and atomically and typically in a single command. Whether scripted or user issued, the commands are shorter, easier to understand, complete faster and are safer. Gotcha though is I haven't used it with overlayfs. A cursory search yields no hits. But it seems sane to allow Docker to continue to use overlayfs for the shared page cache benefit, and even snapshotting (if Docker supports that overlayfs feature now?). But the main pro is that you can have separate fstrees read-only or read-write mounted, but they share the same storage pool, without hard barriers between them. *shrug* `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/281 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[fedora-atomic] Issue #64: wireless firmware not included on ISO installations
chrismurphy reported a new issue against the project: `fedora-atomic` that you are following: ``
Version: Fedora-Atomic-ostree-x86_64-26-20170619.n.0.iso, tested on an Intel NUC (a baremetal installation).

Problem, actual results: The installation media and environment have wireless firmware, and wireless connects fine. But on reboot there is no networking, and kernel messages indicate the problem is due to firmware not being installed.

Expected results: Wifi firmware should be included in this ostree repo for baremetal installation; or alternatively, make it more clear that these images aren't intended for baremetal installation.
`` To reply, visit the link below or just reply to this email https://pagure.io/fedora-atomic/issue/64 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #257 `atomic host tree in F26 does not boot properly`
chrismurphy added a new comment to an issue you are following: `` OK this seems bad:

plymouth-start.service: Executing: /usr/sbin/plymouthd --mode=boot --pid-file=/var/run/plymouth/pid --attach-to-session
[3.372107] general protection fault: [#1] SMP
[3.372654] Modules linked in: virtio_console(+) parport snd_timer qemu_fw_cfg snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace 8139too qxl drm_kms_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel serio_raw virtio_pci virtio_ring 8139cp mii virtio ata_generic pata_acpi sunrpc scsi_transport_iscsi
[3.374009] CPU: 1 PID: 667 Comm: systemd-udevd Not tainted 4.11.0-0.rc1.git0.1.fc26.x86_64 #1
[3.374009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
[3.374009] task: a0b87521 task.stack: af5a40758000
[3.374009] RIP: 0010:vp_modern_find_vqs+0x39/0x70 [virtio_pci]
[3.374009] RSP: 0018:af5a4075ba68 EFLAGS: 00010282
[3.374009] RAX: af5a403ed000 RBX: 2d306f6974726976 RCX:
[3.374009] RDX: 00fc RSI: af5a403ed01c RDI: 0001
[3.374009] RBP: af5a4075ba88 R08: 0001c8e0 R09: 9341ea29
[3.374009] R10: e08981dd0c40 R11: R12: a0b834bda708
[3.374009] R13: R14: a0b834bda400 R15: 001f
[3.374009] FS: 7fd9211438c0() GS:a0b87fd0() knlGS:
[3.374009] CS: 0010 DS: ES: CR0: 80050033
[3.374009] CR2: 55f2eeb94728 CR3: 7610c000 CR4: 003406e0
[3.374009] Call Trace:
[3.374009] init_vqs+0x1a0/0x2e0 [virtio_console]
[3.374009] virtcons_probe+0xb9/0x360 [virtio_console]
[3.374009] virtio_dev_probe+0x144/0x1e0 [virtio]
[3.374009] driver_probe_device+0x106/0x450
[3.374009] __driver_attach+0xa4/0xe0
[3.374009] ? driver_probe_device+0x450/0x450
[3.374009] bus_for_each_dev+0x6e/0xb0
[3.374009] driver_attach+0x1e/0x20
[3.374009] bus_add_driver+0x1d0/0x270
[3.374009] ? virtio_cons_early_init+0x1d/0x1d [virtio_console]
[3.374009] driver_register+0x60/0xe0
[3.374009] ? virtio_cons_early_init+0x1d/0x1d [virtio_console]
[3.374009] register_virtio_driver+0x20/0x30 [virtio]
[3.374009] init+0x9f/0xfe3 [virtio_console]
[3.374009] do_one_initcall+0x50/0x1a0
[3.374009] ? free_hot_cold_page+0x19a/0x300
[3.374009] ? kmem_cache_alloc_trace+0x15f/0x1c0
[3.374009] ? do_init_module+0x27/0x1e6
[3.374009] do_init_module+0x5f/0x1e6
[3.374009] load_module+0x22b7/0x2820
[3.374009] ? __symbol_put+0x60/0x60
[3.374009] SYSC_init_module+0x16f/0x1a0
[3.374009] SyS_init_module+0xe/0x10
[3.374009] do_syscall_64+0x67/0x170
[3.374009] entry_SYSCALL64_slow_path+0x25/0x25
[3.374009] RIP: 0033:0x7fd91fda53da
[3.374009] RSP: 002b:7ffdf7f18d38 EFLAGS: 0246 ORIG_RAX: 00af
[3.374009] RAX: ffda RBX: 55f2eeb6f6a0 RCX: 7fd91fda53da
[3.374009] RDX: 7fd9208da9c5 RSI: b37b RDI: 55f2eeb893a0
[3.374009] RBP: 7fd9208da9c5 R08: 55f2eeb74e80 R09: 0078
[3.374009] R10: 7fd92005fb00 R11: 0246 R12: 55f2eeb893a0
[3.374009] R13: 55f2eeb6c140 R14: 0002 R15: 55f2ede5dfca
[3.374009] Code: 54 53 49 89 fe e8 78 0d 00 00 85 c0 41 89 c5 75 44 49 8b 9e 08 03 00 00 4d 8d a6 08 03 00 00 4c 39 e3 74 31 49 8b 86 38 03 00 00 <0f> b7 7b 28 48 8d 70 16 e8 3a e6 15 d3 49 8b 86 38 03 00 00 bf
[3.374009] RIP: vp_modern_find_vqs+0x39/0x70 [virtio_pci] RSP: af5a4075ba68
[3.410405] ---[ end trace a110b926d7e8d96b ]---

`` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/257 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #185 `November 21 ISO is not bootable on UEFI`
chrismurphy added a new comment to an issue you are following: `` VM install of Fedora-Atomic-ostree-x86_64-25-20170118.1.iso to a clean LV succeeds, for both BIOS and UEFI firmware. Using default partitioning, the required layout is created. Both installations completely startup, I can login, docker-pool has been created, and docker is running. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/185 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #186 `switch to overlay2`
chrismurphy added a new comment to an issue you are following: `` Sorry for the confusing report. docker-root-lv was created automatically when /etc/sysconfig/docker-storage-setup contained:

> STORAGE_DRIVER=overlay2
> DOCKER_ROOT_VOLUME=yes

Upon stopping docker and issuing atomic storage reset, this LV is removed. If I don't make changes to /etc/sysconfig/docker-storage-setup, then a docker-pool LV (which is actually a dm thin pool) is created; and upon stopping docker and issuing atomic storage reset, this pool is likewise removed. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/186 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #186 `switch to overlay2`
chrismurphy added a new comment to an issue you are following: ``
> dustymabe
> Am I missing something? Did I make some bad assumptions somewhere in this test?

Nope, works for me as well. /var is still a directory on the ext4 rootfs, but it looks like a new LV is created at 40% of the free space in the VG, formatted XFS, and var-lib-docker.mount mounts it at /var/lib/docker; that mount file is created by the code triggered by DOCKER_ROOT_VOLUME=yes. I did additionally try a migrate from devicemapper to overlay2 using atomic storage export + reset + modify + import, and it does work. There is no automatic space recapture of the docker-root-lv LV, however; it could be deleted by the user after the modify step, then reboot so docker-storage-setup sets up the dm-thin pool, and then do the import. I'm assuming in any case that there needs to be temp space somewhere for the exported containers. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/186 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
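[For readers unfamiliar with systemd mount units, the generated var-lib-docker.mount looks roughly like this. This is a sketch reconstructed from memory, not docker-storage-setup's actual output; the VG and LV names are illustrative:]

```ini
# var-lib-docker.mount (illustrative sketch; unit name must match the
# mount point, i.e. /var/lib/docker -> var-lib-docker.mount)
[Unit]
Description=Mount docker root LV created by DOCKER_ROOT_VOLUME=yes
Before=docker.service

[Mount]
What=/dev/atomicos/docker-root-lv
Where=/var/lib/docker
Type=xfs

[Install]
WantedBy=local-fs.target
```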
[atomic-wg] Issue #186 `switch to overlay2`
chrismurphy added a new comment to an issue you are following: `` > dustymabe > with DOCKER_ROOT_VOLUME and overlayfs using that then all of /var/lib/docker > would be taken care of. Please let me know if I'm wrong. It'll work on a conventional installation. I'm skeptical it'll work on an rpm-ostree installation because /var is already a bind mount performed by ostree during the startup process. So I'm pretty sure ostree is going to have to know about the "true nature" of a separate var partition, mount it, then bind mount it correctly. >I tend to think more about the cloud use case where you spin up a >preconfigured image. What I was referring to is having docker-storage-setup be >able to make the switch for us. I don't have a strong opinion on where the proper hinting belongs to indicate which driver to use. The user already has to setup #cloud-config so maybe the hint belongs in there, and either it does something to storage which is then understood by docker-storage-setup, or the hint is just a baton to docker-storage-config to do it, just depends on which is more flexible and maintainable. > This means we can essentially look at if the user provided overlay or DM and > do whatever they asked. > - If they provided overlay then we can just extend the root partition and go > on our merry way. > - If they also specified DOCKER_ROOT_VOLUME=yes then they want overlay on > another partition, did they specify a partion? yes, use that one. no, create > an LV. > - If they provided DM then create new LVs and set it up just like we have > been doing before this discussion started. Seems reasonable. But I have zero confidence at the moment that ostree can handle a separate /var file system; it's a question for Colin what assumptions are being made and I think it assumes it's directory that it bind mounts somewhere, and if it's really a separate volume, then something has to mount it first before it can be bind mounted elsewhere. 
An additional trick is testing any changes against Btrfs where mounting subvolumes explicitly is actually a bind mount behind the scene. That should just work but... `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/186 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #186 `switch to overlay2`
chrismurphy added a new comment to an issue you are following: `` >vgoyal >IIUC, you are saying that use a thin LV for rootfs to work around xfs shrink >issue? People have tried that in the past and there have been talks about that >many a times. There are still issues with xfs on top of thin lv and how no >space situation is handled etc. Bottom line, we are not there yet. You mean thin pool exhaustion? Right now the atomic host default uses the docker devicemapper driver which is XFS on a dm-thin pool. So I don't understand why one is OK and the other isn't. >So if we can't use rootfs on thin LV and if xfs can't be shrinked, then only >way to flip back to devicemapper is don't allow rootfs to use all free space. When hosted in the cloud, isn't it typical to charge for allocated space whether it's actively used or not? >jberkus >If that reason is invalid, we should again consider making "one big partition" >the default for Overlay2 installations. Yes. It's the same effort to add more space (partition, LV, raw/qcow2), make it an LVM PV, and add to the VG and then let docker-storage-setup create a docker-pool thin pool from that extra space. >dwalsh >We have tools that allow you to switch back to devicemapper if their is >partioning, which is why we want to keep partitioning. If this was easy to >switch from no partioning to partitioned, then I would agree with just default >to overlay without partitions. My interpretation of jberkus "one big partition" is a rootfs LV that uses all available space in the VG, reserving nothing. But it's still possible to add a PV to that VG and either grow rootfs for continued use of overlay2; or to fallback to devicemapper. I don't interpret it literally to mean dropping LVM. You'd probably want some way of doing online fs resize as an option, and that requires rootfs on LVM or Btrfs, not a plain partition. 
I think it's a coin toss having this extra space already available in the VG, vs expecting the admin to enlarge the backing storage or add an additional device, which is then added to the VG, which can then grow rootfs (overlay2) or be used as fallback with the Docker devicemapper driver. >dustymabe >I would like to also point out that one other benefit would be to prevent >containers from cannibalizing your root partition. Not possible by making /var a separate file system, you'd have to use quotas. Ostree owns /var, it must be a directory on rootfs at present. >I prefer overlay2 and would like to see there be only one option so that we >can have less confusion in the future. However, giving users the choice is >nice as well. Maybe there is a way to achieve both on startup. You could have two kickstarts: overlay2 and devicemapper, and each kickstart is specified using a GRUB menu entry on the installation media. The devicemapper case uses the existing kickstart and depends on the existing docker-storage-setup "use 40% of VG free space for a dm-thin pool"; the overlay2 kickstart would cause the installer to use all available space for rootfs, leaving no unused space in the VG. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/186 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
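[The "one big partition" overlay2 layout discussed above could be expressed as a kickstart storage section along these lines. This is a sketch; the atomicos VG name mirrors Atomic's default, everything else (sizes, fstypes) is illustrative:]

```
zerombr
clearpart --all --initlabel
part /boot --size=512 --fstype=ext4
part pv.01 --grow
volgroup atomicos pv.01
# rootfs takes all remaining VG space; nothing is reserved for a
# docker-pool thin pool, so overlay2 lives on the root filesystem
logvol / --vgname=atomicos --name=root --fstype=xfs --grow
```

[The devicemapper variant would instead cap the root logvol's size, leaving free extents in the VG for docker-storage-setup's thin pool.]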
[atomic-wg] Issue #186 `switch to overlay2`
chrismurphy added a new comment to an issue you are following: `` Flipping from one to the other will take free space somewhere for the 'atomic storage export/import' operation to temporarily store docker images and containers to. A way around the xfs lack of shrink issue is to put the filesystem containing /var onto a thinly provisioned LV (be it a dir on rootfs or its own volume). After 'atomic storage reset' wipes the docker storage, issue fstrim, and all the previously used extents will be returned to the thin pool, which can then be returned to the VG, which can then be reassigned to a new docker thin pool. Convoluted in my opinion, but doable. The problem I'm having migrating from devicemapper to overlay is add /var to fstab isn't working. Systemd picks it up, but no mount command is issued. Seems like there's a problem making sure it happens after ostree switchroot as there's no /var directory prior to the ostree rootfs being setup. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/186 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #186 `switch to overlay2`
chrismurphy added a new comment to an issue you are following: `` dwalsh mentioned the a way to flip between them https://www.spinics.net/linux/fedora/fedora-cloud/msg07620.html Missing from that sequence is actually configuring the new storage if it doesn't exist yet. I think putting custom partitioning into the hands of users, and then supporting those arbitrary layouts, is asking for endless trouble. Pick your battles, ignore the rest. The more versatile production solution for dealing with runaway usage of space is quotas. But lack of familiarity causes people to keep running back to the familiar torture of fs resize and repartitioning. I'm hopeful the storaged and Cockpit folks will one day help solve this. Partitioning to solve these problems is so last century. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/186 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: Fedora 26 change: using overlayfs as default
On Tue, Dec 13, 2016 at 8:01 AM, Daniel J Walsh <dwa...@redhat.com> wrote: > > The only way to change from one storage to the other is to use > > atomic storage export > change the config > atomic storage reset > atomic storage import Nifty. A migration tool would have to juggle the potential for insufficient space in /var for the export; or sufficient space for export but then not importing. And then there's cleanup of otherwise dead space used by device mapper. So possibly more than one fs resize is necessary. I'd say probably leave things alone for upgrades, but documenting a strategy for migrating to overlay is OK. -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: Fedora 26 change: using overlayfs as default
On Mon, Dec 12, 2016 at 3:13 PM, Josh Berkus <jber...@redhat.com> wrote: > On 12/12/2016 02:12 PM, Dusty Mabe wrote: >> >> After I get a bug[1] fixed and out the door I'm going to publish >> a blog post/docs on setting up Fedora 25 Atomic host and/or Cloud >> base to use overlay2 as the storage driver for docker. >> >> I'd like for everyone that can to test this out and to start running >> their container workloads with overlay2 with selinux enabled and let's >> file bugs and get it cleaned up for Fedora 26 release. >> >> Should we file this as a "change" for Fedora 26? > > I'd say so, yes. I suggest it be discussed by all the work groups, on devel@. It might turn out that Fedora Atomic Host goes first, and there may be some variation (Atomic Host has no need for LVM although it doesn't hurt, where Server would almost certainly want to keep it, and Workstation could flip a coin). > Also, someone needs to test the case of migrating an existing system and > how that looks. It'd need a test for enough free space on /var, which first needs an estimate of every single container image in the thinly provisioned storage; stop docker and change the configuration to use overlay instead of device mapper driver; start docker, import all the tar'd containers. -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #185 `November 21 ISO is not bootable`
chrismurphy added a new comment to an issue you are following: `` Attached logs:

[program.log](/atomic-wg/issue/raw/files/cb8148271c4c88af8e1abebf3d2b725e0f44eaa2ffc65b837c6424c061eb4755-program.log)
[storage.log](/atomic-wg/issue/raw/files/50959389d3087c198eea954198fb25e775139a28dc23ee0d7f0e38dbc99c6d06-storage.log)
[anaconda.log](/atomic-wg/issue/raw/files/20f87acfd264c8a61b384500b5a326831ce96b4d661ecef50bab1e290d7a41b1-anaconda.log)
`` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/185 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
[atomic-wg] Issue #185 `November 21 ISO is not bootable`
chrismurphy added a new comment to an issue you are following: `` Fedora-Atomic-ostree-x86_64-25-20161207.0.iso in virt-manager set to use UEFI; and default automatic partitioning.

program.log:
12:04:03,606 INFO program: Running... efibootmgr
12:04:03,647 INFO program: EFI variables are not supported on this system.
12:04:03,648 DEBUG program: Return code: 2
12:04:03,648 INFO program: Running... efibootmgr -c -w -L Fedora -d /dev/vda -p 1 -l \EFI\fedora\shim.efi
12:04:03,658 INFO program: EFI variables are not supported on this system.
12:04:03,659 DEBUG program: Return code: 2
12:04:03,660 INFO program: Running... grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
12:04:04,055 INFO program: /usr/bin/grub2-editenv: error: cannot rename the file /boot/grub2/grubenv.new to /boot/grub2/grubenv: No such file or directory.
12:04:04,056 INFO program: /sbin/grub2-mkconfig: line 247: /boot/efi/EFI/fedora/grub.cfg.new: No such file or directory
12:04:04,057 DEBUG program: Return code: 1

However, if I get to a vt and run efibootmgr there is no error. So I'm not sure why anaconda has a problem running it. The last two errors likewise don't make sense on their own, so to try and reproduce the problem I tried:

# chroot /mnt/sysimage
chroot: failed to run command '/bin/sh': No such file or directory

Huh. So that usually works on netinstalls and lives. And /bin/sh does exist, it's a symlink to bash, and /bin/bash does exist also. So I'm still confused. `` To reply, visit the link below or just reply to this email https://pagure.io/atomic-wg/issue/185 ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: List of F26 features from Atomic Working Group
It might be beyond the scope of Fedora 26, but I'd like to evaluate the liabilities (pros, cons, and gotchas) of supporting all reasonable user-defined layouts out of the box: ext4, XFS, Btrfs, overlayfs, dm thin. Ideally this is a boolean problem: the setup just needs to know what's being used, and automagically do the right thing. The most obvious flaw with this idea is that the move to overlayfs is intended to shed the baggage of docker-storage-setup and LVM thin; still, I'd like to see atomic support a few sane (however that's defined) layouts and automatically use them, so we can better figure out what works well, and what works poorly, for various use cases. Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: Cloud and Server Q
On Wed, Oct 5, 2016 at 11:57 AM, Josh Berkus <jber...@redhat.com> wrote: > On 10/04/2016 01:38 PM, Matthew Miller wrote: >> On Tue, Oct 04, 2016 at 12:58:05PM -0700, Josh Berkus wrote: >>> What this is sounding like is a huge discrepancy between what the >>> Council, PRD group, etc. think we should be doing and what we can >>> actually do. >>> >>> Given that, I think I should tell the designer to push the design >>> changes back. >> >> I don't see how that follows. In the ideal — and I think most likely, >> since the bugs making F25 not work are being knocked off — case, we'll >> have Atomic built on F25 at F25 GA date. In the less ideal case, we'll >> keep shipping the F24-based one, but there's no reason that can't work >> with the new Atomic-focused design. For that matter, we could launch >> that _before_ the GA. > > So, I'm looking at this from a user perspective. > > * F25 is announced > * User goes to getfedora.org, sees new "atomic" icon. > * User clicks through > * User sees that Atomic is still F24. > > From that point, one of two things happens: > > 1. User files a bug, and we're flooded with "atomic download page not > updated" bugs, or > > 2. user decides that Atomic isn't a real thing and never goes back. > > I really don't see a flow that results in the user checking back two > weeks later to see if Atomic has been updated yet. Especially since > we're dealing with a substantial issue with SELinux and it's not > guaranteed that there will be an F25 atomic release 2 weeks later, either. > > You are the Project Leader, and you can certainly say "do it anyway". > But please understand why I think it's not a great idea. There's roughly 5 weeks to GA to get atomic stuff sorted out, which sounds like there's some padding available. Option A: Burn the midnight oil and commit to the Atomic landing page and its deliverables. 
Option B: Ask the design folks if they're willing and able to prepare a contingency: swap the planned new Atomic landing page for an updated version of the current Cloud landing page if Atomic isn't ready by X days before GA. If you do have to pull the contingency, you can still swap Cloud out for Atomic at any point during Fedora 25's lifetime to underscore the new emphasis. That turns the May and November pressure cooker into a lighter effort spread more broadly over time. I don't see a big problem with a change in branding midstream for a release, but I'm not a marketing type. -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: Cloud and Server Q
On Tue, Oct 4, 2016 at 11:59 AM, Josh Berkus <jber...@redhat.com> wrote: > On 10/04/2016 08:13 AM, Colin Walters wrote: >> >> On Tue, Oct 4, 2016, at 09:46 AM, Paul W. Frields wrote: >>> > >>> > I think mattdm would agree we don't want to potentially, >>> > *indefinitely* block a six-month release with a deliverable that can >>> > be fixed and re-released in two weeks. >> It's not that simple - this is a messy topic. What I think this >> is about isn't delaying or blocking - it's *prioritization*. If >> an issue comes up in Anaconda or systemd or whatever >> that affects the "next AH", we need those teams to priortize >> those fixes the same as they do for Workstation or Server. > > Yes, this is exactly the problem I'm raising. We've had an issue with > F25-base Atomic not booting for a couple weeks now, and until the last > couple of days, nobody has been working on it. It seems to be a simple > fact of the Fedora release cycle that if something isn't > release-blocking, it doesn't get done. This isn't new, it's an issue > which has plagued Fedora Atomic for, as far as I can tell, its entire > existence.

Perhaps in some cases, but it's not always true. Spins get done even though they're not release blocking. The issue with release blocking status is it compels the expert in the particular area of failure to become involved. And that is a limited resource. Possibly a big part of the reason for Atomic failures is a lack of documentation across the board, both for the ostree stuff and for releng's processes; and when ostree failures happen, the logs are often so lacking in detail that a Tarot card reader would have a better chance of guessing what's going on. This makes it difficult to get contributors involved. And it makes it damn near impossible for any of them to become even intermediately competent - it's a heavy investment. 
-- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: Current Fedora 25 Atomic images are failing to boot
On Mon, Oct 3, 2016 at 1:06 PM, Colin Walters <walt...@verbum.org> wrote: > On Mon, Oct 3, 2016, at 02:57 PM, Dusty Mabe wrote: > >> There is a kernel panic happening early in boot. Here is the serial >> console log from one of those boots: > > https://bugzilla.redhat.com/show_bug.cgi?id=1380866 Does this need to be marked as a freeze exception to back the change out for beta candidate images to work? -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: Cloud and Server Q
On Fri, Sep 30, 2016 at 3:31 PM, Josh Berkus <jber...@redhat.com> wrote: > On 09/30/2016 02:01 PM, Josh Boyer wrote: > >> 16:44:56 Cloud base image is the only blocking deliverable. >> 16:44:59 Atomic is not. >> >> I realize this WG is in the middle of rebooting itself, but to have >> clearly conflicting information from the WG members is a bit >> concerning. > > Kushal? > > Based on my attendence at the Cloud WG meetings, I had the understanding > that Atomic was becoming our main deliverable. If that's not true, then > I need to pull a whole bunch of changes and put them on ice until Fedora 26. What also matters is the understanding of others who needed to understand this. To me it sounds like a baton was dropped. But moving forward... What does release blocking mean? There are a bunch of QA criteria and test cases that help make sure those criteria are met. There are no atomic host specific criteria or test cases that I'm aware of. I expect QA probably can't provide significant assistance in QAing the atomic qcow2 image for this release. How big of a problem is that? Is there a Fedora policy that requires a default download product to be QA'd somehow, or to be release blocking? Can Cloud WG take the lead QA'ing the atomic qcow2 image? What are the releng implications of it not being release blocking? For example, during the Fedora 24 cycle there was a neat bug in the compose process that caused some images to fail. It wasn't possible to just do another compose and cherry pick the working ISOs from two different composes (I forget why). Is there anything like that here, or is there sufficiently good isolation between ostree images and other images? What happens if release is a go for everything else, but atomic qcow2 is not working? What I've heard is "fix the problem and remake the image" similar to the current two week cycle. 
Does releng agree, and will there be time between a Thursday "go" and Tuesday (whatever day it is) "release" to get an atomic qcow2 built and on getfedora? What if there isn't? What if it's a week after release before there's a working one? If the liabilities there can be sorted out satisfactorily I'd say proceed with Atomic on getfedora. Next issue is Cloud Base images. Cloud WG needs to decide if these are going to be created and if so how they're going to get linked to and from where. Is there a designed landing page for these already? If not, my thought is to have a sidebar link to a basic directory listing for them, rather than the fancy landing page that currently exists for Fedora 24 Cloud Base images. And demote the Cloud Base images to non-release blocking. And then work out a contingency for that sidebar link if the Cloud Base images aren't available on release day. -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
atomic dvd 25 default install crash
Still hitting this crash with a default installation of Fedora-Atomic-dvd-x86_64-25-20160921.n.0.iso https://bugzilla.redhat.com/show_bug.cgi?id=1375702 -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: Fedora Atomic Host Two Week Release Announcement
This notification still seems broken. That page says: The latest two week build did not meet our testing criteria. The images available are from over 22 days ago. Check the Project Atomic blog for updates and information about Atomic blocker bugs. And the image available for download is https://download.fedoraproject.org/pub/alt/atomic/stable/Atomic/x86_64/iso/Fedora-Atomic-dvd-x86_64-24-20160820.0.iso On Wed, Sep 21, 2016 at 10:45 AM, <nore...@fedoraproject.org> wrote: > > A new update of Fedora Cloud Atomic Host has been released and can be > downloaded at: > > Images can be found here: > > https://getfedora.org/en/cloud/download/atomic.html > > Respective signed CHECKSUM files can be found here: > https://alt.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-24-20160921.0/CloudImages/x86_64/images/Fedora-CloudImages-24-20160921.0-x86_64-CHECKSUM > https://alt.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-24-20160921.0/Atomic/x86_64/iso/Fedora-Atomic-24-20160921.0-x86_64-CHECKSUM > > Thank you, > Fedora Release Engineering > > ___ > cloud mailing list -- cloud@lists.fedoraproject.org > To unsubscribe send an email to cloud-le...@lists.fedoraproject.org > -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: overlayfs for AFTER Fedora 25
Just in case this poor horse isn't suitably beaten yet.

1. Create 4 qcow2 files per qemu-img create -f qcow2 *.qcow2 120g
Each qcow2 starts out 194K (not preallocated).

2. Format each qcow2:
mkfs.ext4
mkfs.ext4 -i 4096
mkfs.xfs
mkfs.btrfs

3. Mount each fs (mainly to be fair since ext4 does lazy init) and wait until the qcow2 stops growing.

5.5M -rw-r--r--. 1 qemu qemu 5.9M Sep 19 20:40 bios_btrfs.qcow2
2.1G -rw-r--r--. 1 root root 2.1G Sep 19 20:27 bios_ext4_default.qcow2
7.7G -rw-r--r--. 1 root root 7.7G Sep 19 20:33 bios_ext4_i4096.qcow2
62M -rw-r--r--. 1 qemu qemu 62M Sep 19 20:40 bios_xfs.qcow2

Btrfs and XFS take seconds to completely initialize. Ext4 defaults took 6 minutes, and with -i 4096 it took 8 minutes to complete lazy init. -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
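An editorial note: back-of-envelope arithmetic explains the ext4 image growth nicely. Lazy init has to write out the whole inode table, which is roughly (fs size / inode_ratio) * inode_size bytes. Assuming the 120G image above and ext4's default 256-byte inodes:

```shell
# Estimate the ext4 inode table size that lazy init must write out.
# inode_ratio 16384 is the mke2fs default; 4096 corresponds to -i 4096.
fs=$((120 * 1024 * 1024 * 1024))   # 120 GiB filesystem
inode_size=256                     # ext4 default inode size in bytes
for ratio in 16384 4096; do
  echo "inode_ratio $ratio: $((fs / ratio * inode_size / 1048576)) MiB of inode tables"
done
```

That works out to 1920 MiB for the defaults and 7680 MiB for -i 4096, which tracks the observed 2.1G and 7.7G qcow2 sizes (the remainder being other fs metadata and qcow2 overhead).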
Re: overlayfs for AFTER Fedora 25
On Fri, Sep 16, 2016 at 2:15 PM, Chris Murphy <li...@colorremedies.com> wrote: > You'd need to > run all this by them and see if there's a way to do a mkfs.ext4 -i > 4096 for just Atomic Host installations, there's no point doing that > for workstation installations. Or just use XFS.

Another possibility is an AH-specific /etc/mke2fs.conf file on the installation media only:

[defaults]
base_features = sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
default_mntopts = acl,user_xattr
enable_periodic_fsck = 0
blocksize = 4096
inode_size = 256
inode_ratio = 16384

By changing inode_ratio to 4096, it achieves the same outcome as -i 4096 without having to pass that flag at mkfs time. And it'd only affect installation-time file systems (including /boot and / as well as the persistent storage for overlayfs and containers). So... yeah.

FWIW, you're basically already using XFS with the dm-thin docker-storage-setup you've got going on right now. It doesn't get mounted anywhere, but:

$ sudo docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.10.3
Storage Driver: devicemapper
 Pool Name: fedora--atomic-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs

So, just use XFS across the board (plus overlayfs on the persistent storage for containers). As for Workstation changing file systems, that's another ball of wax. I'd just say use XFS + overlayfs there too to keep it simple across the various products in the near term. And then presumably the Workstation folks will want Btrfs when it's sufficiently stable that the kernel team won't freak out if there's still no Btrfs-specific kernel dev on the team or at Red Hat. -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
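To make the inode_ratio trade-off concrete: it means "one inode per N bytes of filesystem", so for a hypothetical 120 GiB filesystem the inode counts work out as follows (arithmetic only, no mkfs involved):

```shell
# inode_ratio is bytes-of-filesystem per inode; lower ratio = more inodes.
fs=$((120 * 1024 * 1024 * 1024))   # hypothetical 120 GiB filesystem
echo "ratio 16384 (mke2fs default):    $((fs / 16384)) inodes"
echo "ratio 4096 (one per 4K block):   $((fs / 4096)) inodes"
```

So dropping the ratio to 4096 quadruples the inode count (about 7.9M to about 31.5M here), at the cost of the much larger inode tables and longer lazy init noted elsewhere in this thread.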
Re: overlayfs for AFTER Fedora 25
On Fri, Sep 16, 2016 at 10:56 AM, Chris Murphy <li...@colorremedies.com> wrote: > Inode exhaustion? > > If the installer is going to create the file system used for overlayfs > backing storage with ext4, that probably means mkfs.ext4 -i 4096 will > need to be used; so how does that get propagated to only AH installs, > for both automatic and custom partitioning? Or figure out a way to > drop custom/manual partitioning from the UI. Or does using XFS > mitigate this issue? A simple search turns up no inode exhaustion > reports with XFS. The work around for ext4 is at mkfs time, it's not > something that can be changed later.

I just did some more digging, and also chatted with Eric Sandeen about this. Here's what I've learned:

- Inode exhaustion with mkfs.ext4 defaults can be a real thing with overlayfs [1]
- mkfs.ext4 -i 4096 will make 1 inode per 4096-byte block, so 1:1, which is a metric shittonne of inodes
- a different -i value might be more practical most of the time, but if the maximum aren't created at mkfs time and are exhausted, the fs basically face plants and no more files can be created; it's only fixable by a.) deleting a bunch of files or b.) creating a new file system with more inodes preallocated
- mkfs.ext4 hands off the actual creation of the inodes to lazy init at first mount time; it's a lot of metadata being written
- XFS doesn't have this issue, its inode allocation is dynamic (there are limits, but they can be changed with xfs_growfs)
- XFS now defaults to -m crc=1, and by extension -n ftype=1, which overlayfs wants for putting filetype in the directory entry; Fedora 24 had a sufficiently new xfsprogs for this from the get go

I don't know what the workflow is for creating the persistent storage for the host, whether this will be Anaconda's role or something else? If Anaconda, my experience has been the Anaconda team are reluctant to use non-default mkfs unless there's a UI toggle for it. 
You'd need to run all this by them and see if there's a way to do a mkfs.ext4 -i 4096 for just Atomic Host installations, there's no point doing that for workstation installations. Or just use XFS. [1] https://github.com/coreos/bugs/issues/264 https://github.com/boot2docker/boot2docker/issues/992 -- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
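An editorial aside on the failure mode discussed above: inode exhaustion is easy to spot on a running system, because inode usage hits 100% while block usage stays low. A generic check (not specific to any Fedora tooling):

```shell
# Inode usage: IUse% at 100% while Use% below is still low means the
# inode table is exhausted and no new files can be created.
df -i /
# Block usage, for comparison with the inode numbers above.
df -h /
```

ENOSPC with plenty of free blocks but no free inodes is exactly the overlayfs-on-ext4 symptom reported in [1].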
Re: overlayfs for AFTER Fedora 25
On Fri, Sep 16, 2016 at 7:02 AM, Colin Walters <walt...@verbum.org> wrote: > > > On Thu, Sep 15, 2016, at 09:57 AM, Dusty Mabe wrote: > >> That is correct, but changing a default like that might be a bad idea. >> My opinion is that it should happen on a major release boundary. > > One thing this impacts is the AH partitioning - it no longer makes > sense by default with overlayfs. I think we should probably do exactly > the same thing as the Server SIG (and consider doing it for Workstation > too), which actually argues for just fixing the Anaconda defaults. > > Server thread: > https://lists.fedoraproject.org/archives/list/ser...@lists.fedoraproject.org/thread/D7ZK7SILYDYAATRFS6BFWZQWS6KSRGDG/ The genesis of that was me pointing to Cloud Atomic ISO's handling; since Server went with pretty much the identical layout, it managed to get slipped in for Alpha. It was a proven layout. [1] For an Atomic Host overlayfs based layout, there's nothing within Fedora that's a proven layout. For starters, it could be something much simpler than what CoreOS is doing [2]. If the target installation is VM, then dropping LVM stuff makes sense. If it's going to include baremetal, keeping LVM makes sense. I'm a bit unclear on this point, but with a handwave it sorta feels like Cloud->Container WG is far less interested in the baremetal case, where Server is about as interested in baremetal as VM and container cases. If that's true, then the CoreOS layout is a decent starting point, and just needs some simplification to account for ostree deployments rather than partition priority flipping. Inode exhaustion? If the installer is going to create the file system used for overlayfs backing storage with ext4, that probably means mkfs.ext4 -i 4096 will need to be used; so how does that get propagated to only AH installs, for both automatic and custom partitioning? Or figure out a way to drop custom/manual partitioning from the UI. Or does using XFS mitigate this issue? 
A simple search turns up no inode exhaustion reports with XFS. The workaround for ext4 is at mkfs time; it's not something that can be changed later.

Release blocking and custom partitioning? Upon the AH image becoming release blocking, then "The installer must be able to create and install to any workable partition layout using any file system and/or container format combination offered in a default installer configuration." applies. Example bug [3] where this fails right now. Does it make sense for AH installations to somehow be exempt from custom partitioning resulting in successful installations? And what would that look like criterion-wise (just grant an exception?) or installer-wise (drop the custom UI, or put up warnings upon entering?)

[1] https://lists.fedoraproject.org/archives/list/ser...@lists.fedoraproject.org/thread/PLWNOM6Z5226VZYUHTL6KMS3553VSQ3W/
[2] https://coreos.com/os/docs/latest/sdk-disk-partitions.html A bit of trivia: the "GPT priority attribute", which I can find nowhere else; but I rather like this idea of using an attribute as the hint for which fs tree the bootloader should use, rather than writing out new bootloader configurations.
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1289752

-- Chris Murphy ___ cloud mailing list -- cloud@lists.fedoraproject.org To unsubscribe send an email to cloud-le...@lists.fedoraproject.org
Re: overlayfs for AFTER Fedora 25
On Wed, Sep 14, 2016 at 2:45 PM, Jason Brooks <jbro...@redhat.com> wrote: > On Wed, Sep 14, 2016 at 12:14 PM, Dusty Mabe <du...@dustymabe.com> wrote: >> >> In the cloud meeting today I brought up overlayfs and F25. After >> discussing with the engineers closer to the technology they recommend >> waiting to move to overlayfs as the default in F26. >> >> I think this will work well because it will give us some time to allow >> people to "try" overlayfs in F25 (we should provide good docs on this) >> and then give us feedback before we go with it as default in F26. If >> the feedback is bad then maybe we wouldn't even go with it in F26, but >> hopefully that won't be the case. >> >> Thoughts? > > Sounds good to me.

I'm uncertain if this is current or needs an update: Evaluate overlayfs with docker https://github.com/kubernetes/kubernetes/issues/15867 If the way forward is a non-duplicating cache then I see a major advantage gone. But that alone isn't enough to promote something else, I'd just say, hedge your bets. Pretty much all the reasons why CoreOS switched from Btrfs to overlay have been fixed, although there's an asstrometric ton of enospc rework landing in kernel 4.8 [1] that will need time to shake out, and if anyone's able to break it, one of the best ways of getting it fixed and avoiding regressions is to come up with an xfstests test [2] so it can be cleanly reproduced. The Facebook devs consistently report finding hardware (even enterprise stuff that they use) doing batshit things that Btrfs catches and corrects that other filesystems aren't seeing. And as for the slowdowns, mainly due to fragmentation when creating and destroying many snapshots over a short period of time, these could probably be mitigated with garbage collection optimization, and I've had some ideas about that if anyone wants to futz around with it. 
The more conservative change is probably XFS + overlayfs though, since now XFS checksums fs metadata and the journal, which helps catch problems before they get worse. [1] http://www.spinics.net/lists/linux-btrfs/msg53410.html [2] semi random example http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfstests.git;a=blob;f=tests/btrfs/060 -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: growpart not working in Fedora 25 cloud base
On Sat, Sep 10, 2016 at 10:45 PM, Dusty Mabe <du...@dustymabe.com> wrote: > > > On 09/11/2016 12:22 AM, Dusty Mabe wrote: >> >> >> On 09/10/2016 12:47 PM, Chris Murphy wrote: >>> On Fri, Sep 9, 2016 at 9:28 PM, Dusty Mabe <du...@dustymabe.com> wrote: >>>> >>>> >>>> Should I open a bug for this? Can we get someone to look at it/work on it? >>> >>> Yes, and I think it needs a dmesg in case partprobe was called but >>> that failed for some reason. And then need to look at the cloud-init >>> code and see if partprobe is being called. This is not the best log, >>> it doesn't report the actual commands its using and the exit code for >>> each command. So we're left wondering if partprobe was called or not. >>> Maybe it's being called but is missing in the image? >> >> I opened a bug here and added some more information to it: >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1374968 >> > > and.. this has actually already been reported and the fix is in > updates-testing. > > https://bugzilla.redhat.com/show_bug.cgi?id=1371761 Good catch. Looks like it affects udisks2/storaged as well. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: growpart not working in Fedora 25 cloud base
On Sat, Sep 10, 2016 at 12:05 PM, Chris Murphy <li...@colorremedies.com> wrote: > Could be related this bug, I see the same error there after Disks > changes partitioning, and the old partition table is being used. A > feature for Fedora 25 is udisks is replaced by storaged, so this could > be part of that problem. But I have no idea why could-init would be > using udisks or storaged, so this might be a goose chase. > > Error setting partition type after formatting > https://bugzilla.redhat.com/show_bug.cgi?id=1374334 Could be both cloud-init and this storaged bug are hitting the same lower level bug, causing the kernel to not get an updated partition table (via partprobe)... the journal output in that bug isn't enlightening, there are no kernel messages. One reason it'd fail is if something has mounted any file system on the device that's having its partition modified, that'd make it busy, and partprobe tends to fail in that case. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: growpart not working in Fedora 25 cloud base
Could be related to this bug: I see the same error there after Disks changes partitioning, and the old partition table is being used. A feature for Fedora 25 is that udisks is replaced by storaged, so this could be part of that problem. But I have no idea why cloud-init would be using udisks or storaged, so this might be a goose chase. Error setting partition type after formatting https://bugzilla.redhat.com/show_bug.cgi?id=1374334 Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: growpart not working in Fedora 25 cloud base
On Fri, Sep 9, 2016 at 9:28 PM, Dusty Mabe <du...@dustymabe.com> wrote: Stderr: "attempt to resize /dev/sda failed. sfdisk output below:\n| Backup files:\n| MBR (offset 0, size 512): /tmp/growpart.PO8wWI/backup-sda-0x.bak\n| \n| Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors\n| Units: sectors of 1 * 512 = 512 bytes\n| Sector size (logical/physical): 512 bytes / 512 bytes\n| I/O size (minimum/optimal): 512 bytes / 512 bytes\n| Disklabel type: dos\n| Disk identifier: 0x1ef30347\n| \n| Old situation:\n| \n| Device Boot Start End Sectors Size Id Type\n| /dev/sda1 * 2048 6291455 6289408 3G 83 Linux\n| Ok so it starts out as 6289408/2048= 3071MiB (or 3G) >Created a new partition 1 of type 'Linux' and of size 10 GiB.\n| /dev/sda2: \n| New situation:\n| \n| Device Boot Start End Sectors Size Id Type\n| /dev/sda1 * 2048 20971519 20969472 10G 83 Linux\n| \n| New size. But what about sda2? It said it was creating a new partition sda2, but not specifying its size, only specifying the new size of sda1. > The partition table has been altered.\n| Calling ioctl() to re-read partition > table.\n| Re-reading the partition table failed.: Device or resource busy\n| > The kernel still uses the old table. The new table will be used at the next > reboot or after you run partprobe(8) or kpartx(8).\n Something isn't calling partprobe? Or there's a kernel error in re-reading the device? dmesg would help, maybe. * WARNING: Resize failed, attempting to revert **\n512+0 records in\n512+0 records out\n512 bytes copied, 0.000400551 s, 1.3 MB/s\n* Appears to have gone OK \n" And if we are to believe this, it changed the partition table back to the Old Situation. > Sep 10 03:13:17 cloudhost.localdomain cloud-init[645]: [CLOUDINIT] > util.py[DEBUG]: resize_devices took 0.127 seconds > Sep 10 03:13:17 cloudhost.localdomain cloud-init[645]: [CLOUDINIT] > cc_growpart.py[DEBUG]: '/' FAILED: failed to resize: disk=/dev/sda, ptnum=1: > Unexpected error while running command. 
>Command: ['growpart', > '/dev/sda', '1'] >Exit code: 2 >Reason: - >Stdout: 'FAILED: > failed to resize\n' >Stderr: "attempt to > resize /dev/sda failed. sfdisk output below:\n| Backup files:\n| MBR > (offset 0, size 512): /tmp/growpart.PO8wWI/backup-sda-0x.bak\n| > \n| Disk /dev/sda: 10 GiB, 10737418240 bytes, 20971520 sectors\n| Units: > sectors of 1 * 512 = 512 bytes\n| Sector size (logical/physical): 512 bytes / > 512 bytes\n| I/O size (minimum/optimal): 512 bytes / 512 bytes\n| Disklabel > type: dos\n| Disk identifier: 0x1ef30347\n| \n| Old situation:\n| \n| Device >Boot Start End Sectors Size Id Type\n| /dev/sda1 * 2048 6291455 > 6289408 3G 83 Linux\n| \n| >>> Script header accepted.\n| >>> Script header > accepted.\n| >>> Script header accepted.\n| >>> Script header accepted.\n| > >>> Created a new DOS disklabel with disk identifier 0x1ef30347.\n| Created a > new partition 1 of type 'Linux' and of size 10 GiB.\n| /dev/sda2: \n| New > situation:\n| \n| Device Boot Start End Sectors Size Id Type\n| > /dev/sda1 * 2048 20971519 20969472 10G 83 Linux\n| \n| The partition > table has been altered.\n| Calling ioctl() to re-read partition table.\n| > Re-reading the partition table failed.: Device or resource busy\n| The kernel > still uses the old table. The new table will be used at the next reboot or > after you run partprobe(8) or kpartx(8).\n* WARNING: Resize failed, > attempting to revert **\n512+0 records in\n512+0 records out\n512 bytes > copied, 0.000400551 s, 1.3 MB/s\n* Appears to have gone OK \n" > And that just looks like a 2nd attempt which also fails. > > > Should I open a bug for this? Can we get someone to look at it/work on it? Yes, and I think it needs a dmesg in case partprobe was called but that failed for some reason. And then need to look at the cloud-init code and see if partprobe is being called. This is not the best log, it doesn't report the actual commands its using and the exit code for each command. 
So we're left wondering if partprobe was called or not. Maybe it's being called but is missing in the image? -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
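An editorial sanity check of the sector arithmetic quoted from the sfdisk output above (512-byte logical sectors, values copied from the log):

```shell
# Verify the old/new sizes of /dev/sda1 from the sfdisk output.
sector=512
old=6289408    # sectors before growpart (end 6291455 - start 2048 + 1)
new=20969472   # sectors after growpart
echo "old: $((old * sector / 1048576)) MiB"
echo "new: $((new * sector / 1048576)) MiB"
```

That gives 3071 MiB (the "3G" starting size) and 10239 MiB (the "10G" target), so sfdisk wrote a correct new table; the failure was purely the kernel refusing to re-read it while the device was busy.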
Re: Fedora 25
On Sun, Sep 4, 2016 at 6:54 AM, Benson Muite <benson_mu...@emailplus.org> wrote: > Hi, > > If any of you use Fedora Atomic as a desktop, could you add a brief overview > of why and how (workflow) you do so here: > https://fedoraproject.org/wiki/Fedora_25_talking_points > > For the typical Fedora workstation user, what is needed to migrate to Fedora > Atomic as a desktop? Does this make it easier to use remote cloud resources? These might be better asked on desktop@ list where the Workstation WG and users can put in their 2 cents? -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Proposal: for F26, move Cloud Base Image to Server WG
On Thu, Aug 25, 2016 at 3:38 PM, Stephen John Smoogen <smo...@gmail.com> wrote: > On 25 August 2016 at 13:34, Matthew Miller <mat...@fedoraproject.org> wrote: >> >> We've talked about this for a while, but let's make it formal. The plan >> is to transition from Cloud as a Fedora Edition to Something Container >> Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO). >> >> But, we still need cloud as a _deploy target_. The FAO-container-thing >> will continue to have cloud image deploy targets (as well as bare >> metal). I think it makes sense to _also_ have Fedora Server as a cloud >> deploy target. >> > > Could we make sure that whatever targets we have are actually getting > tested? The fact that autocloud has said it was broken for months but > the cloud sig wasn't looking or fixing says that before we get to step > 2, we need to say 'is anyone more than 2 people really interested?' It > should be ok to say 'no we aren't.' without people diving into the > fire trying to rescue something that unless it was on fire they > wouldn't have helped. There are a lot of images being produced and I have no idea if they're really needed. That a release blocking image (cloud base qcow2) nearly caused F25 alpha to slip because it was busted at least suggests it probably shouldn't be release blocking anymore. FWIW, cloud base qcow2 now gets grub2 in lieu of extlinux as the work around for the breakage. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
don't use docker 1.10.3-3 or -4
https://bugzilla.redhat.com/show_bug.cgi?id=1330294 https://bugzilla.redhat.com/show_bug.cgi?id=1322909 Beware of 1.10.3-4 in tree 24.16. And possibly -3 also, although I don't know what tree that's in. The only way I could fix it was dnf remove docker then dnf reinstall docker; upgrading to -5 doesn't fix the problem, the -4 version had to be removed first. So in particular for atomic users this is not good because neither rollback nor updates will fix the problem. And I don't know what the problem is, so I don't know how to fix it other than with the dnf remove hammer. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: [Marketing] Re: [MAGAZINE PROPOSAL] Fwd: [DRAFT] Why we're retiring 32-bit Images (was Re: Retiring 32-bit images)
On Tue, Apr 19, 2016 at 2:48 PM, Adam Williamson <adamw...@fedoraproject.org> wrote: > On Tue, 2016-04-19 at 13:48 -0600, Chris Murphy wrote: >> Any i686 package that fails to build means it's failed for all primary >> archs, because i686 is a primary arch. And a failed build means it >> won't be tagged for compose so depending on the package it could hold >> up composes. > > True, though I hadn't actually mentioned that scenario. But indeed. Say > we needed a fix to dracut, pronto, to make the x86_64 cloud base image > boot, but the build with the fix failed on i686: that would have to be > dealt with somehow. Good point. Oh and about terminology, it may be here where "block" gets reused as a term in a confusing way. If dracut build fails on i686, that "blocks" composes. But it's really a kind of claw back: zombie i686 is grabbing the leg of other primary archs, and that stops the workflow. Making i686 secondary would prevent this? -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: [Marketing] Re: [MAGAZINE PROPOSAL] Fwd: [DRAFT] Why we're retiring 32-bit Images (was Re: Retiring 32-bit images)
On Tue, Apr 19, 2016 at 2:48 PM, Adam Williamson <adamw...@fedoraproject.org> wrote: > On Tue, 2016-04-19 at 13:48 -0600, Chris Murphy wrote: > >> From my limited perspective, such non-functional failure held up >> release when it violated a release criterion in effect because that >> non-functionality became coupled with image blocking, i.e. if kernel >> doesn't function, then image doesn't function/is DOA, DOA images are a >> release criteria violation, therefore block. Correct? Or is there some >> terminology nuance here that I'm still missing in the sequence? > > No, even in this case there is no release blocking impact, because > nothing release blocking is broken by the bug. The i686 images are not > release blocking, end of story. Even if they are completely DOA, that > does not block release. Yes, I meant i686 in the past tense. OK so I think I get it. i686 is officially primary, but in practice it's at best secondary. And that should be made official. TBD whether there's even enough people power and momentum to support it as secondary. >> It's best to assume I don't understand the terms well enough to use >> them precisely, rather than assume I'm trying to redefine them. > > I was not actually thinking of you there (I just picked your post to > reply to since it was at the top of the pile), more the vagueness in > the thread in general. Got it. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: [Marketing] Re: [MAGAZINE PROPOSAL] Fwd: [DRAFT] Why we're retiring 32-bit Images (was Re: Retiring 32-bit images)
On Tue, Apr 19, 2016 at 1:11 PM, Adam Williamson <adamw...@fedoraproject.org> wrote: > > QA referred the question of whether upgrades from a release where i686 > was 'release blocking' (<24) to releases where i686 is 'non blocking' > (>23) should be considered 'release blocking' to FESCo. i.e. if there > are violations of the release criteria in this upgrade path, should we > treat that as blocking the Beta or Final releases. FESCo's decision was > "no". So no matter what, all i686 images (qcow2, raw, ISOs) are non-blocking. Any i686 package that fails to build means it's failed for all primary archs, because i686 is a primary arch. And a failed build means it won't be tagged for compose so depending on the package it could hold up composes. But the current i686 problems aren't package build failures, rather it's a particular critical path package (or two) that are broadly or entirely non-functional when executed. So what's it called when a critical path package fails to function on a primary arch? And what's done about it? From my limited perspective, such non-functional failure held up release when it violated a release criterion in effect because that non-functionality became coupled with image blocking, i.e. if kernel doesn't function, then image doesn't function/is DOA, DOA images are a release criteria violation, therefore block. Correct? Or is there some terminology nuance here that I'm still missing in the sequence? > I really think it would help if we use these terms carefully and > precisely, and if we're going to re-define them in any way, make that > clear and explicit. It's best to assume I don't understand the terms well enough to use them precisely, rather than assume I'm trying to redefine them. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: [Marketing] Re: [MAGAZINE PROPOSAL] Fwd: [DRAFT] Why we're retiring 32-bit Images (was Re: Retiring 32-bit images)
On Mon, Apr 18, 2016 at 5:31 PM, Dennis Gilmore <den...@ausil.us> wrote: > On Monday, April 18, 2016 2:59:18 PM CDT you wrote: >> On 04/15/2016 05:28 PM, Joe Brockmeier wrote: >> > On 04/15/2016 10:38 AM, Dennis Gilmore wrote: >> >> I would like us to demote them to secondary. >> > >> > Why? We've already decided to drop. I'm not opposed, just curious why. >> > IIRC we were hitting a major problem with kernel compat as well? >> >> Pinging on this - I thought we'd reached a decision and wanted to >> publicize that sooner than later. >> >> If there's a reason to prefer move to secondary, let's discuss. >> >> Best, >> >> jzb > > I prefer to move it to secondary because people could be relying on it still, > it gives us a way to move forward and not be blocked on 32 bit x86. If it does > not work then it will not get shipped. Just dropping them on the floor does > not give as smooth a transition, nor does it give people that want it still > the chance to pick it up and continue to carry it forward. Is the context Cloud, or in general? I think going from primary for all products to totally dropping it is a problem, even if install media is non-blocking. I have no stake in i686 at all, and I think Cloud and Server are less affected by totally dropping i686 than Workstation; but I think quitting i686 cold turkey needs reconsideration. Anyway I think no one has done anything wrong here, but the warnings of the kernel team were maybe considered something like, "oh, we'll get by one more release or two by the skin of our teeth before it blows up" and yet it just turned out that it's blowing up already. If the idea is we should block on i686 in general for upgrading, I'd agree, even though it's a pain. For Cloud, maybe the way forward at worst is to support Cloud Atomic. And the images are i686 only? Of course that assumes any problems with binutils and kernel, or whatever else comes up, are sanely fixable with a best effort. ? 
-- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Cloud Atomic ISO autopartitioning
On Fri, Mar 11, 2016 at 10:14 AM, Dennis Gilmore <den...@ausil.us> wrote: > On Friday, March 11, 2016 9:58:54 AM CST Chris Murphy wrote: >> Hi, >> >> The installer autopart in Cloud Atomic ISO leaves a bunch of free >> space in the VG, which on first boot is turned into a dm thin pool by >> docker-storage-setup. This is quite cool, so I'm suggesting it for >> Server (minus the auto configuration part), but I can't tell where the >> code is that alters the installer's autopartitioning behavior. The >> kickstart file says it's using autopart, it doesn't have a breakdown >> of what it's asking the installer to do, so I guess by virtue of it >> being a Cloud productized installer it knows to do this. >> >> Suggestions? > > > The code that overrides anaconda's defaults lives in fedora-productimg-atomic Thanks! -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Cloud Atomic ISO autopartitioning
Hi, The installer autopart in Cloud Atomic ISO leaves a bunch of free space in the VG, which on first boot is turned into a dm thin pool by docker-storage-setup. This is quite cool, so I'm suggesting it for Server (minus the auto configuration part), but I can't tell where the code is that alters the installer's autopartitioning behavior. The kickstart file says it's using autopart, it doesn't have a breakdown of what it's asking the installer to do, so I guess by virtue of it being a Cloud productized installer it knows to do this. Suggestions? -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
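[Editor's sketch] The kickstart described above really does contain no storage breakdown; a storage section that relies entirely on autopart looks roughly like this (the zerombr/clearpart lines and the --type flag are assumptions for illustration, not copied from the actual Cloud Atomic kickstart). As the reply notes, the free-space behavior comes from fedora-productimg-atomic, not from extra kickstart directives.

```
# Illustrative kickstart storage section: nothing here describes the
# thin-pool layout; plain autopart is requested, and the free space
# left in the VG is consumed by docker-storage-setup on first boot.
zerombr
clearpart --all --initlabel
autopart --type=lvm
```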
Re: [atomic-devel] Fedora Atomic Host Two Week Release Announcement
On Fri, Jan 29, 2016 at 7:29 AM, Micah Abbott <miabb...@redhat.com> wrote: > AFAICT, the link for the updated ISO from > > https://getfedora.org/en/cloud/download/atomic.html > > ...is working properly this morning. > > > In case one of the mirrors isn't up to speed yet, the direct link is: > > https://download.fedoraproject.org/pub/alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.2.iso > OK it's working after deleting the browser cache. Sometimes I don't know why things work the way they work. Before clearing the cache, the browser was consistently requesting the old wrong name from mirrors even though Fedora's servers were supplying the correct new filename. It's like a 20+ year old bug that makes "clear your browser cache" sound sane. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Fedora Atomic Host Two Week Release Announcement
On Wed, Jan 27, 2016 at 8:38 PM, <nore...@fedoraproject.org> wrote: > > A new update of Fedora Cloud Atomic Host has been released and can be > downloaded at: > > Images can be found here: > > https://getfedora.org/en/cloud/download/atomic.html Clicking on the 64-bit Atomic ISO image results in: http://mirrors.rit.edu/fedora/alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.iso 404 - Not Found http://mirrors.kernel.org/fedora-alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.iso Sorry, we cannot find your kernels ##which btw is awesome http://dl.fedoraproject.org/pub/alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.iso Not Found The requested URL /pub/alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.iso was not found on this server. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Fedora Atomic Host Two Week Release Announcement
On Thu, Jan 28, 2016 at 1:19 PM, Matthew Miller <mat...@fedoraproject.org> wrote: > On Thu, Jan 28, 2016 at 12:01:34PM -0700, Chris Murphy wrote: >> http://mirrors.rit.edu/fedora/alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.iso > > Should be fixed now -- there was a missing ".2" in the filename. Reload > the download page? I'm redirected to a different mirror on each attempt, those mirrors still aren't working yet, I bet it'll take a couple hours for them to catch the update. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Fedora Atomic Host Two Week Release Announcement
On Thu, Jan 28, 2016 at 5:25 PM, Matthew Miller <mat...@fedoraproject.org> wrote: > On Thu, Jan 28, 2016 at 03:46:05PM -0700, Chris Murphy wrote: >> >> http://mirrors.rit.edu/fedora/alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.iso >> > Should be fixed now -- there was a missing ".2" in the filename. Reload >> > the download page? >> I'm redirected to a different mirror on each attempt, those mirrors >> still aren't working yet, I bet it'll take a couple hours for them to >> catch the update. > > Sorry, I wasn't clear -- the _mirrors_ are right, but the _link_ was > bad. Should be > https://download.fedoraproject.org/pub/alt/atomic/stable/Cloud-Images/x86_64/Images/Fedora-Cloud-Atomic-23-20160127.2.x86_64.qcow2 Except not qcow2 since this is for the 64-bit Atomic ISO link. When I click that link, I'm taken to a different mirror each time. It still is failing. Just clicked it now and I'm redirected to: http://mirrors.kernel.org/fedora-alt/atomic/stable/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20160127.iso And it says my kernels can't be found (giggle). So that mirror at least, isn't correct yet. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: rpm-ostree, failed upgrade, failed rollback
This is calling grub2-mkconfig at line 340 https://git.gnome.org/browse/ostree/tree/src/libostree/ostree-bootloader-grub2.c And line 153 says this must have been called from a wrapper script. I'm pretty much thinking grub2-mkconfig is not meant to be directly called by the user either, and is envisioned to only get called by e.g. ostree admin deploy/switch, or rpm-ostree rollback/upgrade, etc. That's fine, it's just not obvious what user space tools belong to the user. Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: rpm-ostree, failed upgrade, failed rollback
On Wed, Dec 23, 2015 at 2:37 PM, Jonathan Lebon <jle...@redhat.com> wrote: > - Original Message - >> OK part of my confusion is that 'grub2-mkconfig' does not work when >> called directly. Doing that results in a malformed grub.cfg. >> >> What is the correct way, on atomic builds, to recreate the grub.cfg? >> How does rpm-ostree do this? > > Yeah, I've been confused by that as well. I haven't bothered > investigating more, but it seems like running grub2-mkconfig > on a fresh boot works, whereas calling it after some > rpm-ostree operation such as upgrade/rebase will cause no > output from 15_ostree. For me, now on 23.39, even after fresh boot, and no matter where I direct -o to write the file, 15_ostree is empty. It seems like it's not meant to be directly called. I've never had that command from user space produce a correct grub.cfg. But with one exception, within rpm-ostree updates or rollbacks, it produces correct grub.cfgs. So I think something else is calling that script, and also telling it where to put the grub.cfg (which goes in different locations depending on the firmware, because rabbits). I've variably gotten no menu entry grub.cfgs, and ones with linux16/initrd16 instead of linuxefi/initrdefi and I can't tell why this flips around other than there's some kind of state change that doesn't happen when it's correctly called. > >> The closest I get to a command is 'ostree admin instutil >> grub2-generate' but this fails with >> ** >> ERROR:src/libostree/ostree-bootloader-grub2.c:154:_ostree_bootloader_grub2_generate_config: >> assertion failed: (grub2_boot_device_id != NULL) >> Aborted (core dumped) >> >> I'm not sure what it wants. > > It's only meant to be called by the /etc/grub.d/15_ostree script, > which sets up some env vars for it. That said, it should probably > error out more gracefully. 
Well that sorta answers this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1293986 But on a normal system /etc/default/grub is consumed by grub2-mkconfig, but that's not true on atomic. I've made changes to that file and yet those changes aren't rolled into the grub.cfg. So OK, there's 'ostree admin instutil set-karg' but I run into this problem https://bugzilla.redhat.com/show_bug.cgi?id=1293987 So now I have no idea whether 'ostree admin instutil' is user domain or just meant as helpers for some other scripts. So I think we need to know what the deprecated and new knobs are. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: atomic host grub2 version
On Tue, Dec 22, 2015 at 1:31 PM, Chris Murphy <li...@colorremedies.com> wrote: > On two UEFI systems, one with F23 Workstation, the other with F23 > Cloud Atomic, I'm finding the grubx64.efi do not have the same hash, > even though rpm -q reports the same rpm installed on both. This is > unexpected. I've found the sha256sum for /boot/efi/EFI/fedora/grubx64.efi on a system with atomic tree version 23.38 matches that of the grubx64.efi in grub2-efi-2.02-0.23.fc23.x86_64.rpm, despite the fact rpm -q reports grub2-efi-2.02-0.25.fc23.x86_64 is installed. So the grub2-efi package is disconnected from the actual efi binary installed. UEFI bootloader is not updated by rpm-ostree, even though rpm package version suggests otherwise https://bugzilla.redhat.com/show_bug.cgi?id=1293725 -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
atomic host grub2 version
On two UEFI systems, one with F23 Workstation, the other with F23 Cloud Atomic, I'm finding the grubx64.efi do not have the same hash, even though rpm -q reports the same rpm installed on both. This is unexpected. Does the atomic tree include /boot/efi/EFI/fedora? And if not, is that on the future feature list? CVE-2015-8370 is what made me look at this. On BIOS computers, whether conventional or atomic, GRUB2 user space tools are updated with grub2-2.02-0.25.fc23, but that only updates user space tools. The user has to manually run grub2-install to actually fix the problem. On UEFI conventional installations, grubx64.efi is replaced automatically when the RPM is updated; but apparently not on UEFI atomic installations. Using grub2-install fails because grub2-efi-modules isn't installed by default, and even if it were the resulting grubx64.efi is now no longer signed by Fedora so it'll fail UEFI Secure Boot code signing checks. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
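[Editor's sketch] A quick way to confirm the Secure Boot half of that problem is to ask the firmware directly; this assumes mokutil is available, which it may not be on a minimal host.

```shell
# Check whether Secure Boot is enforcing; if it is, a locally rebuilt,
# unsigned grubx64.efi produced by grub2-install will fail the
# firmware's signature check at boot.
if command -v mokutil >/dev/null 2>&1; then
    sb_state="$(mokutil --sb-state 2>&1 || true)"
else
    sb_state="mokutil not installed (non-UEFI host, or package missing)"
fi
echo "$sb_state"
```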
Re: rpm-ostree, failed upgrade, failed rollback
OK really weird. Tree 23.34 is deployed, and I just ran: # rpm-ostree rollback This writes out a correct grub.cfg (uses linuxefi/initrdefi) and it wrote the grub.cfg in the correct location (/boot/efi/EFI/fedora/grub.cfg). And this grub.cfg works regardless of which menu entry I pick in GRUB. So the bug affected rpm-ostree upgrade, and possibly only the version of that command in the 23.29 tree. ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: rpm-ostree, failed upgrade, failed rollback
Successfully booted using 'configfile' and editing the grub.cfg to use linuxefi/initrdefi instead of linux16 and initrd16... # bash -x grub2-mkconfig http://fpaste.org/301941/30720114/ That's a bug. I just don't know whose bug it is. This is definitely a UEFI system, the CSM is not used (efibootmgr works, and Secure Boot is enabled). So something's got grub2-mkconfig awfully confused about what kind of firmware this system has. And then it also tells grub the -o destination path incorrectly. Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: rpm-ostree, failed upgrade, failed rollback
At the grub menu, I hit c to get to a grub shell, and then use the 'configfile' command pointed to /boot/loader.0/grub.cfg - it reads that configuration file instead of the one on /boot/efi/EFI/fedora/grub.cfg, and both 23.29 and 23.34 tree menu entries appear. Both contain an error, however. They both use 'linux16' and 'initrd16' instead of 'linuxefi' and 'initrdefi'. Something is very confused about whether this is a BIOS or UEFI system. If I change those commands to linuxefi and initrdefi, I can boot either tree. There is something... maybe. My fstab looks like this: UUID=908cb4df-410b-47e4-afb1-872255bd1244 /boot ext4 defaults 1 2 UUID=5956-63D8 /boot/efi vfat umask=0077,shortname=winnt,x-systemd.automount,noauto 0 2 UUID=8b0c4840-4fc7-4782-a4c0-25fec8a40dd4 /btrfs defaults 0 0 Normally grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg run manually causes /boot/efi to be mounted automatically in a split second. So, is rpm-ostree looking to see first if /boot/efi exists for any reason? What determines whether grub2-mkconfig -o is directed to /boot/efi/EFI/fedora, vs /boot/grub2? Thing is, there is no /boot/grub2/grub.cfg at all... neither of the correct locations got a grub.cfg. The correct grub.cfg (minus the wrong linux command) is in /boot/loader.0. Wonky. Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: rpm-ostree, failed upgrade, failed rollback
rpm-ostree entry .conf http://fpaste.org/301944/45030746/ What translates this file's linux/initrd into either linux16/initrd16 vs linuxefi/initrdefi? -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Cloud Atomic ISO, missing items for baremetal usage
On Tue, Dec 15, 2015 at 5:03 PM, Matthew Miller <mat...@fedoraproject.org> wrote: > On Wed, Dec 16, 2015 at 10:52:43AM +1100, Philip Rhoades wrote: >> >A. Separate hardware/virt trees; have the installer ISO point at the >> > hardware one by default (but also have the option of virt) >> >B. Finishing Atomic overlay support; making hardware enablement an >> >overlay >> >C. Getting all this stuff to work properly in SPCs >> >D. Something else? >> > >> >* so is software. *sigh* >> Does this mean there would be different hardware trees on the iso or >> that a basic iso would be pulling the appropriate tree via the >> network? > > Well, for "A", I was thinking one for hardware, one for virt/cloud — > not going down the path of different trees for different types of > hardware, because that's definitely the road to madness. > > For "B" (which is only theoretical, and as someone mentioned, may > require cloning Colin), there could be different overlays depending on > needs. > >> What are SPCs? > > Super-privileged containers. Basically, containers that are meant to > manage the host OS. See > https://www.youtube.com/watch?v=eJIeGnHtIYg from DevConf.cz last year. Between ostree, spcs, fs options, and overlays, I think this is a lot to chew on. And a lot of change in a short amount of time. That goes for people doing the work, those who will have to document the differences compared to conventional installs+setup+management, and the users who will have to learn this. A persistent spc to log in to, to manage the host, is problematic for a significant minority of use cases where the storage hardware changes and now the container (currently) isn't aware of this for some tools. So I think that needs more investigation and fixes so that we're not having to document exceptions. It seems to me the easier thing to do is tolerate baking more stuff into the images and ISO. 
Growing that list now, and shrinking it later is a better understood process, can be done faster, and requires fewer resources. And by later, I mean once spcs and overlay stuff are a. more mature b. better understood c. people doing that work have time to do it. The hardware specific utils could go in a metapackage that's enabled for installation by default only on the ISO, and not for images. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Fedora cloud image feedback
On Tue, Dec 15, 2015 at 9:33 PM, Deepak Shetty <dpkshe...@gmail.com> wrote: > Also since cloud image is the most downloaded one, how about > providing a .iso file with pre-loaded user/passwd, so that people > willing to use cloud-image in non-cloud env (local, virt-mgr etc) can use > the iso file as the cloud-init data source ? There is an ISO that can be used to install Fedora Cloud using Anaconda. I suggest using automatic partitioning (avoids problems due to some missing pieces to support custom layouts). https://getfedora.org/en/cloud/download/ In the center of the screen, click on Atomic Images, on the right side is the ISO option. This is an atomic host system, no dnf (except in containers based on images that use it). Another option is to download Workstation or Server ISO *netinstall* and click on Software Selection (in the hub of the installer), and "Fedora Cloud Server" is an option in there. This is a conventionally updated (with dnf) system, it's not an atomic host system. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Cloud Atomic ISO, missing items for baremetal usage
On Tue, Dec 15, 2015, 5:49 PM Chris Murphy <li...@colorremedies.com> wrote: It seems to me the easier thing to do is tolerate baking more stuff into the images and ISO. OK I just rewound this in my head, and said WTS out loud. There's one cloud atomic tree, right? So adding a bunch of hardware stuff affects that whole tree, and everything that uses it. OK instead, more clarity on the downloads page about the limitations of atomic ISO on baremetal, and offer instead Server ISO or netinstall media (non-atomic install) and picking the Cloud Server option in the installer for a more complete and flexible install on hardware. I do still wonder about decoupling kernel from the tree. Kernel regressions happen. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Cloud Atomic ISO, missing items for baremetal usage
On Sun, Dec 13, 2015 at 8:29 PM, Chris Murphy <li...@colorremedies.com> wrote: > It's also best practice to disable the write cache on all drives used > in any kind of RAID. More importantly, all drives in RAID need SCT ERC set, which is also not persistent on non-enterprise drives. That requires smartctl -l scterc,70,70 otherwise read failures don't always get fixed correctly, fester, and can needlessly result in the RAID degrading or failing. And now I see mdadm is not installed either on the ISO. I filed this bug to get dosfstools included, mainly for UEFI systems. https://bugzilla.redhat.com/show_bug.cgi?id=1290575 Should I just file bugs like that for each one of these other missing components, and set them to block a tracker bug for things to include on the ISO? In my view a baremetal installation is first a server, so it should have basic server tools. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
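[Editor's sketch] Since SCT ERC is not persistent on non-enterprise drives, it has to be reapplied on every boot; one shape for that is a loop over the array members. Device names here are placeholders, and names that are not block devices on the current machine (or a missing smartctl) are simply skipped.

```shell
# Set SCT ERC to 7.0 seconds for reads and writes on each RAID member,
# so a drive gives up on a bad sector before the RAID layer times out.
scterc_done=0
for dev in /dev/sdb /dev/sdc /dev/sdd /dev/sde; do
    if [ -b "$dev" ] && command -v smartctl >/dev/null 2>&1; then
        smartctl -l scterc,70,70 "$dev"
    else
        echo "skipping $dev"
    fi
done
scterc_done=1
echo "scterc pass complete"
```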
Re: Cloud Atomic ISO, missing items for baremetal usage
OK at this moment I'm thinking hdparm and smartmontools just need to go on the ISO, along with iotop. While both hdparm and smartmontools appear to work OK in a container with --privileged=true, any hardware changes are not reflected in that container in a way these two programs can see. [root@3d2386bbd250 /]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:00 931.5G 0 disk |-sda1 8:10 200M 0 part |-sda2 8:20 500G 0 part /etc/hosts |-sda3 8:30 500M 0 part |-sda4 8:40 426.5G 0 part `-sda5 8:50 4.3G 0 part [SWAP] ***plug in some drives*** [root@3d2386bbd250 /]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:00 931.5G 0 disk |-sda1 8:10 200M 0 part |-sda2 8:20 500G 0 part /etc/hosts |-sda3 8:30 500M 0 part |-sda4 8:40 426.5G 0 part `-sda5 8:50 4.3G 0 part [SWAP] sdb 8:16 0 698.7G 0 disk sdc 8:32 0 465.8G 0 disk sdd 8:48 0 698.7G 0 disk sde 8:64 0 465.8G 0 disk [root@3d2386bbd250 /]# hdparm -I /dev/sdb /dev/sdb: No such file or directory [root@3d2386bbd250 /]# hdparm -I /dev/sdc /dev/sdc: No such file or directory [root@3d2386bbd250 /]# hdparm -I /dev/sdd /dev/sdd: No such file or directory [root@3d2386bbd250 /]# hdparm -I /dev/sde /dev/sde: No such file or directory [root@3d2386bbd250 /]# smartctl -a /dev/sde smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.6-301.fc23.x86_64] (local build) Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org Smartctl open device: /dev/sde [SAT] failed: No such device Maybe lsblk, by virtue of libblkid, gets some state update for free, I don't know. Clearly that's not the case for hdparm and smartctl, and therefore I have to restart the container or start a new one for the change to be visible to these tools. If I replace or add drives, will I need to restart the container running smartd? If yes, that'd kinda be a regression. Maybe I'm doing something wrong, but at the moment I'm not grokking the advantage of running these tools in a container. 
Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
Re: Cloud Atomic ISO, missing items for baremetal usage
On Sun, Dec 13, 2015 at 7:07 PM, Joe Brockmeier <j...@redhat.com> wrote: > On 12/13/2015 07:11 PM, Chris Murphy wrote: >> OK at this moment I'm thinking hdparm and smartmontools just need to >> go on the ISO, along with iotop. > > What's the usage scenario you're picturing here? This feels to me like a > "pet" usage scenario where you're caring a whole lot about a single > server install. Any server with any number of drives. Best practice is to have smartd monitor drive health and report failures by email or text, rather than via a service disruption or irate human. While smartd could be running in a container, if the container doesn't get state updates when drives are swapped or added, then that requires a workaround: periodically restarting that container. So what's the advantage of running this utility in a container? It's also best practice to disable the write cache on all drives used in any kind of RAID. That's not a persistent setting, so it has to happen every boot. Instead of a boot script or service that does this, a container needs to start up shortly after each boot and do this. What's the benefit of that workflow change? I don't understand that. Another use of hdparm is ATA secure erase before dismissal of drives. If the container not being fully aware of state changes is a bug, then that's fine. In that case a superuser, highly privileged container running persistently with sshd running can then be used to do all these things. But I still don't know what the advantage is, having to remote into that container for some tasks, and into the host itself for other tasks. Don't you think there should be some considerable advantage, commensurate with the workflow change caused by relocating simple tools commonly available on servers, to running only in containers instead? I do. -- Chris Murphy ___ cloud mailing list cloud@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/cloud@lists.fedoraproject.org
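[Editor's sketch] The host-side alternative alluded to above (a boot-time service that disables write caches) is small; this is an illustrative fragment only, with the unit name and device list invented for the example.

```ini
# /etc/systemd/system/disable-write-cache.service (illustrative)
[Unit]
Description=Disable drive write caches on RAID member disks
After=local-fs.target

[Service]
Type=oneshot
# hdparm -W 0 turns off the volatile write cache on each listed device
ExecStart=/usr/sbin/hdparm -W 0 /dev/sdb /dev/sdc /dev/sdd /dev/sde

[Install]
WantedBy=multi-user.target
```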
Re: Cloud Atomic ISO, missing items for baremetal usage
On Fri, Dec 11, 2015 at 12:33 PM, Joe Brockmeier <j...@redhat.com> wrote:
> On 12/11/2015 02:23 PM, Chris Murphy wrote:
>>
>> These I have running in a fedora container. lspci mostly works, but
>> getting full -vvnn detail requires --privileged=true. And the other
>> three require it. iotop additionally needs --net=host. I'd be OK with
>> them just being available in a container, but it might make more sense
>> to just include them in the atomic ISO installation, maybe even
>> borrowing a list from the Server product?
>
> We want, as much as possible, to keep the image small and run all the
> things in containers where possible.
>
> If there's something where that just won't work, or is ludicrously
> difficult, we should discuss including it.

I think these may be needed in the ISO:

cryptsetup - needed to boot encrypted devices

rng-tools - this includes rngd, which seems useful for all containers, especially in a cloud context. Even with --privileged=true I get:

# systemctl start rngd
Failed to get D-Bus connection: Operation not permitted
# systemctl status rngd
Failed to get D-Bus connection: Operation not permitted

Also, a way to separate kernels from the rest of the current tree. Right now I'm on atomic 23.29; the previous tree I have installed is way back at 23 (because it's an ISO installation), but I'm encountering a kernel regression. It's very suboptimal to have to roll back everything to 23 rather than just the kernel. Stepping the kernel forward independently from the cloud atomic host tree is maybe even better in some instances than rolling back.

-- Chris Murphy
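As a quick way to see what rngd would be feeding, the kernel's entropy pool estimate is readable from procfs; this is a generic check, not specific to Atomic or this thread:

```shell
# Current entropy estimate (in bits) for the kernel's pool; rngd's job
# is to keep this topped up from a hardware RNG source.
cat /proc/sys/kernel/random/entropy_avail
```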
PTY allocation request failed on channel 0
Has anyone else run into this? I've never before run into this problem, not on any version of Server or Workstation. Just cloud, and I've run into it several times in just a few days. I'm using public-key authentication to ssh into a Fedora 23 Cloud Atomic ISO installation upgraded to 23.29. This works most of the time, until it doesn't, and then I always get this:

[chris@f23m ~]$ ssh chris@10.0.0.15
PTY allocation request failed on channel 0

If I happen to have an existing login available, even if I restart sshd the problem isn't fixed. The only fix I've found so far is a reboot, which is more than a bit disruptive. Any ideas?

In the journal on the host side, it records a bunch of audit stuff, but three lines seem particularly relevant yet not illuminating:

Dec 12 12:37:10 f23a.localdomain audit[1]: USER_AVC pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='Unknown permission start for class system exe="/usr/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
[snip]
Dec 12 12:27:25 f23a.localdomain sshd[2029]: error: openpty: No such file or directory
Dec 12 12:27:25 f23a.localdomain sshd[2032]: error: session_pty_req: session 0 alloc failed

systemd-222-8.fc23.x86_64

I'd file a bug but I don't even know what to file it against. The full journal output is here: http://fpaste.org/3001

-- Chris Murphy
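The `openpty: No such file or directory` line suggests sshd can't reach the pseudo-terminal devices at all. A generic first check on the host (not something from the original thread) is whether the PTY multiplexer and the devpts mount are still present:

```shell
# sshd allocates PTYs through /dev/ptmx, with slave devices appearing
# under /dev/pts; if either is missing, openpty() fails like the log above.
ls -l /dev/ptmx
mount | grep devpts || echo "devpts is not mounted"
```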
Re: PTY allocation request failed on channel 0
I have a lead. I'm still working on this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1290691

And when I do this:

[root@f23a ~]# docker run --net=host --pid=host -v /dev:/dev --privileged=true fedext /usr/sbin/iotop -d3
Traceback (most recent call last):
  File "/usr/sbin/iotop", line 17, in <module>
    main()
  File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 631, in main
    main_loop()
  File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 621, in <lambda>
    main_loop = lambda: run_iotop(options)
  File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 508, in run_iotop
    return curses.wrapper(run_iotop_window, options)
  File "/usr/lib64/python2.7/curses/wrapper.py", line 22, in wrapper
    stdscr = curses.initscr()
  File "/usr/lib64/python2.7/curses/__init__.py", line 33, in initscr
    fd=_sys.__stdout__.fileno())
_curses.error: setupterm: could not find terminal

After that traceback, any attempt to ssh to the host is munged as previously described. So somehow that docker command puts the host in a state where subsequent ssh attempts fail. Stopping docker and sshd, then starting sshd then docker, doesn't help. I still can't log in. So is this a docker bug?

Chris Murphy
Re: PTY allocation request failed on channel 0
On Sat, Dec 12, 2015 at 1:31 PM, Chris Murphy <li...@colorremedies.com> wrote:
> I have a lead.
>
> I'm still working on this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1290691
>
> And when I do this:
> [root@f23a ~]# docker run --net=host --pid=host -v /dev:/dev

OK perfect. It's user error. The -v /dev:/dev wasn't meant to be literal, but rather /dev/sda:/dev/sda, or whatever. So if I stop doing that nonsense, the login breakage no longer happens.

-- Chris Murphy
Re: root on btrfs and lvmthinp for docker backing confusion
On Thu, Dec 10, 2015 at 9:18 PM, Dusty Mabe <du...@dustymabe.com> wrote:
> On 12/10/2015 08:35 PM, Chris Murphy wrote:
>> Followup question: Does Docker directly use a thin pool without
>> creating (virtual size) logical volumes? Because I don't see any other
>> LV's created, and no XFS filesystem appears on the host using the
>> mount command. And yet I see XFS mount and umount kernel messages on
>> the host. This is sort of an esoteric question. However, I have no
>> access to container files from the host like I can see inside each
>> btrfs subvolume when btrfs is the backing method. And that suggests
>> possibly rather different backup strategies depending on the backing.
>
> I believe it chops it up using low level device mapper stuff. I think
> you don't see the mounts on your host because they are in a different
> mount namespace (part of the magic behind containers).
>
> For more info on docker + device mapper look at slides 37-44 of [1]
>
> [1] - http://www.slideshare.net/Docker/docker-storage-drivers

I read all the slides. That is really helpful; there's quite a bit of detail considering they're slides. It's definitely more device-mapper based than LVM based (makes sense, the driver is "devicemapper" after all). The most that appears in LVM's view is the thin pool, and once Docker owns it, LVM can't make virtual LVs from that pool.

As for the obscurity, on the one hand it's partly perception, because while I'm quite comfortable with LVM tools, I'm not that comfortable with dmsetup; on the other hand, the local backing store should probably be considered disposable, without warning, in a production setup anyway. So some regular sweep of container states (if that's important) should be made into images and put elsewhere. Seriously, if the backing store were to faceplant, it's simply faster to start from the most recent image than attempt repairs.
-- Chris Murphy
Cloud Atomic ISO, missing items for baremetal usage
Hi, Is there a wish list for things to consider adding to a future ISO? These are things not on the current ISO that often are on other baremetal installs like the Workstation and Server products. I don't have enough familiarity to say whether these things should just be included in the base installation, or could be gathered into one or more "util" or "extras" type Fedora docker images. So far I'm running into:

/lib/firmware/ is read-only, so I can't add this:

[ 14.599501] iwlwifi :02:00.0: request for firmware file 'iwlwifi-7265D-13.ucode' failed.

I don't know whether a /var/lib/firmware bind mounted to /lib/firmware can be done soon enough that it'll be picked up by the kernel.

pciutils, which contains lspci
hdparm
smartmontools
iotop

These I have running in a fedora container. lspci mostly works, but getting full -vvnn detail requires --privileged=true. And the other three require it. iotop additionally needs --net=host. I'd be OK with them just being available in a container, but it might make more sense to just include them in the atomic ISO installation, maybe even borrowing a list from the Server product?

-- Chris Murphy
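As an aside, even without pciutils installed, the raw vendor/device IDs that lspci decodes are readable straight from sysfs; a generic fallback sketch, not a substitute for `lspci -vvnn`:

```shell
# Print each PCI address with its vendor:device ID pair, using only
# sysfs; lspci resolves these same IDs against its ID database.
for dev in /sys/bus/pci/devices/*; do
    [ -e "$dev" ] || continue
    printf '%s %s:%s\n' "${dev##*/}" "$(cat "$dev/vendor")" "$(cat "$dev/device")"
done
```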
Re: root on btrfs and lvmthinp for docker backing confusion
On Thu, Dec 10, 2015 at 9:54 AM, Dusty Mabe <du...@dustymabe.com> wrote:
> So you created the VG and the docker-pool LV on your own before
> docker-storage-setup is run?

Yes.

> What I would do is leave sda4 blank and then put the following in your
> config file:
>
> DEVS=/dev/sda4
> VG=vgdocker

Fails with:

Dec 10 11:10:13 f23a.localdomain docker-storage-setup[1135]: Partition specification unsupported at this time.

> What I think this will do is create a PV out of /dev/sda4 and create a
> VG (named vgdocker) on top of it. It will then create the docker-pool
> LV for you and have the docker daemon use that as the backing store.
>
> Let me know if this is what you were looking for or not!

Yes.

-- Chris Murphy
Re: root on btrfs and lvmthinp for docker backing confusion
On Thu, Dec 10, 2015 at 11:47 AM, Jason Brooks <jbro...@redhat.com> wrote:
> You can ignore docker-storage-setup and edit /etc/sysconfig/docker-storage
> yourself.

OK, I've done "systemctl disable docker-storage-setup".

> Here's what it looks like from the f23 vagrant box:
>
> DOCKER_STORAGE_OPTIONS=-s devicemapper --storage-opt dm.fs=xfs --storage-opt
> dm.thinpooldev=/dev/mapper/atomicos-docker--pool --storage-opt
> dm.use_deferred_removal=true

Short version: This works. The Docker service starts, no errors, and it's not using a loopback device.

Followup question: Does Docker directly use a thin pool without creating (virtual size) logical volumes? Because I don't see any other LVs created, and no XFS filesystem appears on the host using the mount command. And yet I see XFS mount and umount kernel messages on the host. This is sort of an esoteric question. However, I have no access to container files from the host like I can see inside each btrfs subvolume when btrfs is the backing method. And that suggests possibly rather different backup strategies depending on the backing.

Long version of the question:

# systemctl stop docker

Copied the generic docker-storage file and edited as follows:

DOCKER_STORAGE_OPTIONS=-s devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/vgfedora-docker--pool --storage-opt dm.use_deferred_removal=true

# systemctl start docker
Dec 10 13:05:02 f23a.localdomain systemd[1]: Starting Docker Application Container Engine...
Dec 10 13:05:06 f23a.localdomain docker[1695]: time="2015-12-10T13:05:06.377269077-07:00" level=info msg="Firewalld running: false"
Dec 10 13:05:06 f23a.localdomain docker[1695]: time="2015-12-10T13:05:06.791189262-07:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.1/16. Daemon option --bip can be used to set a preferred IP address"
Dec 10 13:05:07 f23a.localdomain docker[1695]: time="2015-12-10T13:05:07.572415354-07:00" level=info msg="Loading containers: start."
Dec 10 13:05:07 f23a.localdomain docker[1695]: ..
Dec 10 13:05:07 f23a.localdomain docker[1695]: time="2015-12-10T13:05:07.576094399-07:00" level=info msg="Loading containers: done."
Dec 10 13:05:07 f23a.localdomain docker[1695]: time="2015-12-10T13:05:07.576663930-07:00" level=info msg="Daemon has completed initialization"
Dec 10 13:05:07 f23a.localdomain docker[1695]: time="2015-12-10T13:05:07.576770908-07:00" level=info msg="Docker daemon" commit=f7c1d52-dirty execdriver=native-0.2 graphdriver=devicemapper version=1.9.1-fc23
Dec 10 13:05:07 f23a.localdomain docker[1695]: time="2015-12-10T13:05:07.577004141-07:00" level=info msg="API listen on /var/run/docker.sock"
Dec 10 13:05:07 f23a.localdomain systemd[1]: Started Docker Application Container Engine.

host# docker pull fedora
host# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
fedora              latest              597717fc21bd        2 weeks ago         204 MB

So that all works, and the thin pool data% is also growing after each step, according to lvs. But there is no logical volume, no file system.

host# docker run -i -t fedora /bin/bash
host# mount | grep xfs
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
host# mount | grep ext
/dev/sda3 on /boot type ext4 (rw,relatime,seclabel,stripe=4,data=ordered)

It's working. But is Docker directly using the thin pool without creating a thin logical volume and file system? That's unexpected.

host# lvs
  LV          VG       Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool vgfedora twi-aotz-- 250.00g             0.11   0.53

host# docker pull ubuntu
[...snip pull output...]

-bash-4.3# lvs
  LV          VG       Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool vgfedora twi-aotz-- 250.00g             0.20   0.63

Yeah, it seems to directly use the thin pool without creating a virtual-size LV first.

Kernel messages show a bunch of items like this:

[ 7818.790685] XFS (dm-4): Mounting V5 Filesystem
[ 7819.038211] XFS (dm-4): Ending clean mount
[ 7834.219132] XFS (dm-4): Unmounting Filesystem

So it's using XFS on something that isn't appearing in mount on the host or in the container. And multiple containers all appear to use XFS on the same device-mapper device (dm-4), which does not appear on the host in the /dev/ directory. So this is really... obscured.

-- Chris Murphy
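For what it's worth, those dm-N devices can still be enumerated from the host through sysfs, even though their XFS mounts live in the containers' private mount namespaces. A generic inspection sketch (device names will differ per system; `vgfedora-docker--pool` style names are what would appear on the setup above):

```shell
# Map each device-mapper node to its dm table name; docker's thin
# devices appear here even though their mounts are invisible to the host.
for d in /sys/class/block/dm-*; do
    [ -e "$d" ] || continue
    printf '%s -> %s\n' "${d##*/}" "$(cat "$d/dm/name")"
done
```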
Re: openQA nightly testing of Cloud Atomic installer image
On Tue, Dec 8, 2015 at 11:22 PM, Adam Williamson <adamw...@fedoraproject.org> wrote:
> On Tue, 2015-12-08 at 11:26 -0700, Chris Murphy wrote:
>> On Mon, Dec 7, 2015 at 6:39 PM, Adam Williamson
>> <adamw...@fedoraproject.org> wrote:
>> > Hi folks! For those who aren't aware, Fedora openQA is set up to test
>> > the Atomic installer image nightly - that's the image that uses
>> > anaconda to deploy a fixed Atomic host payload.
>>
>> It exists! I've been looking around for such a thing for a day and
>> only found old blog posts. It's really non-obvious how to find it in
>> koji. I can find lives. I can find other atomics. But somehow this one
>> is like the G train in Brooklyn (OK the G *eventually* does show
>> itself).
>
> It's not built by koji, which is why you can't find it there. It's an
> installer image, it's built by pungi, just like netinst and DVD images.
>
> fedfind can find it, though. ;) That's what fedfind does! It finds
> fed(ora)!

I think it'd be helpful if the releng dashboard listed this build along with the other cloud images. I estimate my recall half-life for fedfind is about 15 days. Does anyone else think it'd be useful to have the nightly atomic installer ISO listed at https://apps.fedoraproject.org/releng-dash/ ? Or maybe even an Atomic-specific section on the dashboard?

>> > https://openqa.fedoraproject.org/ , you should see a build like
>> > 'Build23_Postrelease_20151207' on the front page,
>>
>> OK I click on this, it shows the test, and that it failed. Any chance
>> of it eventually linking to the image it tested so that it's possible
>> to fall back to a manual test
>
> So, answer one: it actually does. The ISO tested is on the Logs &
> Assets page, down at the bottom, under Assets.

When I click on Postrelease_20151209, I end up here:

https://openqa.fedoraproject.org/tests/overview?distri=fedora=23=23_Postrelease_20151209=1

There isn't anything down at the bottom, definitely no Logs & Assets page. Same with the other listings on the main openqa page...

https://drive.google.com/open?id=0B_2Asp8DGjJ9cUpnY2ZuY0lIV2M

>> More likely is if it passes all auto tests to have a link to
>> the image so a manual test can try to blow it up, right?
>
> Eh, my take is that we don't/shouldn't exactly design our manual test
> processes around openQA. This testing (of the post-release nightly
> cloud images) is kind of a bonus thing I rigged up just because
> maxamillion asked and it wasn't too difficult; the main point of openQA
> is to aid in pre-release testing, and of course we have a more
> developed test process there, where we have the regular 'nomination' of
> nightly composes for manual testing, with the wiki pages with download
> links and all the rest of it. We could certainly stand to draw up a
> proper process for manual testing of post-release images, if we're
> going to be releasing them officially, which apparently we are, but I'm
> not the guy who's been keeping up on that stuff so I don't want to leap
> in, I'm sure some folks already have ideas for doing that.

Gotcha. Thanks!

-- Chris Murphy
Re: openQA nightly testing of Cloud Atomic installer image
On Wed, Dec 9, 2015 at 3:00 PM, Adam Williamson <adamw...@fedoraproject.org> wrote:
> That's an overview page. The Logs & Assets tab is available for each
> individual *test* page. Here's the overview for today:
>
> https://openqa.fedoraproject.org/tests/overview?distri=fedora=23=23_Postrelease_20151209=1
>
> Click on the green dot and you see the individual test:

OOH - OK, it's not at all obvious the dot is a link. Is it possible for the test text itself to carry the link?

-- Chris Murphy
Re: openQA nightly testing of Cloud Atomic installer image
On Wed, Dec 9, 2015 at 2:02 PM, Joe Brockmeier <j...@redhat.com> wrote:
> On 12/09/2015 03:48 PM, Chris Murphy wrote:
>> Does anyone else think it'd be useful to
>> have the nightly atomic installer ISO listed at
>> https://apps.fedoraproject.org/releng-dash/ ? Or maybe even an Atomic
>> specific section on the dashboard?
>
> Yes! I totally think it would.

https://fedorahosted.org/fedora-infrastructure/ticket/5026

So that probably needs more clarity, now that I've already submitted it. I asked for a Cloud-specific section rather than an Atomic-specific one. So if Atomic-specific is better organization, add that to the ticket.

-- Chris Murphy
Re: openQA nightly testing of Cloud Atomic installer image
On Wed, Dec 9, 2015 at 3:44 PM, Chris Murphy <li...@colorremedies.com> wrote:
> On Wed, Dec 9, 2015 at 3:00 PM, Adam Williamson
> <adamw...@fedoraproject.org> wrote:
>
>> That's an overview page. The Logs & Assets tab is available for each
>> individual *test* page. Here's the overview for today:
>>
>> https://openqa.fedoraproject.org/tests/overview?distri=fedora=23=23_Postrelease_20151209=1
>>
>> Click on the green dot and you see the individual test:
>
> OOH - OK it's not at all obvious the dot is a link. Is it possible for
> the test text itself having the link?

Or even add the word "Details" to the right of the dot, as the link.

-- Chris Murphy
Re: [cloud] #144: f23 atomic iso configures docker loopback storage
On Wed, Dec 9, 2015 at 12:26 PM, Fedora Cloud Trac Tickets <cloud-t...@fedoraproject.org> wrote:
> #144: f23 atomic iso configures docker loopback storage
> I looked over the kickstarts, but I'm not clear on what might be causing
> this.
> Ticket URL: <https://fedorahosted.org/cloud/ticket/144>

I'm replying to the list to avoid cluttering the ticket. I'm in the vicinity of a related issue, trying to figure out what's doing the setup, because I didn't use the prescribed automatic partitioning layout and therefore don't have any lvmthinp stuff set up yet. The ISO seems to create a system that's somehow making assumptions, but I don't know where it's getting that info. For example, I'm running into this:

● docker-storage-setup.service loaded failed failed Docker Storage Setup
● docker.service loaded failed failed Docker Application Container Engine

The cause of the storage setup service failure is:

Dec 09 16:57:46 f23a.localdomain docker-storage-setup[759]: Volume group "sda2" not found
Dec 09 16:57:46 f23a.localdomain docker-storage-setup[759]: Cannot process volume group sda2
Dec 09 16:57:46 f23a.localdomain docker-storage-setup[759]: Metadata volume docker-poolmeta already exists. Not creating a new one.
Dec 09 16:57:46 f23a.localdomain docker-storage-setup[759]: Please provide a volume group name
Dec 09 16:57:46 f23a.localdomain docker-storage-setup[759]: Run `lvcreate --help' for more information.

But where is it getting the idea there'd be a VG called sda2? OK, so I do:

[chris@f23a ~]$ cat /usr/lib/systemd/system/docker-storage-setup.service
[Unit]
Description=Docker Storage Setup
After=cloud-final.service
Before=docker.service

[Service]
Type=oneshot
ExecStart=/usr/bin/docker-storage-setup
EnvironmentFile=-/etc/sysconfig/docker-storage-setup

[Install]
WantedBy=multi-user.target

That suggests looking at /etc/sysconfig/docker-storage-setup, which does not exist (yet?). I also don't know the significance of the - right after = in that line.

There is an /etc/sysconfig/docker-storage file that contains:

# This file may be automatically generated by an installation program.
# By default, Docker uses a loopback-mounted sparse file in
# /var/lib/docker. The loopback makes it slower, and there are some
# restrictive defaults, such as 100GB max storage.
# If your installation did not set a custom storage for Docker, you
# may do it below.
# Example: Use a custom pair of raw logical volumes (one for metadata,
# one for data).
# DOCKER_STORAGE_OPTIONS = --storage-opt dm.metadatadev=/dev/mylogvol/my-docker-metadata --storage-opt dm.datadev=/dev/mylogvol/my-docker-data
DOCKER_STORAGE_OPTIONS=

So it might be that the Fedora 23 Atomic ISO behavior is the result of upstream behavior, and Fedora 22 had a modifier that no longer exists in Fedora 23?

-- Chris Murphy
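Incidentally, the leading "-" in that EnvironmentFile= line is documented systemd behavior: it marks the file as optional, so the unit starts without error when the file is absent. A minimal illustration:

```ini
[Service]
Type=oneshot
ExecStart=/usr/bin/docker-storage-setup
# The "-" prefix tells systemd to silently skip this file if it does
# not exist; without it, a missing file would fail the unit.
EnvironmentFile=-/etc/sysconfig/docker-storage-setup
```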
Re: openQA nightly testing of Cloud Atomic installer image
On Wed, Dec 9, 2015 at 3:47 PM, Chris Murphy <li...@colorremedies.com> wrote:
> On Wed, Dec 9, 2015 at 2:02 PM, Joe Brockmeier <j...@redhat.com> wrote:
>> On 12/09/2015 03:48 PM, Chris Murphy wrote:
>>> Does anyone else think it'd be useful to
>>> have the nightly atomic installer ISO listed at
>>> https://apps.fedoraproject.org/releng-dash/ ? Or maybe even an Atomic
>>> specific section on the dashboard?
>>
>> Yes! I totally think it would.
>
> https://fedorahosted.org/fedora-infrastructure/ticket/5026
>
> So that probably needs more clarity, now that I've already submitted
> it. I asked for a Cloud specific section rather than Atomic specific.
> So if Atomic specific is better organization, add that to the ticket.

Fedfind finds these cloud-specific products built nightly. The last one, Docker, is in its own category, I guess. So which items in this listing make sense to list under a hypothetical Cloud-specific heading on the releng dashboard? Or should it only list Atomic-specific builds?

https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/i386/Images/Fedora-Cloud-Base-23-20151209.i386.qcow2
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/i386/Images/Fedora-Cloud-Base-23-20151209.i386.raw.xz
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Atomic-23-20151209.x86_64.qcow2
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Atomic-23-20151209.x86_64.raw.xz
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Atomic-Vagrant-23-20151209.x86_64.vagrant-libvirt.box
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Atomic-Vagrant-23-20151209.x86_64.vagrant-virtualbox.box
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Base-23-20151209.x86_64.qcow2
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Base-23-20151209.x86_64.raw.xz
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Base-Vagrant-23-20151209.x86_64.vagrant-libvirt.box
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud-Images/x86_64/Images/Fedora-Cloud-Base-Vagrant-23-20151209.x86_64.vagrant-virtualbox.box
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Cloud_Atomic/x86_64/iso/Fedora-Cloud_Atomic-x86_64-23-20151209.iso
https://dl.fedoraproject.org/pub/alt/atomic/testing/23-20151209/Docker/x86_64/Fedora-Docker-Base-23-20151209.x86_64.tar.xz

-- Chris Murphy
Re: openQA nightly testing of Cloud Atomic installer image
On Mon, Dec 7, 2015 at 6:39 PM, Adam Williamson <adamw...@fedoraproject.org> wrote:
> Hi folks! For those who aren't aware, Fedora openQA is set up to test
> the Atomic installer image nightly - that's the image that uses
> anaconda to deploy a fixed Atomic host payload.

It exists! I've been looking around for such a thing for a day and only found old blog posts. It's really non-obvious how to find it in koji. I can find lives. I can find other atomics. But somehow this one is like the G train in Brooklyn (OK, the G *eventually* does show itself).

> https://openqa.fedoraproject.org/ , you should see a build like
> 'Build23_Postrelease_20151207' on the front page,

OK, I click on this, it shows the test, and that it failed. Any chance of it eventually linking to the image it tested so that it's possible to fall back to a manual test? I don't know if that's even useful. If it fails the auto test, all that probably matters is the fail summary. More likely, if it passes all auto tests, there'd be a link to the image so a manual test can try to blow it up, right?

-- Chris Murphy
Re: Two-Week Atomic actual deliverables
On Fri, Sep 11, 2015 at 10:59 AM, Adam Miller <maxamill...@fedoraproject.org> wrote:
> I'm pretty neutral on B or C. I don't really care and also don't think it
> should even remotely be a concern of ours. Not only do we not have
> testing for it but we don't even have the building blocks in place to
> work towards testing it. VirtualBox is bad and those who use it should
> feel bad.[0]

I'm curious what you think others should feel when they use VMware ESXi or Fusion, or Microsoft Hyper-V, in particular as it compares to the feeling they should have when using VirtualBox? On Windows and OS X, there is no qemu+kvm+libvirt. So I see VirtualBox as the least bad option on those platforms. When I'm using Fedora I use vmm/virsh because, well, yeah, VirtualBox is like the booger I can't flick off on OS X, meanwhile on Fedora there's something better.

> This is probably not a popular opinion and I'm fine with that, but we
> would have to install something that we very publicly speak out
> against in order to test this. I'm not yet ready to throw out Fedora's
> values for the sake of some OS X user's convenience but that's just
> me.

OK, well, consider that the UX of Linux on Macs is highly variable, ranging from totally, utterly frustrating shit to semi-tolerable except for exhibits A, B, C, and D, all of which suck. The incentive, therefore, is to just run proprietary OS X on proprietary hardware and VirtualBox instead of yet more proprietary crap, in order to semi-sanely run something that's not crap or proprietary without having to buy additional hardware and all the costs that ensue. *shrug* It's sorta like playing cards and telling someone they should feel bad about the hand they've been dealt. Their choice was really limited to showing up at a particular game in a particular location, not the details of the hand they're dealt.
-- Chris Murphy
___
cloud mailing list
cloud@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/cloud
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Cloud (_Atomic) selinux labels and restorecon
FYI: restorecon changes many file labels following a clean install:
https://bugzilla.redhat.com/show_bug.cgi?id=1259018

This bug is not Cloud specific, but because Cloud_Atomic is read-only, it can't be fixed with restorecon. I mention this in the bug. I don't know how many metadata changes (selinux policy, permissions, all other xattrs) happen in the course of a release; but in an "Atomic" context, it looks like the only option is to duplicate the affected files, to uniquely set new metadata on just that file in a particular tree. The alternative, changing the metadata on the hardlink, punches through to the original file in a completely different tree, affecting all trees, and is therefore not atomic. (On Btrfs this duplication can be made efficient with reflinks instead of hardlinks, but that's trivia.)

-- Chris Murphy