Re: [Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Matthias Beyer
On 04-06-2015 14:15:12, Marc Weber wrote:
> I can boot this way:
> 
> In grub hit e and add boot.shell_on_fail (or shell_on_fail) to the kernel
> parameters. Then you can debug the mounting / boot process, e.g. enter the
> interactive shell and mount your partition manually on /root* something.
> 
> After exiting I can boot. I don't restart often so I didn't investigate.
> 
> A google search told me it could be a race condition.
> 
> Thus have a look at the shell_on_fail within stage-1 and stage-1 init
> scripts to learn more about the startup process - then you can debug.

Also not really user-friendly, is it?

Anyway, I'm not using grub, I'm using gummiboot, and I don't know how
to enter the interactive shell with it. I got it working anyway,
though... so everything is "fine" again.

-- 
Mit freundlichen Grüßen,
Kind regards,
Matthias Beyer

Proudly sent with mutt.
Happily signed with gnupg.


___
nix-dev mailing list
nix-dev@lists.science.uu.nl
http://lists.science.uu.nl/mailman/listinfo/nix-dev


Re: [Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Matthias Beyer
On 04-06-2015 16:16:27, Jascha Geerds wrote:
> https://github.com/NixOS/nixpkgs/issues/7859 ?
> 

Thanks, that helped quite a bit.



Re: [Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Jascha Geerds
https://github.com/NixOS/nixpkgs/issues/7859 ?


-- 
  Jascha Geerds
  j...@ekby.de


Re: [Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Linus Arver
Hello Matthias,

> When rebooting, I booted into my newest generation, which was 109 by
> this time. But I got an error in stage 1, telling me that my root
> partition couldn't be mounted as the device did not come up (LUKS
> encrypted SSD, root on /dev/sda2). It asked me
> 
> "dm_mod" loaded?

FWIW, this is probably relevant:
http://lists.science.uu.nl/pipermail/nix-dev/2015-May/017198.html

I was hit with that regression some weeks ago and had to cherry-pick
the bugfix commit from Nixpkgs.


Re: [Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Marc Weber
I also have a problem with LUKS, which I haven't debugged yet.
Some lines appear saying that the target link already exists.

I can boot this way:

In grub hit e and add boot.shell_on_fail (or shell_on_fail) to the kernel
parameters. Then you can debug the mounting / boot process, e.g. enter the
interactive shell and mount your partition manually on /root* something.

After exiting I can boot. I don't restart often so I didn't investigate.

A Google search told me it could be a race condition.

So have a look at shell_on_fail in the stage-1 init scripts to learn
more about the startup process; then you can debug.

Marc Weber



Re: [Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Matthias Beyer
Hi,

I found and solved the issue.

So first, I checked the kernel versions of my generations:

106 - 3.18.11
107 - 3.18.13
108 - 3.18.13
109 - 3.18.13
110 - 3.18.13
111 - 3.18.13
112 - 3.18.13

(the latest two were other test builds).
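
For anyone who wants to produce a list like the one above: the following is
a minimal sketch, not necessarily how the list was gathered here. It assumes
the usual NixOS layout, where each generation is a
/nix/var/nix/profiles/system-N-link entry containing a "kernel" symlink into
a store path named like ...-linux-3.18.11. Both function names are made up
for illustration.

```shell
#!/bin/sh
# Print the kernel version encoded in a store path such as
# /nix/store/<hash>-linux-3.18.11/bzImage (assumed naming scheme).
kernel_version_from_path() {
    printf '%s\n' "$1" | sed -n 's|.*-linux-\([0-9][0-9A-Za-z.]*\).*|\1|p'
}

# Walk the system profile links and print "<generation> - <kernel version>".
# The profile directory can be overridden with the first argument.
list_generation_kernels() {
    profile_dir="${1:-/nix/var/nix/profiles}"
    for link in "$profile_dir"/system-*-link; do
        # Skip entries without a kernel link (or an unmatched glob).
        [ -e "$link/kernel" ] || continue
        gen="${link##*/system-}"
        gen="${gen%-link}"
        target="$(readlink -f "$link/kernel")"
        printf '%s - %s\n' "$gen" "$(kernel_version_from_path "$target")"
    done
}
```

Running "list_generation_kernels | sort -n" should then print one line per
generation, in numeric order.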

I guessed that the issue was the kernel ... because nothing else
changed. There was no channel update and I didn't fiddle around in the
configuration for the system (at least not in these parts, only
packages and containers a bit).

So I checked out the commit that was used to build generation 106 in my
local clone of the nixpkgs repo:

2d8cfe76a9e4f05e391d30f1654d45dee5993b8a

And rebuilt the system. This worked. Reboot worked too, so the new
generation, 113, is now on kernel 3.18.11 again.

I then tried switching the kernel to 4.0 and 3.19 (in this order,
because newer is better), but both failed to build: the ati
driver can't be built because of _compiler errors_ (seriously, what
the hell?).

So, I'm back on 3.18.11. Unfortunately, I have my own patches, because
otherwise I would have to wait for the channel update. So I re-applied
my patches on top of the mentioned commit:

gitolite v3.6.2 -> v3.6.3
snort: version fix (fixes download error)
daq: version fix (fixes download error)

These are unpublished commits.

---

That's it. While I don't have any concerns about my machine, because
Nix is great and everything and I can boot into old versions of the
system and fix everything and so on, I really can't consider this a
great user experience. Debugging the issue was not that complicated
(despite my little hangover), but unnecessary. I'd really like to see
this situation improved. And I'd really like to see kernel 4.0 on my
machine, as I'm waiting for the AMD graphics card fixes which are in
4.0.

I mean, I build my system from my own clone of nixpkgs. Not the way I
want to:

nix-channel --update
nixos-rebuild switch

which should be used. No, I have to

git checkout 
nixos-rebuild switch -I nixpkgs=~/my/clone/of/nixpkgs
# build fail
# try again other commit
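
The checkout-and-retry loop above could be scripted roughly as follows. This
is only a sketch under assumptions: try_commits and the REBUILD override are
hypothetical names, and it uses "nixos-rebuild build" (build only, no
activation) instead of "switch" so that a failing commit never touches the
running system.

```shell
#!/bin/sh
# try_commits <nixpkgs-clone> <commit>...
# Checks out each commit in turn and stops at the first one that builds.
# Prints that commit and returns 0; returns 1 if none of them build.
# REBUILD is a hypothetical override so the loop can be tried with a stub.
try_commits() {
    clone="$1"
    shift
    for rev in "$@"; do
        # Skip commits that cannot be checked out (e.g. a dirty work tree).
        git -C "$clone" checkout --quiet "$rev" || continue
        # "build" only builds the configuration; it does not activate it.
        if ${REBUILD:-nixos-rebuild} build -I "nixpkgs=$clone" >&2; then
            printf '%s\n' "$rev"
            return 0
        fi
    done
    return 1
}
```

Once a commit is reported, the usual "nixos-rebuild switch -I nixpkgs=..."
against the same clone activates it.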

I don't want to blame anyone for this or something, this mail is
mainly for documenting the issue, but hey, these things really
shouldn't happen, right?

So, to close this here and now... I hope I see some of you guys next
weekend at tuebix in Tübingen, Germany!



Re: [Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Matthias Beyer
Just to add:

the terminal redrawing issue I mentioned is back again.


[Nix-dev] NixOS won't boot anymore in certain generations, don't know why (Stage-1 error)

2015-06-04 Thread Matthias Beyer
Hi,

I have a problem with some of my generations.

Today (after installing chromium, though I don't think this has
anything to do with it), I noticed that my xterms got redrawn after
switching to them (I'm using i3, so if a window is not shown and I
bring it up, a redraw happens).

So, if I switched to a xterm where I executed something like
"alsamixer" or "tree /tmp", it got redrawn line by line.

I did not understand why this happened, but I guessed it was some
driver problem, where a driver went bad or something. So I decided to
reboot.

When rebooting, I booted into my newest generation, which was 109 by
this time. But I got an error in stage 1, telling me that my root
partition couldn't be mounted as the device did not come up (LUKS
encrypted SSD, root on /dev/sda2). It asked me

"dm_mod" loaded?

So I tried previous generations and had success with 106 (107-109 did
not work). I checked my config, and I saw:

boot.initrd.kernelModules = [ "fbcon" "ext4" "dm_crypt" ];

that "dm_mod" was missing, indeed. So I changed it to:

boot.initrd.kernelModules = [ "fbcon" "ext4" "dm_mod" "dm_crypt" ];

And rebuilt the system, resulting in generation 110. I tried to boot
that, but the same error happened.
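
As a side note on the "dm_mod" question: on a generation that does boot, one
can at least check whether the module shows up as loaded. The following is a
minimal sketch with a made-up function name. It assumes dm_mod is built as a
loadable module, since code compiled into the kernel is not listed in
/proc/modules.

```shell
#!/bin/sh
# module_loaded <name> [modules-file]
# Succeeds if <name> is listed as a loaded module. Dashes are mapped to
# underscores, because /proc/modules lists module names with underscores.
module_loaded() {
    name="$(printf '%s' "$1" | tr '-' '_')"
    modules_file="${2:-/proc/modules}"
    # The first column of /proc/modules is the module name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}
```

For example, "module_loaded dm_mod && echo loaded" reports on the running
system.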

I'm on kernel 3_18_4, if this matters.

So my problem is, I don't know what went wrong and how to fix it.
Unfortunately, I don't know which configuration I built generation 106
from (my config is git-tracked). I'd show you the diff of my
generation, but well... I don't know which revision it was.

How can I debug this and, more importantly, how can I fix this?
