Re: [gentoo-user] strange behaviour in quite special case

2017-09-19 Thread Francisco Ares
2017-09-18 19:56 GMT-03:00 Peter Humphrey:

> On Monday, 18 September 2017 14:13:44 BST Francisco Ares wrote:
>
> > After days and days struggling,
>
> I know what you mean. I've spent weeks wrestling with KMail. That included
> losing e-mails, falling behind in conversations and so on.
>
> > I finally upgraded to the newest stable kernel and updated every package
> > with an "emerge -e", just in case, twice! Then I rebuilt the kernel again.
> >
> > So, like a charm, everything got back to work as before.
>
> My technique in such cases is to emerge @system, then recompile the kernel,
> reboot on it and emerge -e world --exclude="gcc gentoo-sources". Seems to
> have worked out all right so far. You could omit the exclusion if you're
> even more paranoid than KMail has made me.
>
> --
> Regards,
> Peter.
>
>
>

Hi, Peter.

Thank you for sharing your experience.  In fact, since an "emerge -e" is
quite automatic and I could leave the system alone for a whole weekend, I
didn't bother (nor did I have the time) to do it in parts to figure out
which step would fix it, especially because on the following Monday it just
_should_ be working, or the launch of the new program version would be
delayed (for who knows how long) and would put my neck at risk ;-).

Best Regards,
Francisco


Re: [gentoo-user] strange behaviour in quite special case

2017-09-18 Thread Peter Humphrey
On Monday, 18 September 2017 14:13:44 BST Francisco Ares wrote:

> After days and days struggling,

I know what you mean. I've spent weeks wrestling with KMail. That included 
losing e-mails, falling behind in conversations and so on.

> I finally upgraded to the newest stable kernel and updated every package
> with an "emerge -e", just in case, twice! Then I rebuilt the kernel again.
> 
> So, like a charm, everything got back to work as before.

My technique in such cases is to emerge @system, then recompile the kernel, 
reboot on it and emerge -e world --exclude="gcc gentoo-sources". Seems to 
have worked out all right so far. You could omit the exclusion if you're 
even more paranoid than KMail has made me.
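
A minimal sketch of that sequence as a shell script. The kernel build
commands and the /usr/src/linux path are assumptions (a manually built
kernel), not something stated in this thread; run, stage1, stage2 and
DRY_RUN are hypothetical names. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the rebuild order described above. DRY_RUN defaults to 1, so the
# commands are only printed; set DRY_RUN=0 (as root) to really execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stage 1: before the reboot.
stage1() {
    run emerge @system
    # Assumption: kernel sources live in /usr/src/linux and are built by hand.
    run make -C /usr/src/linux
    run make -C /usr/src/linux modules_install install
    echo "now reboot into the new kernel, then call stage2"
}

# Stage 2: after rebooting on the new kernel.
stage2() {
    run emerge -e world --exclude="gcc gentoo-sources"
}
```

Call stage1, reboot, then call stage2. Dropping the --exclude argument gives
the more paranoid variant.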

-- 
Regards,
Peter.




Re: [gentoo-user] strange behaviour in quite special case

2017-09-18 Thread Francisco Ares
2017-08-31 4:47 GMT-03:00 Andrew Savchenko:

> Hi,
>
> On Thu, 24 Aug 2017 18:27:22 -0300 Francisco Ares wrote:
> > Hi, All.
> >
> > This is a rather special case, so I don't expect much, but who knows?
> >
> > I've built a Gentoo x86-64 system for an embedded application.
> >
> > Just after a lot of updates, which I am unable to track, it stopped working
> > as usual.
> >
> > There is the development system, loaded with a lot of packages used for
> > development, and the production system, which doesn't need all of those.
> >
> > There is a line in /etc/inittab on both systems responsible for auto-login
> > of the production system user, which in turn starts the programs we need
> > running (from its ".bash_profile" and ".xinitrc"):
> >
> > c6:2345:respawn:/sbin/agetty -a production-user 38400 tty6 linux
> >
> > The development system starts a WindowMaker session, and the production
> > system starts a program that controls the rest of the hardware of this
> > embedded system, with an X11 graphical interface.  That program runs
> > normally when simulated on the development system.
> >
> > The development system runs smoothly.  The production system, after
> > removing the files from undesirable packages and creating a squashfs image
> > of the stripped-down root partition, behaves strangely at boot:
> >
> > It shows the initialization messages as expected, but when the auto-login
> > and the start of the controller program should take place, it completely
> > stalls until I plug in a USB keyboard and press, a few times, some of the
> > key combinations that switch to a text console and back to X11
> > (Ctrl-Alt-F1 and Ctrl-Alt-F6); only then do things resume as expected.
> >
> > As you might suspect, there is no keyboard for the production system ;-) .
> >
> > As a matter of fact, I don't know where the stall takes place, because when
> > I try to switch to a text console to see the logs, it switches back to X11
> > and starts our program.  By the way, the logs just show that the events
> > occurred at later times than expected.
> >
> > Although the squashfs is read-only, some main directories are arranged so
> > that a unionfs mount layers a read-write tmpfs directory over the read-only
> > directory and presents the result at the main directory.  This provides a
> > way of creating temporary files and has been working for a few years now.
> >
> > For instance, in "/etc/fstab":
> >
> > tmpfs   /.etc.rw   tmpfs     defaults,mode=755   0 0
> > union   /etc       unionfs   default_permissions,allow_other,use_ino,nonempty,suid,cow,dirs=/.etc.rw=rw:/.etc.ro=ro   0 0
> >
> > And there is a "/.etc.ro" with a copy of all files present in regular
> > "/etc" , a "/.etc.rw" directory to be mounted tmpfs, and the original
> > "/etc" directory, that needs to be there at boot, even before mounting all
> > this.
> >
> > Does anyone have a clue?
>
> Try to dissect your problem. Start with removing squashfs and all
> tmpfs/unionfs manipulations. Create the same image, but on a "normal"
> writable file system and see how it goes. It may be an fs-related bug,
> or maybe you removed too many files and some "undesired" packages are
> actually mandatory.
>
> If you have some form of snapshots of your changes, you can try to
> bisect them in a git bisect way.
>
> Another approach is to run X server (or any other app suspected as
> a troublemaker) under strace (or attach strace to a running process)
> and see what is going on. You will have a lot of low level
> information and extensive filtering will be required; strace is
> capable of that, but you will need to dig into its documentation.
>
> Best regards,
> Andrew Savchenko
>


Hi All,

After days and days struggling, I finally upgraded to the newest stable
kernel and updated every package with an "emerge -e", just in case, twice!
Then I rebuilt the kernel again.

So, like a charm, everything got back to work as before.

Unfortunately, I will never know which piece of code caused the issue.

Thank you, and

Best Regards,
Francisco


Re: [gentoo-user] strange behaviour in quite special case

2017-08-31 Thread Andrew Savchenko
Hi,

On Thu, 24 Aug 2017 18:27:22 -0300 Francisco Ares wrote:
> Hi, All.
> 
> This is a rather special case, so I don't expect much, but who knows?
> 
> I've built a Gentoo x86-64 system for an embedded application.
> 
> Just after a lot of updates, which I am unable to track, it stopped working
> as usual.
> 
> There is the development system, loaded with a lot of packages used for
> development, and the production system, which doesn't need all of those.
>
> There is a line in /etc/inittab on both systems responsible for auto-login
> of the production system user, which in turn starts the programs we need
> running (from its ".bash_profile" and ".xinitrc"):
> 
> c6:2345:respawn:/sbin/agetty -a production-user 38400 tty6 linux
> 
> The development system starts a WindowMaker session, and the production
> system starts a program that controls the rest of the hardware of this
> embedded system, with an X11 graphical interface.  That program runs
> normally when simulated on the development system.
>
> The development system runs smoothly.  The production system, after
> removing the files from undesirable packages and creating a squashfs image
> of the stripped-down root partition, behaves strangely at boot:
>
> It shows the initialization messages as expected, but when the auto-login
> and the start of the controller program should take place, it completely
> stalls until I plug in a USB keyboard and press, a few times, some of the
> key combinations that switch to a text console and back to X11
> (Ctrl-Alt-F1 and Ctrl-Alt-F6); only then do things resume as expected.
> 
> As you might suspect, there is no keyboard for the production system ;-) .
> 
> As a matter of fact, I don't know where the stall takes place, because when
> I try to switch to a text console to see the logs, it switches back to X11
> and starts our program.  By the way, the logs just show that the events
> occurred at later times than expected.
>
> Although the squashfs is read-only, some main directories are arranged so
> that a unionfs mount layers a read-write tmpfs directory over the read-only
> directory and presents the result at the main directory.  This provides a
> way of creating temporary files and has been working for a few years now.
> 
> For instance, in "/etc/fstab":
> 
> tmpfs   /.etc.rw   tmpfs     defaults,mode=755   0 0
> union   /etc       unionfs   default_permissions,allow_other,use_ino,nonempty,suid,cow,dirs=/.etc.rw=rw:/.etc.ro=ro   0 0
> 
> And there is a "/.etc.ro" with a copy of all files present in regular
> "/etc" , a "/.etc.rw" directory to be mounted tmpfs, and the original
> "/etc" directory, that needs to be there at boot, even before mounting all
> this.
> 
> Does anyone have a clue?

Try to dissect your problem. Start with removing squashfs and all
tmpfs/unionfs manipulations. Create the same image, but on a "normal"
writable file system and see how it goes. It may be an fs-related bug,
or maybe you removed too many files and some "undesired" packages are
actually mandatory.

If you have some form of snapshots of your changes, you can try to
bisect them in a git bisect way.

Another approach is to run X server (or any other app suspected as
a troublemaker) under strace (or attach strace to a running process)
and see what is going on. You will have a lot of low level
information and extensive filtering will be required; strace is
capable of that, but you will need to dig into its documentation.
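
A minimal sketch of attaching strace to an already running process with
some filtering. The syscall list, the default output path and the names
trace_pid and DRY_RUN are assumptions chosen for illustration; which
syscalls are worth watching depends on where the process actually stalls.
By default the command is only printed.

```shell
#!/bin/sh
# Build (and optionally run) an strace command that attaches to a running
# process. DRY_RUN defaults to 1, so the command is only printed.
DRY_RUN="${DRY_RUN:-1}"

# trace_pid PID [OUTFILE]
#   -f   follow child processes and threads
#   -tt  prefix each line with a wall-clock timestamp (microseconds)
#   -e   restrict tracing to a few syscalls a waiting process may sit in
#   -o   write the trace to OUTFILE instead of stderr
#   -p   attach to the given PID
trace_pid() {
    pid="$1"
    out="${2:-/tmp/stall.strace}"
    set -- strace -f -tt -e trace=openat,read,ioctl,poll,select -o "$out" -p "$pid"
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}
```

For example, trace_pid "$(pidof X)" would print the command for the X
server, assuming its process is named X on the system in question. The
timestamps make it easier to see which call the process was blocked in
when the stall ended.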

Best regards,
Andrew Savchenko




[gentoo-user] strange behaviour in quite special case

2017-08-24 Thread Francisco Ares
Hi, All.

This is a rather special case, so I don't expect much, but who knows?

I've built a Gentoo x86-64 system for an embedded application.

Just after a lot of updates, which I am unable to track, it stopped working
as usual.

There is the development system, loaded with a lot of packages used for
development, and the production system, which doesn't need all of those.

There is a line in /etc/inittab on both systems responsible for auto-login
of the production system user, which in turn starts the programs we need
running (from its ".bash_profile" and ".xinitrc"):

c6:2345:respawn:/sbin/agetty -a production-user 38400 tty6 linux

The development system starts a WindowMaker session, and the production
system starts a program that controls the rest of the hardware of this
embedded system, with an X11 graphical interface.  That program runs
normally when simulated on the development system.

The development system runs smoothly.  The production system, after
removing the files from undesirable packages and creating a squashfs image
of the stripped-down root partition, behaves strangely at boot:

It shows the initialization messages as expected, but when the auto-login
and the start of the controller program should take place, it completely
stalls until I plug in a USB keyboard and press, a few times, some of the
key combinations that switch to a text console and back to X11
(Ctrl-Alt-F1 and Ctrl-Alt-F6); only then do things resume as expected.

As you might suspect, there is no keyboard for the production system ;-) .

As a matter of fact, I don't know where the stall takes place, because when
I try to switch to a text console to see the logs, it switches back to X11
and starts our program.  By the way, the logs just show that the events
occurred at later times than expected.

Although the squashfs is read-only, some main directories are arranged so
that a unionfs mount layers a read-write tmpfs directory over the read-only
directory and presents the result at the main directory.  This provides a
way of creating temporary files and has been working for a few years now.

For instance, in "/etc/fstab":

tmpfs   /.etc.rw   tmpfs     defaults,mode=755   0 0
union   /etc       unionfs   default_permissions,allow_other,use_ino,nonempty,suid,cow,dirs=/.etc.rw=rw:/.etc.ro=ro   0 0

And there is a "/.etc.ro" with a copy of all files present in regular
"/etc" , a "/.etc.rw" directory to be mounted tmpfs, and the original
"/etc" directory, that needs to be there at boot, even before mounting all
this.

Does anyone have a clue?

Thanks!
Francisco