Re: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

2015-09-15 Thread Sun, Ning
Did you try tboot with the Linux kernel?

-----Original Message-----
From: Ross Philipson [mailto:ross.philip...@gmail.com] 
Sent: Tuesday, September 15, 2015 8:02 AM
To: Sun, Ning; tboot-devel@lists.sourceforge.net
Subject: Re: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

On 09/14/2015 03:54 PM, Sun, Ning wrote:
> Try these commands in a script, and check the print-out after resuming from S3:
> date
> sudo rtcwake -u -s 20 -m mem
> date
>
>
> Thanks,
> -ning
>

I started another thread where I explain what the actual problems are in the 
log compression code. But in order to not ignore this response, I ran the above:

root@xenclient-dom0:/storage# ./do-s3.sh
Tue Sep 15 14:37:17 UTC 2015
wakeup from "mem" at Tue Sep 15 14:37:38 2015
Tue Sep 15 14:38:57 UTC 2015

It is taking 1m 40s to come back. 20s of that is the rtcwake delay, but the 
remaining 1m 20s is spent in LZ_Compress.
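
The arithmetic behind those numbers can be checked with a small sketch. The function names below are hypothetical and only illustrate the subtraction; the sketch assumes both timestamps fall on the same day and says nothing about where the extra time is actually spent.

```c
/* Convert a wall-clock time of day (hh:mm:ss) to seconds since midnight. */
long seconds_of_day(int hours, int minutes, int seconds)
{
    return (long)hours * 3600L + (long)minutes * 60L + (long)seconds;
}

/*
 * Seconds of a suspend/resume cycle that are not accounted for by the
 * rtcwake delay. start and end are seconds since midnight on the same day.
 */
long resume_overhead_seconds(long start, long end, long rtc_delay)
{
    return (end - start) - rtc_delay;
}
```

With the timestamps above, resume_overhead_seconds(seconds_of_day(14, 37, 17), seconds_of_day(14, 38, 57), 20) evaluates to 80, which is the 1m 20s left over after the 20s rtcwake delay.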

[snip]


--
___
tboot-devel mailing list
tboot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tboot-devel


Re: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

2015-09-14 Thread Sun, Ning
Try these commands in a script, and check the print-out after resuming from S3:
date
sudo rtcwake -u -s 20 -m mem
date


Thanks,
-ning

-----Original Message-----
From: Ross Philipson [mailto:ross.philip...@gmail.com] 
Sent: Sunday, September 13, 2015 11:55 AM
To: Sun, Ning; tboot-devel@lists.sourceforge.net
Subject: Re: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

On 09/11/2015 08:20 PM, Sun, Ning wrote:
> Actually, the system resumes from S3 promptly in tboot, but I/O (keyboard, mouse, 
> video) looks blocked for several seconds in the kernel; this needs more investigation...

When S3 works, it is very timely but that is not the issue. I noted in another 
reply to this thread that when I remove these 2 change sets, the hang and 
subsequent system reset do not happen:

http://hg.code.sf.net/p/tboot/code/rev/9040e000ccc4
http://hg.code.sf.net/p/tboot/code/rev/78713e04bdd9

It is very clearly related to these. I also stated that our suspicion is a 
buffer overflow in or related to this code:

http://hg.code.sf.net/p/tboot/code/rev/9040e000ccc4#l3.62

We do have the logging level set to "all".

Thanks
Ross



--
Ross Philipson



Re: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

2015-09-11 Thread Ross Philipson
On 09/10/2015 03:35 PM, Ross Philipson wrote:
> I have been working on moving our project from TBOOT 1.7.0 to 1.8.3. I
> have discovered that while our 1.7.0 version of TBOOT resumes from S3
> just fine, 1.8.3 does not.
>
> The most common symptom seems to be a hang just after TBOOT enters SMX
> mode. The hang happens at different places so I don't think it is one
> specific thing that TBOOT is doing to cause the hang. The hang can be
> short (on the order of seconds) or long (several minutes). Then
> suddenly the platform will "unhang". It looks like the platform restarts
> quickly and goes right back in to TBOOT but it is a little hard to tell
> exactly what happens right around this point. We see this on every
> system we have tried it on.
>
> I have backed all our patches out and I still see the problem with a
> clean 1.8.3 code base. I have also tried using TBOOT with a Debian
> Jessie install and I get similar problems there. I have been comparing
> the code between the versions and have so far not found anything that
> makes a difference.
>
> Any help in this matter is appreciated.
> Thanks

I spent some time tracking this down today. If I build without these two 
patches, I no longer get a hang on resume from S3:

http://hg.code.sf.net/p/tboot/code/rev/9040e000ccc4
http://hg.code.sf.net/p/tboot/code/rev/78713e04bdd9

I have not identified the exact cause yet but I wanted to get these 
results posted to the list.

Thanks

-- 
Ross Philipson



Re: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

2015-09-11 Thread Ross Philipson
On 09/11/2015 05:21 PM, Ahmed, Safayet (GE Global Research) wrote:
> The current bleeding-edge version of TBOOT contains a fix for a 
> stack-overflow problem that was present in the code you cited:
>
> http://sourceforge.net/p/tboot/code/ci/a21e913550866591ba838d49fb3aed43b2f6aadd/tree//tboot/common/printk.c?diff=9040e000ccc4cd85fe8e34616f21e0207808604a
>
> The resulting overflow would overwrite strings in the .rodata section of 
> TBOOT memory. I'm wondering if this was the cause of the hangs.

Hmm, well that certainly sounds like a nasty problem. We were curious 
about the gigantic buffer on the stack in that routine. It seems the BSP 
stack size is still not big enough even with this patch?
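
For illustration only, one common way to keep a large formatting buffer off the stack is to give it static storage and bound every write. This is a sketch under assumptions, not the code from the referenced fix: format_line, line_buf and LINE_BUF_SIZE are hypothetical names, and the size is arbitrary.

```c
#include <stdarg.h>
#include <stdio.h>

#define LINE_BUF_SIZE 256   /* hypothetical size, chosen only for this sketch */

/* Static storage: the buffer does not consume stack space on each call. */
static char line_buf[LINE_BUF_SIZE];

/*
 * Format one message into line_buf. vsnprintf writes at most
 * LINE_BUF_SIZE bytes including the terminating NUL, so an over-long
 * message is truncated instead of overflowing the buffer.
 * Returns the number of characters stored, excluding the NUL.
 */
int format_line(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(line_buf, sizeof line_buf, fmt, ap);
    va_end(ap);

    if (n < 0)
        return 0;                       /* formatting error: report nothing stored */
    if (n >= (int)sizeof line_buf)
        n = (int)sizeof line_buf - 1;   /* output was truncated to fit */
    return n;
}
```

The trade-off is that a shared static buffer is not reentrant, so callers on more than one CPU would need to hold a lock around the call.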

I will try these fixes out, thanks for pointing that out. I suspect we 
may have another problem though. The previous terminating condition was 
correctly wrapping the log in memlog_write. It seems the new logic for 
whether to try to zip does not provide a correct terminating condition 
and the zipping could overflow the log. This is consistent with our 
problem only showing up on resume from S3 since in this case we are 
adding even more to the buffer that is still present from before S3. 
Anyway, needs more investigation...
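
The kind of terminating condition in question can be sketched as follows. This is a hypothetical, simplified log, not tboot's memlog_write or its log structure: demo_log, demo_log_append and LOG_CAPACITY are made-up names. The point it illustrates is that the check compares the incoming length against the space still free, which matters when the buffer already holds data from before S3.

```c
#include <stddef.h>
#include <string.h>

#define LOG_CAPACITY 64u   /* hypothetical; tiny so that wrapping is easy to exercise */

/* Hypothetical log layout for illustration; not tboot's structure. */
struct demo_log {
    size_t used;                 /* bytes currently stored in buf */
    char   buf[LOG_CAPACITY];
};

/*
 * Append len bytes to the log. If they do not fit in the remaining space,
 * discard the existing contents and start again at offset 0. A record
 * larger than the whole buffer is truncated to the capacity.
 * Returns the number of bytes actually stored.
 */
size_t demo_log_append(struct demo_log *dlog, const char *data, size_t len)
{
    if (len > LOG_CAPACITY)
        len = LOG_CAPACITY;               /* never copy more than the buffer holds */
    if (len > LOG_CAPACITY - dlog->used)  /* cannot underflow: used <= LOG_CAPACITY */
        dlog->used = 0;                   /* wrap: restart at the beginning */
    memcpy(dlog->buf + dlog->used, data, len);
    dlog->used += len;
    return len;
}
```

Any step that produces a variable amount of output, such as compression, would have to pass its real output length through the same check before writing.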

Thanks



-- 
Ross Philipson



Re: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

2015-09-11 Thread Sun, Ning
Actually, the system resumes from S3 promptly in tboot, but I/O (keyboard, mouse, 
video) looks blocked for several seconds in the kernel; this needs more investigation...

Thanks,
-ning

-----Original Message-----
From: Ross Philipson [mailto:ross.philip...@gmail.com] 
Sent: Thursday, September 10, 2015 12:36 PM
To: tboot-devel@lists.sourceforge.net
Subject: [tboot-devel] TBOOT 1.8.3 fails to resume from S3

I have been working on moving our project from TBOOT 1.7.0 to 1.8.3. I have 
discovered that while our 1.7.0 version of TBOOT resumes from S3 just fine, 
1.8.3 does not.

The most common symptom seems to be a hang just after TBOOT enters SMX mode. 
The hang happens at different places so I don't think it is one specific thing 
that TBOOT is doing to cause the hang. The hang can be short (on the order of 
seconds) or long (several minutes). Then suddenly the platform will "unhang". 
It looks like the platform restarts quickly and goes right back in to TBOOT but 
it is a little hard to tell exactly what happens right around this point. We 
see this on every system we have tried it on.

I have backed all our patches out and I still see the problem with a clean 
1.8.3 code base. I have also tried using TBOOT with a Debian Jessie install and 
I get similar problems there. I have been comparing the code between the 
versions and have so far not found anything that makes a difference.

Any help in this matter is appreciated.
Thanks
--
Ross Philipson



[tboot-devel] TBOOT 1.8.3 fails to resume from S3

2015-09-10 Thread Ross Philipson
I have been working on moving our project from TBOOT 1.7.0 to 1.8.3. I 
have discovered that while our 1.7.0 version of TBOOT resumes from S3 
just fine, 1.8.3 does not.

The most common symptom seems to be a hang just after TBOOT enters SMX 
mode. The hang happens at different places so I don't think it is one 
specific thing that TBOOT is doing to cause the hang. The hang can be 
short (on the order of seconds) or long (several minutes). Then 
suddenly the platform will "unhang". It looks like the platform restarts 
quickly and goes right back in to TBOOT but it is a little hard to tell 
exactly what happens right around this point. We see this on every 
system we have tried it on.

I have backed all our patches out and I still see the problem with a 
clean 1.8.3 code base. I have also tried using TBOOT with a Debian 
Jessie install and I get similar problems there. I have been comparing 
the code between the versions and have so far not found anything that 
makes a difference.

Any help in this matter is appreciated.
Thanks
-- 
Ross Philipson
