Re: [MTT devel] GDS errors

2010-02-12 Thread Jeff Squyres
Huh.  Still weird, then, that it would be 667 bytes long.

How do I diagnose this further?
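
If it helps, this is the kind of thing I would try on the server side: a tiny
helper (the name and placement are my own invention -- nothing like it exists in
gds/main.py today) called right before test_run_phase.put() to log any over-long
string property and a prefix of its value:

import logging
from google.appengine.ext import db

def log_long_strings(entity, limit=500):
    # Walk every StringProperty on the entity and log the ones that exceed
    # the datastore's 500-byte limit for indexed strings.
    for name, prop in entity.properties().items():
        if isinstance(prop, db.StringProperty):
            value = getattr(entity, name)
            if value is not None and len(value) > limit:
                logging.warning("%s is %d bytes; starts with %r",
                                name, len(value), value[:80])

That would at least show what the client is actually stuffing into
data_message_size.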


On Feb 12, 2010, at 12:27 PM, Igor Ivanov wrote:

> Our naming convention is that all result-related fields are named data_xxx,
> so data_message_size is the message_size field from OSU, IMB, etc. (2,4,8).
> 
> 
> 
> Jeff Squyres wrote:
>> On Feb 12, 2010, at 9:45 AM, Igor Ivanov wrote:
>> 
>>   
>> 
>>> Look at the message string:
>>> BadValueError: Property data_message_size is 667 bytes long; it must be 500 
>>> or less. Consider Text instead, which can store strings of any length.
>>> 
>>> 
>> 
>> Ah, ok.
>> 
>> What is data_message_size, and why would my submits have a value that would 
>> be 667 bytes long?  From the variable name, I would assume that it's a 
>> number, in which case I can't imagine that it should be more than a few 
>> bytes long...?
>> 
>>   
>> 
>>> Regards,
>>> Igor
>>> 
>>> 
>>> Jeff Squyres wrote:
>>> 
>>> 
>>>> Looking in the appspot dashboard, I see a bunch of errors when Cisco tried 
>>>> to submit test run data.  There's a few random errors, but a bunch that 
>>>> look like what I pasted below.  How do I diagnose this further?  Clearly, 
>>>> some field is too long -- how do I find out which one?
>>>> 
>>>> -
>>>>• 128.107.241.170 - - [11/Feb/2010:00:51:21 -0800] "POST /client 
>>>> HTTP/1.1" 500 1972 - "MPI Test MTTGDS Reporter,gzip(gfe)" 
>>>> "open-mpi-mtt.appspot.com"
>>>>• E02-11 12:51AM 21.241
>>>> Property data_message_size is 667 bytes long; it must be 500 or less. 
>>>> Consider Text instead, which can store strings of any length.
>>>> Traceback (most recent call last):
>>>>   File 
>>>> "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", 
>>>> line 509, in __call__
>>>> handler.post(*groups)
>>>>   File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", 
>>>> line 961, in post
>>>> status = self._submit();
>>>>   File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", 
>>>> line 485, in _submit
>>>> test_run_phase.put()
>>>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>>>> line 801, in put
>>>> self._populate_internal_entity()
>>>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>>>> line 779, in _populate_internal_entity
>>>> self._entity = self._populate_entity(_entity_class=_entity_class)
>>>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>>>> line 839, in _populate_entity
>>>> self._to_entity(entity)
>>>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>>>> line 1465, in _to_entity
>>>> entity[key] = value
>>>>   File "/base/python_lib/versions/1/google/appengine/api/datastore.py", 
>>>> line 492, in __setitem__
>>>> datastore_types.ValidateProperty(name, value)
>>>>   File 
>>>> "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", 
>>>> line 1290, in ValidateProperty
>>>> prop_validator(name, v)
>>>>   File 
>>>> "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", 
>>>> line 1181, in ValidatePropertyString
>>>> ValidateStringLength(name, value, max_len=_MAX_STRING_LENGTH)
>>>>   File 
>>>> "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", 
>>>> line 1171, in ValidateStringLength
>>>> (name, len(value), max_len))
>>>> BadValueError: Property data_message_size is 667 bytes long; it must be 
>>>> 500 or less. Consider Text instead, which can store strings of any length.
>>>> -
>>>> 
>>>>   
>>>> 
>>>>   
>>>> 
>> 
>> 
>>   
>> 
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] MTToGDS

2010-02-12 Thread Jeff Squyres
Great -- many thanks!

On Feb 12, 2010, at 12:32 PM, Igor Ivanov wrote:

> Hi Jeff,
> 
> I have made the changes related to Google account support, but have not tested them well.
> I will try to send them on Monday.
> 
> Regards,
> Igor
> 
> Jeff Squyres wrote:
>> On Feb 10, 2010, at 9:09 AM, Igor Ivanov wrote:
>> 
>>   
>> 
>>>> I took a swipe at doing this (totally not tested; how does one 
>>>> develop/test this stuff?).  I know just a tiny bit of python, but the code 
>>>> was fairly readable.  Please see the attached patch -- is it anywhere 
>>>> close to correct?
>>>> 
>>>>   
>>>> 
>>> [II] It seems close, but you forgot about bquery.pl, which allows adding a new
>>> user, and the related handler (which processes bquery.pl --admin) in gds/main.py,
>>> at least.
>>> 
>>> 
>> 
>> Oh, yikes -- good catch.  I'll look into that...
>> 
>> How does one develop / test / debug / deploy changes to this stuff?
>> 
>>   
>> 
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] MTToGDS

2010-02-12 Thread Igor Ivanov




Hi Jeff,

I have made the changes related to Google account support, but have not tested
them well.
I will try to send them on Monday.

Regards,
Igor

Jeff Squyres wrote:

  On Feb 10, 2010, at 9:09 AM, Igor Ivanov wrote:

  
  

  I took a swipe at doing this (totally not tested; how does one develop/test this stuff?).  I know just a tiny bit of python, but the code was fairly readable.  Please see the attached patch -- is it anywhere close to correct?

  

[II] It seems close, but you forgot about bquery.pl, which allows adding a new user, and the related handler (which processes bquery.pl --admin) in gds/main.py, at least.

  
  
Oh, yikes -- good catch.  I'll look into that...

How does one develop / test / debug / deploy changes to this stuff?

  









Re: [MTT devel] GDS errors

2010-02-12 Thread Igor Ivanov




Our naming convention is that all result-related fields are named data_xxx,
so data_message_size is the message_size field from OSU, IMB, etc. (2,4,8).
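
For reference, the property that is overflowing looks roughly like this on the
server (a sketch only; the real definition is in gds/main.py and may differ):

from google.appengine.ext import db

class TestRunPhase(db.Model):
    # Sketch; the real model has many more properties.
    data_message_size = db.StringProperty()   # indexed, but capped at 500 bytes

# Switching to db.TextProperty() would remove the 500-byte limit, but Text
# values are not indexed, so they cannot be used in GQL where clauses or
# sort orders.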



Jeff Squyres wrote:

  On Feb 12, 2010, at 9:45 AM, Igor Ivanov wrote:

  
  
Look at the message string:
BadValueError: Property data_message_size is 667 bytes long; it must be 500 or less. Consider Text instead, which can store strings of any length.

  
  
Ah, ok.

What is data_message_size, and why would my submits have a value that would be 667 bytes long?  From the variable name, I would assume that it's a number, in which case I can't imagine that it should be more than a few bytes long...?

  
  
Regards,
Igor


Jeff Squyres wrote:


  Looking in the appspot dashboard, I see a bunch of errors when Cisco tried to submit test run data.  There's a few random errors, but a bunch that look like what I pasted below.  How do I diagnose this further?  Clearly, some field is too long -- how do I find out which one?

-
	• 128.107.241.170 - - [11/Feb/2010:00:51:21 -0800] "POST /client HTTP/1.1" 500 1972 - "MPI Test MTTGDS Reporter,gzip(gfe)" "open-mpi-mtt.appspot.com"
	• E02-11 12:51AM 21.241
Property data_message_size is 667 bytes long; it must be 500 or less. Consider Text instead, which can store strings of any length.
Traceback (most recent call last):
  File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 509, in __call__
handler.post(*groups)
  File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", line 961, in post
status = self._submit();
  File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", line 485, in _submit
test_run_phase.put()
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 801, in put
self._populate_internal_entity()
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 779, in _populate_internal_entity
self._entity = self._populate_entity(_entity_class=_entity_class)
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 839, in _populate_entity
self._to_entity(entity)
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1465, in _to_entity
entity[key] = value
  File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 492, in __setitem__
datastore_types.ValidateProperty(name, value)
  File "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1290, in ValidateProperty
prop_validator(name, v)
  File "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1181, in ValidatePropertyString
ValidateStringLength(name, value, max_len=_MAX_STRING_LENGTH)
  File "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1171, in ValidateStringLength
(name, len(value), max_len))
BadValueError: Property data_message_size is 667 bytes long; it must be 500 or less. Consider Text instead, which can store strings of any length.
-



  



  
  

  









Re: [MTT devel] MTToGDS

2010-02-12 Thread Jeff Squyres
On Feb 10, 2010, at 9:09 AM, Igor Ivanov wrote:

>> I took a swipe at doing this (totally not tested; how does one develop/test 
>> this stuff?).  I know just a tiny bit of python, but the code was fairly 
>> readable.  Please see the attached patch -- is it anywhere close to correct?
>> 
> [II] It seems close, but you forgot about bquery.pl, which allows adding a new
> user, and the related handler (which processes bquery.pl --admin) in gds/main.py,
> at least.

Oh, yikes -- good catch.  I'll look into that...

How does one develop / test / debug / deploy changes to this stuff?

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] More GDS questions

2010-02-12 Thread Jeff Squyres
On Feb 12, 2010, at 11:36 AM, Andrew Senin wrote:

> I would also like to add to Igor's comment that the CPU time shown by Google is
> the sum over all CPUs of the distributed system involved in the update operation
> (and who knows how many servers are involved?). Also, they "define 'CPU hour' in
> terms of a hypothetical 1.4 GHz processor, whereas the actual processors we use
> in production vary but are generally faster than this" (see the comment from
> DonSchwarz:
> http://groups.google.com/group/google-appengine-java/browse_thread/thread/aa9f18638b7bbea9?pli=1).
> According to the same topic, 6.5 CPU hours is about 2.3 minutes of real time.  I
> think you could try to remove some of the indexes that need to be updated on
> each new file upload (see Datastore Indexes on the web admin console).

Excellent information.  Google seemed to agree that 2.3 mins of real time 
should be nowhere near the 6.5 CPU hour quota, and they claimed that they fixed at 
least one issue regarding bulk uploads.

However, this thread does imply that trickling in data over time instead of 
doing bulk uploads is a good idea.

Was the rationale of caching all MTTGDS info during the Submit phase and 
actually uploading it during Finalize just a measure to reduce submission 
latency?

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] MTT GDS -- one more...

2010-02-12 Thread Jeff Squyres
On Feb 12, 2010, at 11:35 AM, Andrew Senin wrote:

> I worked with Igor on the GDS framework (although Igor knows more tech
> details than me).  Let me add my two cents to the discussion.

Thanks!

> > 1. It looks like the main benefits of using the Google App Engine --
> specifically for MTT -- is that we can use the GDS and/or we can host an
> application on their web servers.  Is that correct?
> 
> I think so, yes.  Also, GDS should be faster than a relational DB on large
> amounts of data.

Cool.  The speed is also a good/important point for us -- our current SQL 
server is kinda creaking under the load.  Josh spent quite a bit of time 
optimizing the database that we have now (you should have seen how slow it used 
to be!), so moving to a faster platform is desirable.

> > 2. In reading through the Google Appengine docs, the GDS stuff looks like
> we mainly can access the data through GQL.  I don't see any mention of doing
> map/reduce kinds of computations (Ethan and I were talking on the phone
> today about MTT Appengine possibilities).  I'm new to all this stuff, so
> it's quite possible that a) I missed it, or b) I just don't understand what
> I'm seeing/reading yet.  Or does GQL do map/reduce on the back end to do its
> magic?  Is GQL the main/only way we have to access GDS?
> 
> As far as Igor and I know, there is no way of doing map/reduce with GDS.  And
> GQL (or filters, which are practically synonymous) is the main and only way to
> access GDS data.

Ok, good.  Just wanted to make sure we understood that point properly and 
weren't missing anything.

> > 3. Is there a reason that MTTGDS.pm doesn't use the python API to directly
> talk to GDS?  I.e., what is the rationale for using a web app on appengine?
> Is the web app doing stuff that we can't do at the client?  Ditto for
> bquery.pl and breport.pl.  (these questions are partially fueled by my
> curiosity and concern about why we're using so much CPU at Google)
> 
> There are a few reasons for doing it.  The first is speed.  When we post new
> data, we first try to find whether there is already a copy of the corresponding
> MpiInfo, ClustreInfo, and other *Info entities.  If we did that directly from the
> client scripts, the delays would be higher (depending on Internet connection speed).
> The price is additional CPU cycles on Google's servers.

FWIW, I don't think I'm concerned about the speed of submitting.  MTT runs can 
go for hours.  If it takes 2 seconds to submit or 20, I'm not concerned about 
it -- a few round-trip latencies + some GQL lookups are still a very small 
fraction of the overall MTT run time.  If CPU is going to be an issue, I 
wouldn't mind doing some of these lookups from the client (and potentially even 
caching some of the IDs on the client -- like we do on the SQL submission 
reporter), and then just submitting those IDs in the "main submit".

> The second and more
> important reason is that when we have such logic on the server, we (rather than the
> GDS clients) are responsible for maintaining the correct structure of links between
> objects.  If such logic were implemented on the client side, a user could (by mistake
> or on purpose) break the links between objects.

Ah yes, this is a very good reason.

I would also imagine that without the web interface, we would be limited to 
talking to the GDS under a single username/password (i.e., the owner of the 
appspot), which is also undesirable.

Thanks for the info!

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] More GDS questions

2010-02-12 Thread Andrew Senin
Hello Jeff, 



I would also like to add to Igor's comment that the CPU time shown by Google is
the sum over all CPUs of the distributed system involved in the update operation
(and who knows how many servers are involved?). Also, they "define 'CPU hour' in
terms of a hypothetical 1.4 GHz processor, whereas the actual processors we use
in production vary but are generally faster than this" (see the comment from
DonSchwarz:
http://groups.google.com/group/google-appengine-java/browse_thread/thread/aa9f18638b7bbea9?pli=1).
According to the same topic, 6.5 CPU hours is about 2.3 minutes of real time.  I
think you could try to remove some of the indexes that need to be updated on each
new file upload (see Datastore Indexes on the web admin console).
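
For reference, composite indexes are declared in the application's index.yaml;
removing an entry there means that index no longer has to be rewritten on every
put() (and, if I remember correctly, appcfg.py vacuum_indexes is what actually
deletes it afterwards). The kind and property names below are purely
illustrative, not taken from the real index.yaml:

indexes:
- kind: TestRunPhase
  properties:
  - name: status
  - name: data_message_size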



Regards, 

Andrew Senin.



From: Igor Ivanov [mailto:igor.iva...@argus-cv.com] 
Sent: Friday, February 12, 2010 6:02 PM
To: Jeff Squyres
Cc: Development list for the MPI Testing Tool; b...@argus-cv.com; Yiftah
Shahar
Subject: Re: More GDS questions



Hi Jeff,

You touched a sore point.  The App Engine forums are filled with questions like
yours.
I don't know a clear answer right now.

Igor

Jeff Squyres wrote: 

Igor et al. -- 

1. I'm not sure you saw Ethan's and my posts from the past day or so about
GDS on the mtt-devel list; it just occurred to me that I don't know if
you're members of the list or not.  We've posted a few questions and
comments that you may not have received if you're not on the list:

http://www.open-mpi.org/community/lists/mtt-devel/2010/02/index.php

2. I'm still looking into the perl syntax error that caused my Big Submit to
GDS to fail.  But looking at the Google logs, it looks like at least *some*
of my test run results made it up to GDS.  There was a BIG spike in CPU
usage (3.2 hours of CPU time!) when it submitted -- see the attached CPU
usage graph from the apps dashboard.

Does anyone know why it takes so much CPU just to submit data to GDS?  3.2
CPU hours is a LOT!

It makes me a bit concerned that only part of a single Cisco MTT run submit
chewed through almost half of our daily CPU quota (6.5 CPU hours/day).  Is
there any way to reduce the amount of CPU necessary just to submit data?















Re: [MTT devel] MTT GDS -- one more...

2010-02-12 Thread Andrew Senin
Hello Jeff, 

I worked with Igor on the GDS framework (although Igor knows more tech
details than me).  Let me add my two cents to the discussion.

> 1. It looks like the main benefits of using the Google App Engine --
specifically for MTT -- is that we can use the GDS and/or we can host an
application on their web servers.  Is that correct?

I think so, yes.  Also, GDS should be faster than a relational DB on large
amounts of data.

> 2. In reading through the Google Appengine docs, the GDS stuff looks like
we mainly can access the data through GQL.  I don't see any mention of doing
map/reduce kinds of computations (Ethan and I were talking on the phone
today about MTT Appengine possibilities).  I'm new to all this stuff, so
it's quite possible that a) I missed it, or b) I just don't understand what
I'm seeing/reading yet.  Or does GQL do map/reduce on the back end to do its
magic?  Is GQL the main/only way we have to access GDS?

As far as Igor and I know, there is no way of doing map/reduce with GDS.  And
GQL (or filters, which are practically synonymous) is the main and only way to
access GDS data.
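
For example, these two forms are the same query (a sketch; the real TestRunPhase
model lives in gds/main.py, and the stub below is only a stand-in):

from google.appengine.ext import db

class TestRunPhase(db.Model):      # stand-in; the real model is in gds/main.py
    status = db.IntegerProperty()

# GQL form (roughly what a bquery.pl --gqls string corresponds to):
gql_results = db.GqlQuery("SELECT * FROM TestRunPhase WHERE status = :1", 1).fetch(100)

# Equivalent filter form:
filter_results = TestRunPhase.all().filter("status =", 1).fetch(100)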

> 3. Is there a reason that MTTGDS.pm doesn't use the python API to directly
talk to GDS?  I.e., what is the rationale for using a web app on appengine?
Is the web app doing stuff that we can't do at the client?  Ditto for
bquery.pl and breport.pl.  (these questions are partially fueled by my
curiosity and concern about why we're using so much CPU at Google)

There are a few reasons for doing it.  The first is speed.  When we post new
data, we first try to find whether there is already a copy of the corresponding
MpiInfo, ClustreInfo, and other *Info entities.  If we did that directly from the
client scripts, the delays would be higher (depending on Internet connection
speed).  The price is additional CPU cycles on Google's servers.  The second and
more important reason is that when we have such logic on the server, we (rather
than the GDS clients) are responsible for maintaining the correct structure of
links between objects.  If such logic were implemented on the client side, a user
could (by mistake or on purpose) break the links between objects.
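
In rough form, that server-side step looks something like this (the entity and
property names here are illustrative, not the actual main.py code):

from google.appengine.ext import db

class MpiInfo(db.Model):               # illustrative stand-in for one *Info model
    name = db.StringProperty()
    version = db.StringProperty()

def find_or_create_mpi_info(name, version):
    # Reuse an existing MpiInfo entity if one matches; otherwise create it once
    # so that every submission links to the same shared object.
    existing = (MpiInfo.all()
                .filter("name =", name)
                .filter("version =", version)
                .get())
    if existing is not None:
        return existing
    info = MpiInfo(name=name, version=version)
    info.put()
    return info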

Regards, 
Andrew Senin

-Original Message-
List-Post: mtt-devel@lists.open-mpi.org
Date: Thu, 11 Feb 2010 21:43:21 -0500
From: Jeff Squyres 
Subject: [MTT devel] MTT GDS -- one more...
To: Development list for the MPI Testing Tool 
Message-ID: <8a556b10-1618-47ea-96a9-33f22aecd...@cisco.com>
Content-Type: text/plain; charset=us-ascii

Heh... even more questions...

(BTW, Ethan and I have asked so many questions that, if it helps, I can
set up a webex and we can all discuss this in person rather than via
1,000,000 annoying emails from us.  :-)  Webex can call you; no one will
need to pay for an international call)

1. It looks like the main benefits of using the Google App Engine --
specifically for MTT -- is that we can use the GDS and/or we can host an
application on their web servers.  Is that correct?

2. In reading through the Google Appengine docs, the GDS stuff looks like we
mainly can access the data through GQL.  I don't see any mention of doing
map/reduce kinds of computations (Ethan and I were talking on the phone
today about MTT Appengine possibilities).  I'm new to all this stuff, so
it's quite possible that a) I missed it, or b) I just don't understand what
I'm seeing/reading yet.  Or does GQL do map/reduce on the back end to do its
magic?  Is GQL the main/only way we have to access GDS?

3. Is there a reason that MTTGDS.pm doesn't use the python API to directly
talk to GDS?  I.e., what is the rationale for using a web app on appengine?
Is the web app doing stuff that we can't do at the client?  Ditto for
bquery.pl and breport.pl.  (these questions are partially fueled by my
curiosity and concern about why we're using so much CPU at Google)

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] MTTGDS issues

2010-02-12 Thread Jeff Squyres
On Feb 12, 2010, at 2:08 AM, Igor Ivanov wrote:

>> *** WARNING: Could not run module
>> MTT::Test::Analyze::Performance::NetPipe:PreReport: Undefined
>> subroutine &MTT::Test::Analyze::Performance::NetPipe::PreReport called
>> at (eval 335838) line 1.
>> 
> [II] It is the same thing as the "analyze_module" warning you got before.
> There is a set of values that we would like to store in the datastore, but they
> do not exist in the original analyzers.  To avoid conflicting with the original
> MTT procedure, a _pre_process_phase call is made to get that info from a special
> function, PreReport, integrated into the original analyzers.

Gotcha.  I just committed some fixes for this (test for PreReport existence 
before trying to call it).

>> *** ERROR: Module aborted: MTT::Reporter::MTTGDS:Finalize: Nested
>> quantifiers in regex; marked by <-- HERE in m/\s[\S/\\]*mpi2c++ <--
>> HERE _test.*/ at /home/jsquyres/svn/mtt/lib/MTT/Reporter/MTTGDS.pm line
>> 498.

I just submitted some fixes for this, too.  I think it's a safer way to extract 
the mca params.

>> Is there a way to re-submit my data to GDS?
>> 
> [II] Yes.  You can upload the data (datafile.yaml) from its local location to the
> datastore using bquery.pl --upload.  The place where the collected data is kept
> can be controlled by "repository_tempdir" and "repository_dirname_prefix":
> [VBench]
> repository_tempdir=&scratch_root()/gds_data
> repository_dirname_prefix=gds
> submit_failed_results_to_gds=0

Cool; good to know.  I'll just re-run stuff for now and re-submit -- easy 
enough.

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] More GDS questions

2010-02-12 Thread Jeff Squyres
On Feb 12, 2010, at 10:02 AM, Igor Ivanov wrote:

> You touched a sore point.  The App Engine forums are filled with questions like
> yours.
> I don't know a clear answer right now.

Ok, bummer.  :-)

>> 2. I'm still looking into the perl syntax error that caused my Big Submit to 
>> GDS to fail.  But looking at the Google logs, it looks like at least *some* 
>> of my test run results made it up to GDS.  There was a BIG spike in CPU 
>> usage (3.2 hours of CPU time!) when it submitted -- see the attached CPU 
>> usage graph from the apps dashboard.
>> 
>> Does anyone know why it takes so much CPU just to submit data to GDS?  3.2 
>> CPU hours is a LOT!
>> 
>> It makes me a bit concerned that only part of a single Cisco MTT run submit 
>> chewed through almost half of our daily CPU quota (6.5 CPU hours/day).  Is 
>> there any way to reduce the amount of CPU necessary just to submit data?

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] GDS errors

2010-02-12 Thread Jeff Squyres
On Feb 12, 2010, at 9:45 AM, Igor Ivanov wrote:

> Look at the message string:
> BadValueError: Property data_message_size is 667 bytes long; it must be 500 
> or less. Consider Text instead, which can store strings of any length.

Ah, ok.

What is data_message_size, and why would my submits have a value that would be 
667 bytes long?  From the variable name, I would assume that it's a number, in 
which case I can't imagine that it should be more than a few bytes long...?

> Regards,
> Igor
> 
> 
> Jeff Squyres wrote:
>> Looking in the appspot dashboard, I see a bunch of errors when Cisco tried 
>> to submit test run data.  There's a few random errors, but a bunch that look 
>> like what I pasted below.  How do I diagnose this further?  Clearly, some 
>> field is too long -- how do I find out which one?
>> 
>> -
>>  • 128.107.241.170 - - [11/Feb/2010:00:51:21 -0800] "POST /client 
>> HTTP/1.1" 500 1972 - "MPI Test MTTGDS Reporter,gzip(gfe)" 
>> "open-mpi-mtt.appspot.com"
>>  • E02-11 12:51AM 21.241
>> Property data_message_size is 667 bytes long; it must be 500 or less. 
>> Consider Text instead, which can store strings of any length.
>> Traceback (most recent call last):
>>   File 
>> "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 
>> 509, in __call__
>> handler.post(*groups)
>>   File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", 
>> line 961, in post
>> status = self._submit();
>>   File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", 
>> line 485, in _submit
>> test_run_phase.put()
>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>> line 801, in put
>> self._populate_internal_entity()
>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>> line 779, in _populate_internal_entity
>> self._entity = self._populate_entity(_entity_class=_entity_class)
>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>> line 839, in _populate_entity
>> self._to_entity(entity)
>>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", 
>> line 1465, in _to_entity
>> entity[key] = value
>>   File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 
>> 492, in __setitem__
>> datastore_types.ValidateProperty(name, value)
>>   File 
>> "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 
>> 1290, in ValidateProperty
>> prop_validator(name, v)
>>   File 
>> "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 
>> 1181, in ValidatePropertyString
>> ValidateStringLength(name, value, max_len=_MAX_STRING_LENGTH)
>>   File 
>> "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 
>> 1171, in ValidateStringLength
>> (name, len(value), max_len))
>> BadValueError: Property data_message_size is 667 bytes long; it must be 500 
>> or less. Consider Text instead, which can store strings of any length.
>> -
>> 
>>   
>> 
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[MTT devel] extracting mca params

2010-02-12 Thread Jeff Squyres
I see blocks of code like this in MTTGDS.pm and some of the new analyzers:

my $mca = $report->{command};
$mca =~ s/^\s*$report->{launcher}//;            # strip the launcher (e.g., mpirun)
$mca =~ s/\s(-n|--n|-np|--np)\s\S+//;           # strip -np <N>
$mca =~ s/\s(-rf|--rankfile)\s\S+//;            # strip rankfile arguments
$mca =~ s/\s(-hostfile|--hostfile)\s\S+//;      # strip hostfile arguments
$mca =~ s/\s(-host|--host)\s\S+//;              # strip host lists
$mca =~ s/\s*(-x)\s\S+//g;                      # strip -x environment exports
$mca =~ s/\s[\S\/\\]*$report->{test_name}.*//;  # strip the test executable and its args
$mca =~ s/\s\s/ /g;                             # collapse double spaces
$mca =~ s/^\s+|\s+$//g;                         # trim leading/trailing whitespace
$phase_form->{mpi_mca} = $mca;

The problem I ran into is that at least some of the OMPI test suite executables 
have perl special characters in their names (e.g., mpi2c++).  It looks to me like the 
goal of this block is to obtain a list of the MCA parameters on the command 
line.  Right?  If so, I think this is a safer method, and will only get mca 
params (not other random mpirun parameters that don't happen to be listed in 
the regexps above):

my @params;
my $cmd = $report->{command};
while ($cmd =~ s/\s([\-]*-mca)\s+(\S+)\s+(\S+)\s/ /) {
push(@params, "$1 $2 $3");
}
$phase_form->{mpi_mca} = join(' ', @params);

That being said, this *is* an Open MPI-specific field.  MTT was supposed to 
remain MPI-agnostic.  

Is there a reason you guys didn't use the MPI Details fields "parameters" and 
"network" for this purpose?  We have an MPI::OMPI module which is perfect for 
doing OMPI-specific things.  Using the MPI:: modules, you can keep MTT core 
clean so that we can run other MPI implementations through MTT as well.

I just committed:

https://svn.open-mpi.org/trac/mtt/changeset/1348
https://svn.open-mpi.org/trac/mtt/changeset/1349

that puts the above logic in 
MTT::Values::Functions::MPI::OMPI::find_mca_params() and converted MTTGDS and 
all the performance analyzers to call this function.

I also added a check in MTTGDS.pm to ensure that the analyzer pm has PreReport 
before trying to call it.

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [MTT devel] More GDS questions

2010-02-12 Thread Igor Ivanov




Hi Jeff,

You touched a sore point.  The App Engine forums are filled with questions like
yours.
I don't know a clear answer right now.

Igor

Jeff Squyres wrote:

  Igor et al. -- 

1. I'm not sure you saw Ethan's and my posts from the past day or so about GDS on the mtt-devel list; it just occurred to me that I don't know if you're members of the list or not.  We've posted a few questions and comments that you may not have received if you're not on the list:

http://www.open-mpi.org/community/lists/mtt-devel/2010/02/index.php

2. I'm still looking into the perl syntax error that caused my Big Submit to GDS to fail.  But looking at the Google logs, it looks like at least *some* of my test run results made it up to GDS.  There was a BIG spike in CPU usage (3.2 hours of CPU time!) when it submitted -- see the attached CPU usage graph from the apps dashboard.

Does anyone know why it takes so much CPU just to submit data to GDS?  3.2 CPU hours is a LOT!

It makes me a bit concerned that only part of a single Cisco MTT run submit chewed through almost half of our daily CPU quota (6.5 CPU hours/day).  Is there any way to reduce the amount of CPU necessary just to submit data?

  
  
  
  









Re: [MTT devel] GDS errors

2010-02-12 Thread Igor Ivanov




Look at the message string:
BadValueError: Property data_message_size is 667 bytes long; it must be 500 or
less. Consider Text instead, which can store strings of any length.

Regards,
Igor


Jeff Squyres wrote:

  Looking in the appspot dashboard, I see a bunch of errors when Cisco tried to submit test run data.  There's a few random errors, but a bunch that look like what I pasted below.  How do I diagnose this further?  Clearly, some field is too long -- how do I find out which one?

-
	• 128.107.241.170 - - [11/Feb/2010:00:51:21 -0800] "POST /client HTTP/1.1" 500 1972 - "MPI Test MTTGDS Reporter,gzip(gfe)" "open-mpi-mtt.appspot.com"
	• E02-11 12:51AM 21.241
Property data_message_size is 667 bytes long; it must be 500 or less. Consider Text instead, which can store strings of any length.
Traceback (most recent call last):
  File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 509, in __call__
handler.post(*groups)
  File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", line 961, in post
status = self._submit();
  File "/base/data/home/apps/open-mpi-mtt/1.337140739868725607/main.py", line 485, in _submit
test_run_phase.put()
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 801, in put
self._populate_internal_entity()
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 779, in _populate_internal_entity
self._entity = self._populate_entity(_entity_class=_entity_class)
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 839, in _populate_entity
self._to_entity(entity)
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1465, in _to_entity
entity[key] = value
  File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 492, in __setitem__
datastore_types.ValidateProperty(name, value)
  File "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1290, in ValidateProperty
prop_validator(name, v)
  File "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1181, in ValidatePropertyString
ValidateStringLength(name, value, max_len=_MAX_STRING_LENGTH)
  File "/base/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1171, in ValidateStringLength
(name, len(value), max_len))
BadValueError: Property data_message_size is 667 bytes long; it must be 500 or less. Consider Text instead, which can store strings of any length.
-

  









Re: [MTT devel] MTTGDS issues

2010-02-12 Thread Igor Ivanov





Jeff Squyres wrote:

  1. Can you guys describe what MTTGDS expects from the performance analyzer modules?

I ran a bunch of netpipe results and MTTGDS performance analyzer failed to run -- did you guys change the specifications for the performance analyzer modules?

*** WARNING: Could not run module
MTT::Test::Analyze::Performance::NetPipe:PreReport: Undefined
subroutine &MTT::Test::Analyze::Performance::NetPipe::PreReport called
at (eval 335838) line 1.
  

[II] It is the same thing as the "analyze_module" warning you got
before.  There is a set of values that we would like to store in the
datastore, but they do not exist in the original analyzers.  To avoid
conflicting with the original MTT procedure, a _pre_process_phase call is
made to get that info from a special function, PreReport, integrated into
the original analyzers.

  
2. I ran 24+ hours of MTT tests and the MTTGDS reporter failed to submit the results.  :-(

*** ERROR: Module aborted: MTT::Reporter::MTTGDS:Finalize: Nested
quantifiers in regex; marked by <-- HERE in m/\s[\S/\\]*mpi2c++ <--
HERE _test.*/ at /home/jsquyres/svn/mtt/lib/MTT/Reporter/MTTGDS.pm line
498.

Some of my INI section names have special characters in them (e.g., "mpi2c++"); it looks like this might be what tripped up some regexp.  I'll have a look at this one now...

Is there a way to re-submit my data to GDS?

  

[II] Yes.  You can upload the data (datafile.yaml) from its local location to
the datastore using bquery.pl --upload.  The place where the collected data is
kept can be controlled by "repository_tempdir" and "repository_dirname_prefix":
[VBench]
repository_tempdir=&scratch_root()/gds_data
repository_dirname_prefix=gds
submit_failed_results_to_gds=0








Re: [MTT devel] 500 Internal Server Error from open-mpi-mtt.appspot.com

2010-02-12 Thread Igor Ivanov




Jeff,

Yes, you are right.
I read this mail after I had already sent an answer to Ethan's original. :)
So, please see my answer there.

Igor

Jeff Squyres wrote:

  After looking through the logs, Ethan and I *think* that this was just a query that was too large (i.e., it used too much CPU, and therefore it was killed).

Can someone with a little more knowledge than us have a look at the logs and let us know if we're right?


On Feb 11, 2010, at 2:05 PM, Ethan Mallove wrote:

  
  
Hi,

I'm getting a 500 Internal Server Error using bquery.pl.  I can --ping
successfully:

  $ client/bquery.pl --ping --server=http://open-mpi-mtt.appspot.com/ --password=x --username=sun
  Ping is successful.

But an actual query gets an error:

  $ client/bquery.pl --server=http://open-mpi-mtt.appspot.com/ --password=x --username=sun --query --gqls="select * from TestRunPhase where status=1" --dir="bquery-test"
  Error at http://open-mpi-mtt.appspot.com//client
  500 Internal Server Error

-Ethan
___
mtt-devel mailing list
mtt-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel


  
  

  









Re: [MTT devel] 500 Internal Server Error from open-mpi-mtt.appspot.com

2010-02-12 Thread Igor Ivanov




Hi Ethan,

You got the 500 error because the timeout exception handling was triggered.
The full description reads: "This request used a high amount of CPU, and
was roughly 1.9 times over the average request CPU limit. High CPU
requests have a small quota, and if you exceed this quota, your app
will be temporarily disabled."

There is a set of limitations and quotas in Google App Engine
(http://code.google.com/appengine/docs/python/runtime.html#Responses).
Your request hit one of them.

Common recommendations to avoid this issue:
1. Use the --no-raw (get the data without the zip archive), --no-ref (no
references to other models/tables), and --no-limit (fetch the data in portions)
options of bquery;
for example: --gqls="select * from TestRunPhase where status=1"
--no-raw --no-ref --no-limit
2. Make the --gqls query more specific with additional conditions in the
where clause; note: the "select * from TestRunPhase where status=1" query tries
to fetch roughly 90% of all the data stored in the datastore.
3. Actively use the 'tag' property to filter out only the data you need.

Regards,
Igor



Ethan Mallove wrote:

  Hi,

I'm getting a 500 Internal Server Error using bquery.pl.  I can --ping
successfully:

  $ client/bquery.pl --ping --server=http://open-mpi-mtt.appspot.com/ --password=x --username=sun 
  Ping is successful.

But an actual query gets an error:

  $ client/bquery.pl --server=http://open-mpi-mtt.appspot.com/ --password=x --username=sun --query --gqls="select * from TestRunPhase where status=1" --dir="bquery-test"
  Error at http://open-mpi-mtt.appspot.com//client
  500 Internal Server Error

-Ethan
___
mtt-devel mailing list
mtt-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel





  


