Re: CLIMATE-744 Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson
I've stripped it down to a very basic curl command and run it on a
variety of hosts.  There's no header or footer in the file that would
contribute to the discrepancy.

Running the following 3 times:

curl "https://rcmes.jpl.nasa.gov/query-api/query.php?datasetId=3=36=-45.76=42.24=-24.64=60.28=19900101TZ=20071231TZ" > curl.1

On my Mac:

ls -l curl.*

-rw-r--r--  1 michaelanderson  staff  419156308 Jan 27 12:48 curl.1
-rw-r--r--  1 michaelanderson  staff  432335472 Jan 27 12:54 curl.2
-rw-r--r--  1 michaelanderson  staff  422500568 Jan 27 12:59 curl.3

On a pretty large AWS with RHEL 7:

-rw-rw-r--. 1 ec2-user ec2-user 546429902 Jan 27 18:12 curl.1
-rw-rw-r--. 1 ec2-user ec2-user 578373399 Jan 27 18:15 curl.2
-rw-rw-r--. 1 ec2-user ec2-user 484616136 Jan 27 18:17 curl.3

On an AWS with 16 cpu and 122 GB RAM on RHEL 7:

-rw-rw-r--. 1 ec2-user ec2-user 659658707 Jan 27 18:30 curl.1
-rw-rw-r--. 1 ec2-user ec2-user 671670784 Jan 27 18:32 curl.2
-rw-rw-r--. 1 ec2-user ec2-user 686702730 Jan 27 18:34 curl.3

On an AWS with 64 CPU, 488 GB RAM and 25GB network on RHEL 7:

-rw-rw-r--. 1 ec2-user ec2-user 675561581 Jan 27 18:41 curl.1
-rw-rw-r--. 1 ec2-user ec2-user 688015459 Jan 27 18:43 curl.2
-rw-rw-r--. 1 ec2-user ec2-user 633800036 Jan 27 18:45 curl.3
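
One way to narrow this down further would be to check whether the shorter
downloads are simply truncated copies of the longer ones. The following is a
minimal sketch, not something that was run for this thread; the helper names
common_prefix_length and is_truncation_of are hypothetical and not part of ocw.

```python
import os


def common_prefix_length(path_a, path_b, chunk_size=1 << 20):
    """Return the number of leading bytes that two files have in common."""
    matched = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                # One of the files is exhausted.
                return matched
            limit = min(len(a), len(b))
            if a[:limit] != b[:limit]:
                # Locate the first differing byte inside this chunk.
                for i in range(limit):
                    if a[i] != b[i]:
                        return matched + i
            matched += limit
            if len(a) != len(b):
                # The shorter chunk marks the end of one file.
                return matched


def is_truncation_of(short_path, long_path):
    """True if short_path is byte-for-byte a prefix of long_path."""
    short_size = os.path.getsize(short_path)
    return (short_size <= os.path.getsize(long_path)
            and common_prefix_length(short_path, long_path) == short_size)
```

For example, is_truncation_of("curl.1", "curl.2") being True would suggest the
transfer is being cut short rather than the server returning different content.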


On Thu, Jan 25, 2018 at 12:04 AM, lewis john mcgibbney 
wrote:

> Hi Michael,
> To answer your final question, I have absolutely no idea. I've forwarded
> this to the RCMES team however and will hopefully have a response for you
> sometime soon.
> Lewis
>
> On Mon, Jan 22, 2018 at 8:32 PM, 
> wrote:
>
> > From: Michael Anderson 
> > To: dev@climate.apache.org
> > Cc:
> > Bcc:
> > Date: Sun, 21 Jan 2018 19:30:25 -0500
> > Subject: CLIMATE-744 Cannot load TRMM data from RCMED
> > I was working on this JIRA: CLIMATE-744. A call to RCMED for TRMM
> > throws an error about not being able to reshape the data.
> >
> > I wrote the following based on the info given in the JIRA:
> >
> > from datetime import datetime as dt
> > import ocw.data_source.rcmed as rcmed
> > start_time = dt.strptime('1990-01-01', '%Y-%m-%d')
> > end_time = dt.strptime('2007-12-31', '%Y-%m-%d')
> > TRMM = rcmed.parameter_dataset(3, 36, -45.76, 42.24, -24.64, 60.28,
> >                                start_time, end_time)
> >
> > In rcmed.py _get_data, the call to urlopen returned a different amount
> > of data on each run of the program.
> >
> > I repeatedly ran the following URL in my browser and each run also
> > returned a different amount of data.
> >
> > https://rcmes.jpl.nasa.gov/query-api/query.php?datasetId=
> > 3=36=-45.76=42.24=-24.64&
> > lonMax=60.28=19900101TZ=20071231TZ
> >
> > When I constrain the dates to a smaller range, one year, the same number
> > of results are returned consistently with each run of the program.
> >
> > Does RCMED, by any chance, have any constraints on the amount of data
> > that can be returned?
> >
> >
>
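
Given the observation quoted above that a one-year range returns a consistent
number of results, one possible workaround is to split the 1990-2007 range
into yearly windows and request each one separately. The sketch below only
builds the windows. yearly_windows is a hypothetical helper, not part of ocw,
and treating each end date as inclusive on the RCMED side is an assumption.

```python
from datetime import datetime


def yearly_windows(start, end):
    """Split [start, end] into consecutive windows of at most one calendar year."""
    windows = []
    window_start = start
    while window_start <= end:
        # Each window stops at 31 December of its year, or at the
        # overall end date if that comes first.
        year_end = datetime(window_start.year, 12, 31)
        windows.append((window_start, min(year_end, end)))
        window_start = datetime(window_start.year + 1, 1, 1)
    return windows


windows = yearly_windows(datetime(1990, 1, 1), datetime(2007, 12, 31))
print(len(windows))  # prints 18
```

Each (start, end) pair could then be passed as start_time and end_time to a
separate request. Whether the per-year results can be combined cleanly
afterwards has not been checked here.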


[jira] [Comment Edited] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342258#comment-16342258
 ] 

Michael Anderson edited comment on CLIMATE-744 at 1/27/18 6:46 PM:
---

I ran the same curl command on an AWS with 64 CPU, 488 GB RAM and 25GB network 
on RHEL 7:

[ec2-user@ip-172-31-16-17 ~]$ ls -l
total 1950572
-rw-rw-r--. 1 ec2-user ec2-user 675561581 Jan 27 18:41 curl.1
-rw-rw-r--. 1 ec2-user ec2-user 688015459 Jan 27 18:43 curl.2
-rw-rw-r--. 1 ec2-user ec2-user 633800036 Jan 27 18:45 curl.3



> Cannot load TRMM data from RCMED
>
>
>                 Key: CLIMATE-744
>                 URL: https://issues.apache.org/jira/browse/CLIMATE-744
>             Project: Apache Open Climate Workbench
>          Issue Type: Bug
>            Reporter: Huikyo Lee
>            Assignee: Michael Anderson
>            Priority: Blocker
>             Fix For: 1.3.0
>
>
> For some reason, rcmed.parameter_dataset causes errors when loading TRMM
> precipitation data.
> start_time = 1990-01-01
> end_time = 2007-12-31
> rcmed.parameter_dataset(3, 36, -45.76, 42.24, -24.64, 60.28, start_time,
> end_time)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342258#comment-16342258
 ] 

Michael Anderson commented on CLIMATE-744:
--

I ran the same curl command on an AWS with 64 CPU, 488 GB RAM and 25GB network:

[ec2-user@ip-172-31-16-17 ~]$ ls -l
total 1950572
-rw-rw-r--. 1 ec2-user ec2-user 675561581 Jan 27 18:41 curl.1
-rw-rw-r--. 1 ec2-user ec2-user 688015459 Jan 27 18:43 curl.2
-rw-rw-r--. 1 ec2-user ec2-user 633800036 Jan 27 18:45 curl.3



[jira] [Commented] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342257#comment-16342257
 ] 

Michael Anderson commented on CLIMATE-744:
--

I ran the same curl command on an AWS with 16 cpu and 122 GB RAM on RHEL 7.  
Here are the results:

 

[ec2-user@ip-172-31-2-252 ~]$ ls -l
total 1970744
-rw-rw-r--. 1 ec2-user ec2-user 659658707 Jan 27 18:30 curl.1
-rw-rw-r--. 1 ec2-user ec2-user 671670784 Jan 27 18:32 curl.2
-rw-rw-r--. 1 ec2-user ec2-user 686702730 Jan 27 18:34 curl.3



[jira] [Commented] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342249#comment-16342249
 ] 

Michael Anderson commented on CLIMATE-744:
--

Ran the same three curl commands on a pretty large AWS host with RHEL 7:

 

[ec2-user@ip-172-31-53-246 ~]$ ls -l
total 1571704
-rw-rw-r--. 1 ec2-user ec2-user 546429902 Jan 27 18:12 curl.1
-rw-rw-r--. 1 ec2-user ec2-user 578373399 Jan 27 18:15 curl.2
-rw-rw-r--. 1 ec2-user ec2-user 484616136 Jan 27 18:17 curl.3



[jira] [Commented] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342245#comment-16342245
 ] 

Michael Anderson commented on CLIMATE-744:
--

Running this on my mac:

 

curl "https://rcmes.jpl.nasa.gov/query-api/query.php?datasetId=3=36=-45.76=42.24=-24.64=60.28=19900101TZ=20071231TZ" > curl.1
curl "https://rcmes.jpl.nasa.gov/query-api/query.php?datasetId=3=36=-45.76=42.24=-24.64=60.28=19900101TZ=20071231TZ" > curl.2
curl "https://rcmes.jpl.nasa.gov/query-api/query.php?datasetId=3=36=-45.76=42.24=-24.64=60.28=19900101TZ=20071231TZ" > curl.3

ls -l curl.*
-rw-r--r--  1 michaelanderson  staff  419156308 Jan 27 12:48 curl.1
-rw-r--r--  1 michaelanderson  staff  432335472 Jan 27 12:54 curl.2
-rw-r--r--  1 michaelanderson  staff  422500568 Jan 27 12:59 curl.3

 



[jira] [Commented] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342234#comment-16342234
 ] 

Michael Anderson commented on CLIMATE-744:
--

I changed my local copy of rcmed.py as follows:

chunk = string.read()
while chunk:
    data_string += chunk
    chunk = string.read()

Each run of the program brings back a different amount of data consistent with 
my previous comments.



[jira] [Issue Comment Deleted] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Anderson updated CLIMATE-744:
-
Comment: was deleted

(was: [~lewi...@apache.org] any feedback from the RCMED team?)



[jira] [Commented] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342218#comment-16342218
 ] 

Michael Anderson commented on CLIMATE-744:
--

I changed my local copy of rcmed.py as follows:

string = urlopen(url)
data_string = ''
while True:
    data = string.read(1024)
    if data:
        data_string += data
    else:
        break

Each run of the program brings back a different amount of data consistent with 
my previous comments.
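
Since changing the read loop makes no difference, it may help to compare the
number of bytes actually received with what the server declares. The following
is a minimal sketch under stated assumptions: it is written for Python 3,
read_all and is_complete are hypothetical helpers that are not part of
rcmed.py, and it is not known whether the RCMED server sends a Content-Length
header for this query at all.

```python
import io


def read_all(response, chunk_size=1024 * 1024):
    """Read a response to the end and return (body, declared_length).

    declared_length is the Content-Length header as an int, or None if
    the server did not send one.
    """
    buffer = io.BytesIO()
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        buffer.write(chunk)
    declared = response.headers.get("Content-Length")
    return buffer.getvalue(), (int(declared) if declared is not None else None)


def is_complete(body, declared_length):
    """Return True or False, or None when no length was declared."""
    if declared_length is None:
        return None
    return len(body) == declared_length


# Possible usage (not run here):
#     from urllib.request import urlopen
#     with urlopen(url) as response:
#         body, declared = read_all(response)
#     print(len(body), declared, is_complete(body, declared))
```

If the declared length is stable across runs while the received byte count is
not, that would point at the transfer being cut off rather than at the query
itself returning different results.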



[jira] [Commented] (CLIMATE-744) Cannot load TRMM data from RCMED

2018-01-27 Thread Michael Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/CLIMATE-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342217#comment-16342217
 ] 

Michael Anderson commented on CLIMATE-744:
--

[~lewi...@apache.org] any feedback from the RCMED team?



Re: Dropping support for Python 2.7

2018-01-27 Thread Goodman, Alexander (398K)
Hi Michael,

We actually discussed this in great detail today internally at JPL. I have
not hosted the code anywhere yet, but am planning on putting it up on
GitHub fairly soon once I have a more complete prototype and end-to-end
example. When it's ready, I'll post a more formal announcement on the
mailing list.

Thanks,
Alex

On Fri, Jan 26, 2018 at 5:45 PM, Michael Anderson <
michael.arthur.ander...@gmail.com> wrote:

> Alex,
>
> Have you posted the pandas version anywhere yet?  I'd be happy to lend a
> hand with the conversion if you have.
>
> Michael A. Anderson
>
> On Thu, Jan 18, 2018 at 10:17 AM, Goodman, Alexander (398K) <
> alexander.good...@jpl.nasa.gov> wrote:
>
> > Hi Michael,
> >
> > This isn't something we have made any hard decisions about, but if you
> > asked for my personal impression, I think just about everything outside
> > the core ocw library is more or less in a deprecated state. The original
> > developers of most of the code in these subfolders stopped contributing
> > to the project long ago, and maintaining them has not been a high
> > priority for the current developers. This sort of includes the UI too,
> > since pretty much all of its features can be easily replicated (and much
> > more easily maintained) inside a Jupyter notebook or even a Bokeh server
> > (if you are familiar with those libraries), as that avoids the headaches
> > of directly dealing with Node.js dependencies. This has actually been
> > the subject of (mostly unsuccessful) proposals. As with my current
> > experiment with pandas and xarray, anyone is of course free to work on
> > something like this, given that we are an open source project. It's just
> > that for now, the rest of us have not had the time (or funding) to fully
> > pursue it.
> >
> > So in regards to your question, the only thing external to the main ocw
> > library which we still actively maintain is the RCMES subfolder and the
> > configuration file system it contains. But given that RCMES itself is
> > directly tied to JPL, this discussion has actually led me to believe
> > that it might be best for us to consider trimming down the main
> > repository to only include the main library and examples, while keeping
> > everything RCMES-related (including issue tracking) as a separate set of
> > repositories. This is something that we should discuss some more in
> > person during our planned dev meeting (@Kyo and Lewis).
> >
> > By the way, thank you very much for contributing to this discussion.
> > Regrettably this dev mailing list in particular has been very quiet
> > lately but at least now we have had an opportunity to air out some
> > important concerns regarding the future of the project.
> >
> > Thanks,
> > Alex
> >
> > On Wed, Jan 17, 2018 at 2:49 AM, Michael Anderson <
> > michael.arthur.ander...@gmail.com> wrote:
> >
> > > On a related topic, what's the roadmap for the features outside of
> > > main ocw folder?  Which of those will fold into the main ocw library,
> > > which will remain as they are and which will be deprecated?
> > >
> > >
> > > On Tue, Jan 16, 2018 at 11:33 PM, Goodman, Alexander (398K) <
> > > alexander.good...@jpl.nasa.gov> wrote:
> > >
> > > > Hi Lewis,
> > > >
> > > > I think that is sort of a hasty action to take. If anything, I
> > > > question whether we should even require pydap as a dependency in the
> > > > first place since its main advantage (when used as a client-side
> > > > tool, anyway) is to provide the ability to load OpenDAP datasets
> > > > without the netcdf4 library (which is a core dependency to ocw and
> > > > pretty much any other earth science data analysis library out
> > > > there). In the short term I think we could modify the conda recipe
> > > > to drop pydap for python 2.7, as that should fix our CI problems.
> > > >
> > > > However in the long term we will have no choice but to drop support
> > > > for python 2, as most of the pydata stack (ie, almost all of our
> > > > core dependencies) will soon be doing so:
> > > >
> > > > http://www.python3statement.org/
> > > >
> > > > This was actually something I was going to discuss with you and Kyo
> > > > in my planned meeting next week, as I have considered releasing the
> > > > new version of ocw I am developing as a python 3-only codebase. In
> > > > terms of our current dependencies, only esgf-pyclient does not
> > > > support python 3 (but this should change soon enough).
> > > >
> > > > My one concern is that the scientific community in general has been
> > > > resistant to switching (don't want to link anything here, since a
> > > > quick google search will yield numerous blog posts on the subject).
> > > > Having been heavily involved in the process of making ocw python 3
> > > > compatible, I personally don't blame them since for those users
> > > > python 3 just adds needless compatibility issues without offering
> > > > them a whole lot of