[jira] [Commented] (SDAP-47) Update NEXUS CLI to support datainbounds algorithm

2018-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SDAP-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412387#comment-16412387
 ] 

ASF GitHub Bot commented on SDAP-47:


ntquach closed pull request #12: SDAP-47 Update NEXUS CLI to support 
datainbounds algorithm
URL: https://github.com/apache/incubator-sdap-nexus/pull/12
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/client/nexuscli/__init__.py b/client/nexuscli/__init__.py
index d6deec9..250b474 100644
--- a/client/nexuscli/__init__.py
+++ b/client/nexuscli/__init__.py
@@ -18,3 +18,4 @@
 from nexuscli.nexuscli import time_series
 from nexuscli.nexuscli import dataset_list
 from nexuscli.nexuscli import daily_difference_average
+from nexuscli.nexuscli import subset
diff --git a/client/nexuscli/nexuscli.py b/client/nexuscli/nexuscli.py
index 9bf51e6..b542dba 100644
--- a/client/nexuscli/nexuscli.py
+++ b/client/nexuscli/nexuscli.py
@@ -43,6 +43,15 @@
 __pdoc__['TimeSeries.minimum'] = "`numpy` array containing minimums"
 __pdoc__['TimeSeries.maximum'] = "`numpy` array containing maximums"
 
+Point = namedtuple('Point', ('time', 'latitude', 'longitude', 'variable'))
+Point.__doc__ = '''\
+An object containing Point attributes.
+'''
+__pdoc__['Point.time'] = "time value as `datetime` object"
+__pdoc__['Point.latitude'] = "latitude value"
+__pdoc__['Point.longitude'] = "longitude value"
+__pdoc__['Point.variable'] = "dictionary of variable values"
+
 ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
 
 target = 'http://localhost:8083'
@@ -207,3 +216,57 @@ def time_series(datasets, bounding_box, start_datetime, end_datetime, spark=Fals
         )
 
     return time_series_result
+
+
+def subset(dataset, bounding_box, start_datetime, end_datetime, parameter, metadata_filter):
+    """
+    Fetches point values for a given dataset and geographical area or metadata criteria and time range.
+
+    __dataset__ Name of the dataset as a String  
+    __bounding_box__ Bounding box for area of interest as a `shapely.geometry.polygon.Polygon`  
+    __start_datetime__ Start time as a `datetime.datetime`  
+    __end_datetime__ End time as a `datetime.datetime`  
+    __parameter__ The parameter of interest. One of 'sst', 'sss', 'wind' or None  
+    __metadata_filter__ List of key:value String metadata criteria  
+
+    __return__ List of `nexuscli.nexuscli.Point` namedtuples
+    """
+    url = "{}/datainbounds?".format(target)
+
+    params = {
+        'ds': dataset,
+        'startTime': start_datetime.strftime(ISO_FORMAT),
+        'endTime': end_datetime.strftime(ISO_FORMAT),
+        'parameter': parameter,
+    }
+    if bounding_box:
+        params['b'] = ','.join(str(b) for b in bounding_box.bounds)
+    else:
+        if metadata_filter and len(metadata_filter) > 0:
+            params['metadataFilter'] = metadata_filter
+
+    response = session.get(url, params=params)
+    response.raise_for_status()
+    response = response.json()
+
+    data = np.array(response['data']).flatten()
+
+    assert len(data) > 0, "No data found in {} between {} and {} for Datasets {}.".format(bounding_box.wkt if bounding_box is not None else metadata_filter,
+            start_datetime.strftime(
+                ISO_FORMAT),
+            end_datetime.strftime(
+                ISO_FORMAT),
+            dataset)
+
+    subset_result = []
+    for d in data:
+        subset_result.append(
+            Point(
+                time=datetime.utcfromtimestamp(d['time']).replace(tzinfo=UTC),
+                longitude=d['longitude'],
+                latitude=d['latitude'],
+                variable=d['data'][0]
+            )
+        )
+
+    return subset_result
diff --git a/client/nexuscli/test/nexuscli_test.py b/client/nexuscli/test/nexuscli_test.py
index d202a3f..61a7e46 100644
--- a/client/nexuscli/test/nexuscli_test.py
+++ b/client/nexuscli/test/nexuscli_test.py
@@ -39,3 +39,13 @@ def test_daily_difference_average(self):
datetime(2013, 1, 1), 
datetime(2014, 12, 31))
 
 self.assertEqual(1, len(ts))
+
+def test_data_in_bounds_with_metadata_filter(self):
+subset = nexuscli.subset("MUR-JPL-L4-GLOB-v4.1", None, datetime(2018, 
1, 1), datetime(2018, 1, 2),
+ None, 

[jira] [Commented] (SDAP-47) Update NEXUS CLI to support datainbounds algorithm

2018-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SDAP-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412345#comment-16412345
 ] 

ASF GitHub Bot commented on SDAP-47:


fgreg commented on a change in pull request #12: SDAP-47 Update NEXUS CLI to 
support datainbounds algorithm
URL: 
https://github.com/apache/incubator-sdap-nexus/pull/12#discussion_r176896255
 
 

 ##
 File path: client/nexuscli/nexuscli.py
 ##
 @@ -230,7 +229,7 @@ def data_in_bounds(dataset, bounding_box, start_datetime, end_datetime, paramete
      __parameter__ Name of the dataset as a String  
 
 Review comment:
   Update description of the `parameter` parameter


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update NEXUS CLI to support datainbounds algorithm
> --
>
> Key: SDAP-47
> URL: https://issues.apache.org/jira/browse/SDAP-47
> Project: Apache Science Data Analytics Platform
>  Issue Type: New Feature
>  Components: nexus
>Reporter: Nga Chung
>Assignee: Nga Chung
>Priority: Major
>
> Update the NEXUS CLI to support call to datainbounds algorithm



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SDAP-47) Update NEXUS CLI to support datainbounds algorithm

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SDAP-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399519#comment-16399519
 ] 

ASF GitHub Bot commented on SDAP-47:


fgreg commented on a change in pull request #12: SDAP-47 Update NEXUS CLI to 
support datainbounds algorithm
URL: 
https://github.com/apache/incubator-sdap-nexus/pull/12#discussion_r174621731
 
 

 ##
 File path: client/nexuscli/nexuscli.py
 ##
 @@ -207,3 +217,59 @@ def time_series(datasets, bounding_box, start_datetime, end_datetime, spark=Fals
         )
 
     return time_series_result
+
+
+def data_in_bounds(dataset, bounding_box, start_datetime, end_datetime, parameter, metadata_filter):
+    """
+    Fetches point values for a given dataset and geographical area or metadata criteria and time range.
+
+    __dataset__ Name of the dataset as a String  
+    __bounding_box__ Bounding box for area of interest as a `shapely.geometry.polygon.Polygon`  
+    __start_datetime__ Start time as a `datetime.datetime`  
+    __end_datetime__ End time as a `datetime.datetime`  
+    __parameter__ Name of the dataset as a String  
+    __metadata_filter__ List of key:value String metadata criteria  
+
+    __return__ List of `nexuscli.nexuscli.TimeSeries` namedtuples
+    """
+    url = "{}/datainbounds?".format(target)
+
+    params = {
+        'ds': dataset,
+        'startTime': start_datetime.strftime(ISO_FORMAT),
+        'endTime': end_datetime.strftime(ISO_FORMAT),
+        'parameter': parameter,
+        'metadataFilter': metadata_filter,
+    }
+    if bounding_box:
+        params['b'] = ','.join(str(b) for b in bounding_box.bounds)
+
+    response = session.get(url, params=params)
+    response.raise_for_status()
+    response = response.json()
+
+    data = np.array(response['data']).flatten()
+
+    assert len(data) > 0, "No data found in {} between {} and {} for Datasets {}.".format(bounding_box.wkt if bounding_box is not None else metadata_filter,
 
 Review comment:
   It looks like the logic as implemented allows three cases:
   
   1. `bounding_box` only
   2. `metadataFilter` only
   3. `bounding_box` and `metadataFilter` together
   
   But this assertion message only accounts for cases 1 and 2.
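
   One way to cover all three cases in the message is to describe whichever
   criteria were actually supplied. The sketch below is not taken from the
   pull request: `describe_criteria` and `FakeBox` are hypothetical names, and
   `FakeBox` only stands in for a shapely geometry by exposing a `wkt`
   attribute.

```python
from collections import namedtuple


def describe_criteria(bounding_box=None, metadata_filter=None):
    """Return a readable description of the search criteria in use.

    `bounding_box` is any object with a `wkt` attribute (for example a
    shapely Polygon); `metadata_filter` is a list of "key:value" strings.
    Both names mirror the function under review; the helper itself is
    hypothetical.
    """
    parts = []
    if bounding_box is not None:
        parts.append("bounding box {}".format(bounding_box.wkt))
    if metadata_filter:
        parts.append("metadata filter {}".format(", ".join(metadata_filter)))
    if not parts:
        raise ValueError("Provide a bounding_box, a metadata_filter, or both.")
    return " and ".join(parts)


# Stand-in for a shapely geometry so the sketch runs without shapely installed.
FakeBox = namedtuple("FakeBox", ["wkt"])
box = FakeBox(wkt="POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))")

print(describe_criteria(bounding_box=box))
print(describe_criteria(metadata_filter=["key:value"]))
print(describe_criteria(bounding_box=box, metadata_filter=["key:value"]))
```

   The returned string could then be passed as the first argument to the
   existing `format` call, so the same message template serves all three
   cases.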




[jira] [Commented] (SDAP-47) Update NEXUS CLI to support datainbounds algorithm

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SDAP-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399517#comment-16399517
 ] 

ASF GitHub Bot commented on SDAP-47:


fgreg commented on a change in pull request #12: SDAP-47 Update NEXUS CLI to 
support datainbounds algorithm
URL: 
https://github.com/apache/incubator-sdap-nexus/pull/12#discussion_r174620542
 
 

 ##
 File path: client/nexuscli/nexuscli.py
 ##
 @@ -207,3 +217,59 @@ def time_series(datasets, bounding_box, start_datetime, end_datetime, spark=Fals
         )
 
     return time_series_result
+
+
+def data_in_bounds(dataset, bounding_box, start_datetime, end_datetime, parameter, metadata_filter):
 
 Review comment:
   Should we just name the method `subset`?




[jira] [Commented] (SDAP-47) Update NEXUS CLI to support datainbounds algorithm

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SDAP-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399518#comment-16399518
 ] 

ASF GitHub Bot commented on SDAP-47:


fgreg commented on a change in pull request #12: SDAP-47 Update NEXUS CLI to 
support datainbounds algorithm
URL: 
https://github.com/apache/incubator-sdap-nexus/pull/12#discussion_r174620404
 
 

 ##
 File path: client/nexuscli/nexuscli.py
 ##
 @@ -207,3 +217,59 @@ def time_series(datasets, bounding_box, start_datetime, end_datetime, spark=Fals
         )
 
     return time_series_result
+
+
+def data_in_bounds(dataset, bounding_box, start_datetime, end_datetime, parameter, metadata_filter):
+    """
+    Fetches point values for a given dataset and geographical area or metadata criteria and time range.
+
+    __dataset__ Name of the dataset as a String  
+    __bounding_box__ Bounding box for area of interest as a `shapely.geometry.polygon.Polygon`  
+    __start_datetime__ Start time as a `datetime.datetime`  
+    __end_datetime__ End time as a `datetime.datetime`  
+    __parameter__ Name of the dataset as a String  
+    __metadata_filter__ List of key:value String metadata criteria  
+
+    __return__ List of `nexuscli.nexuscli.TimeSeries` namedtuples
 
 Review comment:
   Returns list of `Subset` namedtuples




[jira] [Commented] (SDAP-47) Update NEXUS CLI to support datainbounds algorithm

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SDAP-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399516#comment-16399516
 ] 

ASF GitHub Bot commented on SDAP-47:


fgreg commented on a change in pull request #12: SDAP-47 Update NEXUS CLI to 
support datainbounds algorithm
URL: 
https://github.com/apache/incubator-sdap-nexus/pull/12#discussion_r174625255
 
 

 ##
 File path: client/nexuscli/nexuscli.py
 ##
 @@ -207,3 +217,59 @@ def time_series(datasets, bounding_box, start_datetime, end_datetime, spark=Fals
         )
 
     return time_series_result
+
+
+def data_in_bounds(dataset, bounding_box, start_datetime, end_datetime, parameter, metadata_filter):
+    """
+    Fetches point values for a given dataset and geographical area or metadata criteria and time range.
+
+    __dataset__ Name of the dataset as a String  
+    __bounding_box__ Bounding box for area of interest as a `shapely.geometry.polygon.Polygon`  
+    __start_datetime__ Start time as a `datetime.datetime`  
+    __end_datetime__ End time as a `datetime.datetime`  
+    __parameter__ Name of the dataset as a String  
+    __metadata_filter__ List of key:value String metadata criteria  
+
+    __return__ List of `nexuscli.nexuscli.TimeSeries` namedtuples
+    """
+    url = "{}/datainbounds?".format(target)
+
+    params = {
+        'ds': dataset,
+        'startTime': start_datetime.strftime(ISO_FORMAT),
+        'endTime': end_datetime.strftime(ISO_FORMAT),
+        'parameter': parameter,
+        'metadataFilter': metadata_filter,
+    }
+    if bounding_box:
+        params['b'] = ','.join(str(b) for b in bounding_box.bounds)
+
+    response = session.get(url, params=params)
+    response.raise_for_status()
+    response = response.json()
+
+    data = np.array(response['data']).flatten()
+
+    assert len(data) > 0, "No data found in {} between {} and {} for Datasets {}.".format(bounding_box.wkt if bounding_box is not None else metadata_filter,
+            start_datetime.strftime(
+                ISO_FORMAT),
+            end_datetime.strftime(
+                ISO_FORMAT),
+            dataset)
+
+    variable_values = {}
+    for variable in data[0]['data'][0].keys():
+        variable_values[variable] = np.array([d['data'][0][variable] for d in data])
+
+    subset_result = []
+    subset_result.append(
 
 Review comment:
   This doesn't seem quite right. I think the result of a subset should be a
   list of points. Arranging the time/data into np arrays worked for the Time
   Series code because a time series has one value per timestep, but I don't
   think you can rely on that here. Thoughts?
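
   A minimal sketch of the list-of-points shape suggested here, assuming each
   record carries `time` (epoch seconds), `latitude`, `longitude` and a
   one-element `data` list, as the code under review reads them. The
   `to_points` helper and the sample values are made up for illustration, and
   `Point` is a local namedtuple defined only for this example.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Local, illustrative record type: one entry per returned point.
Point = namedtuple("Point", ("time", "latitude", "longitude", "variable"))


def to_points(records):
    """Convert raw records into a flat list of Point namedtuples."""
    return [
        Point(
            time=datetime.fromtimestamp(r["time"], tz=timezone.utc),
            latitude=r["latitude"],
            longitude=r["longitude"],
            variable=r["data"][0],
        )
        for r in records
    ]


# Made-up records: the first two share a timestep but differ in location,
# which a one-value-per-timestep array layout would not capture.
sample = [
    {"time": 1514764800, "latitude": 10.0, "longitude": 20.0, "data": [{"sst": 290.1}]},
    {"time": 1514764800, "latitude": 10.5, "longitude": 20.0, "data": [{"sst": 290.4}]},
    {"time": 1514851200, "latitude": 10.0, "longitude": 20.0, "data": [{"sst": 289.9}]},
]

points = to_points(sample)
for p in points:
    print(p.time.isoformat(), p.latitude, p.longitude, p.variable)
```

   Keeping one namedtuple per point means several points can share a timestep
   without any assumption about how many values each timestep holds.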

