You might want to have a look at MongoDB's find(), distinct() and
aggregate() methods for querying data.
The layout is similar, but different, and it provides some nice JSON
operators such as
{ "Experiment.Owner" : { $in : ["bob","alice", ... ] } }
and { "Experiment.Runtime" : { $gt : 100 } } etc., so that logical
operators could be incorporated.
I just wanted to point this out as a possible defined interface; it may
or may not be worth the effort to provide some subset of these
features.
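To make the idea concrete, here is a minimal sketch of how two such operators could be modelled on the Java side as composable predicates. The FilterOps class and its in()/gt() helpers are hypothetical names made up for illustration; they are neither part of Airavata nor of the MongoDB driver, and experiments are represented as plain maps only to keep the example short.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: these names do not come from Airavata or the MongoDB driver.
final class FilterOps {
    // Mirrors { field : { $in : [...] } }: the field's value must be one of the allowed values.
    static Predicate<Map<String, Object>> in(String field, List<?> allowed) {
        return record -> {
            Object value = record.get(field);
            // Guard against missing fields; immutable lists reject contains(null).
            return value != null && allowed.contains(value);
        };
    }

    // Mirrors { field : { $gt : bound } }: the field must be numeric and strictly greater.
    static Predicate<Map<String, Object>> gt(String field, double bound) {
        return record -> {
            Object value = record.get(field);
            return value instanceof Number && ((Number) value).doubleValue() > bound;
        };
    }

    public static void main(String[] args) {
        Map<String, Object> experiment = Map.of(
                "Experiment.Owner", "bob",
                "Experiment.Runtime", 250);
        // Logical AND of the two criteria, as in the JSON examples above.
        Predicate<Map<String, Object>> filter =
                in("Experiment.Owner", List.of("bob", "alice"))
                        .and(gt("Experiment.Runtime", 100));
        System.out.println(filter.test(experiment));
    }
}
```

Predicate already offers and(), or() and negate(), so the logical operators come for free in such a design.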
-E.
Saminda Wijeratne wrote:
In an offline discussion with Chathuri, we came up with a simple way
for gateway developers to specify retrieving a filtered set of
experiment data based on the requirements of the gateway user.
e.g.:
SearchQuery query =
    new SearchQuery({Experiment.Name, Experiment.Status},
        {{Experiment.Owner,"bob"},
         {Experiment.Project,"manhattan"},
         {Experiment.Created,"03-19-2014",">"}});
List<Experiment> experiments = thriftAPI.getExperiments(query);
Sample syntax:
sq = new SearchQuery(<list of fields that need to be filled>, <list
of filter criteria for the data>)
Furthermore, the SearchQuery will have the capability to specify
paging (e.g. experiments from 11 to 20).
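As a rough illustration only, the proposal could take a shape like the following in plain Java. Everything here (the Filter class, where(), page(), applyPaging(), and the use of string field names) is an assumption made for the sketch and not an agreed API; paging is taken to be 1-based and inclusive, matching the "11 to 20" example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed SearchQuery; not the actual Airavata API.
final class SearchQuery {
    // One filter criterion: a field, a comparison operator ("=", ">", "<") and a value.
    static final class Filter {
        final String field;
        final String operator;
        final String value;

        Filter(String field, String operator, String value) {
            this.field = field;
            this.operator = operator;
            this.value = value;
        }
    }

    final List<String> fields;                 // fields the server should fill in
    final List<Filter> filters = new ArrayList<>();
    int offset = 0;                            // zero-based index of the first result
    int limit = Integer.MAX_VALUE;             // maximum number of results

    SearchQuery(String... fields) {
        this.fields = List.of(fields);
    }

    SearchQuery where(String field, String operator, String value) {
        filters.add(new Filter(field, operator, value));
        return this;
    }

    // Requests results first..last, 1-based and inclusive (e.g. 11 to 20).
    SearchQuery page(int first, int last) {
        if (first < 1 || last < first) {
            throw new IllegalArgumentException("invalid page range");
        }
        this.offset = first - 1;
        this.limit = last - first + 1;
        return this;
    }

    // Applies only the paging window to an already filtered result list.
    <T> List<T> applyPaging(List<T> results) {
        int from = Math.min(offset, results.size());
        int to = (int) Math.min((long) from + limit, results.size());
        return results.subList(from, to);
    }

    public static void main(String[] args) {
        SearchQuery query = new SearchQuery("Experiment.Name", "Experiment.Status")
                .where("Experiment.Owner", "=", "bob")
                .where("Experiment.Project", "=", "manhattan")
                .where("Experiment.Created", ">", "03-19-2014")
                .page(11, 20);
        List<Integer> all = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            all.add(i);
        }
        System.out.println(query.applyPaging(all)); // the 11th through 20th entries
    }
}
```

In a real implementation the filters and the paging window would be evaluated on the server side, so that only the requested page travels over the wire.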
wdyt?
Saminda
On Tue, Mar 18, 2014 at 3:04 PM, Lahiru Gunathilake <[email protected]> wrote:
On Tue, Mar 18, 2014 at 2:55 PM, Saminda Wijeratne <[email protected]> wrote:
For performance reasons, a gateway should request only a subset
of an experiment's data from the Airavata server when compiling a
summary view of the experiment for the scientist. Based on my
current experience, I feel the following data is required to
compile a general summary.
- Exp ID/Name
- Status
- Project
- Owner/Creation time
+1. We can show minimum data and give a detailed view on demand. But
I think we need to support experiment search based on some
criteria and develop an index for each search criterion, because if
I have run jobs for 6 months I would never want to get all my
experiments, even though we make it super fast with minimum data.
e.g. I want to search for the experiments I ran last week, or use some
text-based search.
We can use the above solution Saminda suggested in searching too.
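A minimal sketch of what an index on one search criterion (creation time) could look like, here as an in-memory sorted map. The CreationTimeIndex class and its methods are hypothetical; a real registry would presumably rely on a database index rather than a structure like this.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory index; a real registry would likely use a database index instead.
final class CreationTimeIndex {
    // Sorted map from creation time to the IDs of experiments created at that instant.
    private final TreeMap<Instant, List<String>> byCreated = new TreeMap<>();

    void add(String experimentId, Instant created) {
        byCreated.computeIfAbsent(created, k -> new ArrayList<>()).add(experimentId);
    }

    // Returns the IDs of experiments created at or after the cutoff, oldest first.
    List<String> createdSince(Instant cutoff) {
        List<String> ids = new ArrayList<>();
        // tailMap walks only the matching part of the sorted map, not every record.
        for (List<String> bucket : byCreated.tailMap(cutoff, true).values()) {
            ids.addAll(bucket);
        }
        return ids;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2014-03-19T00:00:00Z");
        CreationTimeIndex index = new CreationTimeIndex();
        index.add("exp-old", now.minus(180, ChronoUnit.DAYS));
        index.add("exp-recent", now.minus(2, ChronoUnit.DAYS));
        // "Experiments I ran last week": everything created in the past 7 days.
        System.out.println(index.createdSince(now.minus(7, ChronoUnit.DAYS)));
    }
}
```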
Lahiru
We have been seeing a direct relationship between the number of
experiment data records and the turnaround time. Thus we may
need some paging when requesting the experiment data.
wdyt? Your thoughts are welcome.
(Using JIRA [1] to track the status of this task)
A detailed discussion on the topic is on the Architecture
mailing list [2].
Regards,
Saminda
1. https://issues.apache.org/jira/browse/AIRAVATA-995
2. http://www.mail-archive.com/[email protected]/msg00080.html
--
System Analyst Programmer
PTI Lab
Indiana University