Re: [External] Data Catalog into its own repo

2020-11-10 Thread Pierce, Marlon
Schema.org has a lot of uptake in the NSF EarthCube community 
(https://www.earthcube.org/p418) and the DataONE activity 
(https://ui.adsabs.harvard.edu/abs/2018AGUFMIN31B..29M/abstract).   

 

Marlon



Re: [External] Data Catalog into its own repo

2020-11-10 Thread Suresh Marru
Also, here is a pointer to a JSON-LD Java implementation we could consider: 
https://github.com/jsonld-java/jsonld-java 


Suresh




Re: [External] Data Catalog into its own repo

2020-11-10 Thread Suresh Marru
Thank you all for weighing in. I bootstrapped the repo with some basic 
information; please help set the goals for this refactored subsystem: 
https://github.com/apache/airavata-data-lake 


While surveying the literature and software for open source metadata and 
provenance systems we could integrate with, I found this survey paper useful: 
https://www.researchgate.net/profile/Carlos_Saenz-Adan/publication/323242431_A_systematic_review_of_provenance_systems/links/5b34ae1caca2720785effb1a/A-systematic-review-of-provenance-systems.pdf


It seems we can build fairly flexible yet sophisticated capabilities 
using the schema.org JSON-LD schema (https://schema.org/). Thoughts?
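
To make the schema.org JSON-LD idea concrete, here is a minimal sketch (Python, 
standard library only) of how a single data product might be described as a 
schema.org Dataset. The function name, the field values, and the example.org 
URL are hypothetical and are not taken from the Airavata code base. Mapping a 
replica location to a DataDownload entry is also an assumption. Only the 
schema.org terms themselves (Dataset, DataDownload, distribution, contentUrl, 
encodingFormat) are existing vocabulary.

```python
import json


def make_dataset_record(name, description, content_url, encoding_format):
    """Build a schema.org Dataset description as a JSON-LD dictionary.

    Hypothetical helper for illustration; not part of any Airavata API.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Assumption: one replica location maps to one DataDownload entry.
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": content_url,
                "encodingFormat": encoding_format,
            }
        ],
    }


# Placeholder values, for illustration only.
record = make_dataset_record(
    name="example-output",
    description="Hypothetical output file of an experiment",
    content_url="https://example.org/data/example-output.csv",
    encoding_format="text/csv",
)

# Serialize to JSON-LD text that any JSON-LD aware tool could consume.
print(json.dumps(record, indent=2))
```

Because the record is plain JSON, it could be stored or indexed as-is, and 
additional schema.org properties could be added later without changing the 
overall shape.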

Please contribute any other pointers we should brainstorm before proceeding. 

Cheers,
Suresh




Re: [External] Re: Data Catalog into its own repo

2020-11-09 Thread Pierce, Marlon
+1 for this refactoring. 






Re: Data Catalog into its own repo

2020-11-09 Thread Isuru Ranawaka
+1



-- 
Research Software Engineer
Indiana University, IN


Re: Data Catalog into its own repo

2020-11-09 Thread DImuthu Upeksha
+1



Re: Data Catalog into its own repo

2020-11-09 Thread Christie, Marcus Aaron
+1






Re: Data Catalog into its own repo

2020-11-09 Thread Pamidighantam, Sudhakar
+2. 

Thanks,
Sudhakar.




Data Catalog into its own repo

2020-11-09 Thread Marru, Suresh
Hi All,

The Airavata Experiment Catalog has evolved over time, and although the 
replica catalog and data product models are standalone, they are buried too 
deeply to be used outside the experiment context. Any objections to 
refactoring the experiment catalog and making the data catalog a first-class 
repo, along the lines of Custos and MFT?

Cheers,
Suresh