Re: [GSoC2015] list of accepted projects

Junyue Wang Thu, 11 Jun 2015 09:19:00 -0700

Hello,

It seems the licence of the Java implementation is LGPL:


   - *The libraries are open source (LGPL)*. You can adapt the libraries to
   your needs, and the community can spot and fix issues [1].

Is that good news? Can we just use or link to the Java library without
modifying its code?

yours,
junyue

[1] http://www.rdfhdt.org/what-is-hdt/

On Tue, Jun 9, 2015 at 6:57 AM, Peter Ansell <[email protected]> wrote:

> Hi Junyue,
>
> Sorry for any confusion that we may have caused you by not emphasising
> the licensing issue as the main factor in this project, and hence you
> not realising that it required an actual parser to be written (and
> that you can't look at the GPL/LGPL parser for inspiration).
>
> We are still early, so I think you should try to follow the W3C
> submission and spend a week or two writing a binary parser from
> scratch. That will show how difficult parsing a binary format is and
> whether you want to continue. Don't focus on the writer at this
> point if you think the parser will be enough for you.
>
> Once the RDF-HDT people release a newer version of the specification,
> you can switch to using that, but it would be great to see if you can
> get a basic parser up and running based on the older W3C submission.
> To start off with you could try just parsing the header, and see how
> difficult that turns out to be before deciding how to spend the rest
> of the time.
>
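
As a rough illustration of the header-first suggestion above, here is a minimal Java sketch of reading a header from a binary stream. The layout (a 4-byte marker, a 4-byte big-endian length, then a UTF-8 payload), the marker DEMO, and the names HeaderSketch, readHeader and sample are all hypothetical. None of them is taken from the HDT submission, whose section 3 defines the real layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class HeaderSketch {
    // Hypothetical marker for this sketch; NOT the real HDT marker.
    static final byte[] MAGIC = "DEMO".getBytes(StandardCharsets.US_ASCII);

    // Reads: marker, 4-byte big-endian payload length, UTF-8 payload.
    static String readHeader(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] marker = new byte[MAGIC.length];
        data.readFully(marker); // EOFException if the stream is too short
        if (!Arrays.equals(marker, MAGIC)) {
            throw new IOException("Unexpected marker");
        }
        int length = data.readInt(); // DataInputStream reads big-endian
        if (length < 0) {
            throw new IOException("Negative header length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    // Builds a sample byte image in the same hypothetical layout.
    static byte[] sample(String payload) throws IOException {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.write(MAGIC);
        out.writeInt(bytes.length);
        out.write(bytes);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] image = sample("format=example");
        System.out.println(readHeader(new ByteArrayInputStream(image)));
    }
}
```

A real parser would also cap the declared length before allocating the payload buffer, since a corrupt file could otherwise request a very large array.
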
> Sorry in advance btw, this is my first time being a GSOC mentor and I
> may do things wrong.
>
> Cheers,
>
> Peter
>
>
>
> On 8 June 2015 at 17:09, Junyue Wang <[email protected]> wrote:
> > Hello Peter,
> >
> > I went through the W3C document. I think coding from scratch is too
> > difficult for me. In the project proposal I submitted, the Java HDT
> > library [1] is to be reused for parsing and writing HDT files. The Jena
> > integration [2] is built on top of the Java HDT library as well. I
> > reviewed the source code of the Java HDT library, which does not strictly
> > conform to the W3C document. If we follow the specification precisely,
> > the new sesame-rio-rdfhdt module may not be able to deal with the HDT
> > files generated by the Java HDT library.
> >
> > I hope it's OK to stick to the original idea in the proposal. Otherwise
> > we may have trouble completing the project within the 3-month period.
> >
> > [1] http://www.rdfhdt.org/manual-of-the-java-hdt-library/
> > [2] http://www.rdfhdt.org/manual-of-hdt-integration-with-jena/
> >
> > yours,
> > junyue
> >
> >
> > On Mon, Jun 8, 2015 at 8:48 AM, Peter Ansell <[email protected]> wrote:
> >
> >> Hi Junyue,
> >>
> >> You are not going to be using or linking to the existing RDF/HDT
> >> implementations so their use of TripleString internally should not be
> >> an issue for you and you do not need to look at the RDF/HDT Java
> >> source code for this project.
> >>
> >> The sole reference for your implementation is the following document
> >> that the RDF/HDT team submitted to the W3C:
> >>
> >> http://www.w3.org/Submission/2011/SUBM-HDT-20110330/
> >>
> >> Specifically, you need to implement a binary parser from scratch based
> >> on the specification given in section 3:
> >>
> >> http://www.w3.org/Submission/2011/SUBM-HDT-20110330/#syntax
> >>
> >> Cheers,
> >>
> >> Peter
> >>
> >> On 8 June 2015 at 01:43, Junyue Wang <[email protected]> wrote:
> >> > Hello Peter,
> >> >
> >> > I'm done creating the new module and the new format. Now I'm
> >> > implementing the RDFHDTParser.
> >> > One question: if I search RDF HDT, it provides a TripleString for
> >> > each triple. A TripleString contains 3 Strings for subject, predicate
> >> > and object respectively. I need to transform the Strings into Sesame
> >> > Values, which may be URI, Resource, Literal or BlankNode. But I don't
> >> > know beforehand which concrete types of Value they are. Is there a
> >> > neat way to do this?
> >> >
> >> > I checked out ValueFactory in Sesame. It only does the transformation
> >> > for a given concrete type.
> >> >
> >> > yours,
> >> > junyue
> >> >
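
One way to approach the question above is to classify each term string by its leading characters before picking a ValueFactory method. The following is a minimal sketch that assumes the strings follow N-Triples-like conventions (literals start with a double quote, blank nodes start with "_:", anything else is treated as a URI). Whether TripleString actually follows those conventions is an assumption here, and the names TermKinds, Kind and classify are hypothetical.

```java
import java.util.Objects;

final class TermKinds {
    // The kinds of RDF term a serialized string may denote.
    enum Kind { URI, BNODE, LITERAL }

    // Guesses the kind of term from its first characters.
    // Assumption: N-Triples-like conventions, not verified against HDT.
    static Kind classify(String term) {
        Objects.requireNonNull(term, "term");
        if (term.startsWith("\"")) {
            return Kind.LITERAL; // e.g. "hello" or "hello"@en
        }
        if (term.startsWith("_:")) {
            return Kind.BNODE;   // e.g. _:b0
        }
        return Kind.URI;         // everything else
    }

    public static void main(String[] args) {
        System.out.println(classify("http://example.org/s"));
        System.out.println(classify("_:b0"));
        System.out.println(classify("\"hello\"@en"));
    }
}
```

Each kind would then map onto the matching Sesame ValueFactory call (createURI, createBNode or createLiteral). The literal case still needs the quotes stripped and any language tag or datatype suffix parsed out first, which this sketch does not attempt.
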
> >> > On Sun, May 17, 2015 at 9:09 AM, Peter Ansell <[email protected]> wrote:
> >> >
> >> >> Hi Junyue,
> >> >>
> >> >> It will be simplest to track if you fork the Marmotta repository at
> >> >> GitHub and create a branch named "MARMOTTA-593".
> >> >>
> >> >> Add me as a collaborator to the GitHub repository. My GitHub id is
> >> >> "ansell".
> >> >>
> >> >> The collaborators list for my fork is at:
> >> >>
> >> >> https://github.com/ansell/marmotta/settings/collaboration
> >> >>
> >> >> When you fork it, you can replace "ansell" with your GitHub id and
> >> >> use that page to add me to the list of collaborators.
> >> >>
> >> >> Yes, the code will be merged to Marmotta in the end.
> >> >>
> >> >> You should create a new module inside of marmotta-sesame-tools named
> >> >> "marmotta-rio-rdfht"
> >> >>
> >> >> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools
> >> >>
> >> >> You will also need to add a format constant into marmotta-rio-api as a
> >> >> new folder in the following directory, similar to the current 3
> >> >> folders there:
> >> >>
> >> >> https://github.com/apache/marmotta/tree/master/commons/marmotta-sesame-tools/marmotta-rio-api/src/main/java/org/apache/marmotta/commons/sesame/rio
> >> >>
> >> >> Cheers,
> >> >>
> >> >> Peter
> >> >>
> >> >>
> >> >> On 16 May 2015 at 22:19, Junyue Wang <[email protected]> wrote:
> >> >> > Hello Sergio, Peter,
> >> >> >
> >> >> > It's my honor to be a GSoC student. I appreciate your comments on
> >> >> > the project proposal.
> >> >> > I read the proposed methodology you pointed out. But it seems my
> >> >> > project is only related to Sesame and RDF HDT, without touching the
> >> >> > code base of Marmotta. Should I fork Marmotta on GitHub, or start a
> >> >> > new repository there? Will my code be merged into Marmotta in the
> >> >> > end? If so, into which module of Marmotta?
> >> >> >
> >> >> > yours,
> >> >> > junyue
> >> >> >
> >> >> > On Thu, Apr 30, 2015 at 2:41 PM, Sergio Fernández <[email protected]> wrote:
> >> >> >
> >> >> >> Hi Peter,
> >> >> >>
> >> >> >> On Wed, Apr 29, 2015 at 1:12 AM, Peter Ansell <[email protected]> wrote:
> >> >> >>>
> >> >> >>> Those guidelines look great to me, especially the suggestion about
> >> >> >>> the branch name including the Jira issue, which I have found very
> >> >> >>> useful in all of my git-based projects. In the RDF/HDT case, and
> >> >> >>> possibly in the GeoSPARQL case, the contributed code could be in
> >> >> >>> the form of a new module, so there won't be much interference with
> >> >> >>> the rest of the codebase during that time. However, it is still
> >> >> >>> useful to regularly merge the "develop" branch into each of the
> >> >> >>> branches to keep up to date and reduce the number of merge
> >> >> >>> conflicts occurring near the end when the students will be rushing
> >> >> >>> to complete the project.
> >> >> >>
> >> >> >>
> >> >> >> Great you like it, Peter :-)
> >> >> >>
> >> >> >> I expect fewer merge conflicts here, since it's a more concrete
> >> >> >> library; with the GeoSPARQL project that workflow is much more
> >> >> >> important.
> >> >> >>
> >> >> >> I just have one concern about the documentation. Last year I had
> >> >> >> formatting issues bringing that documentation into the wiki
> >> >> >> (MoinMoin syntax is not Markdown, unfortunately). Do you think it
> >> >> >> is better to do it directly in the wiki?
> >> >> >>
> >> >> >> I'd love to hear comments from our students, after all you're the
> >> >> >> ones who need to follow that proposed methodology.
> >> >> >>
> >> >> >> Cheers,
> >> >> >>
> >> >> >> --
> >> >> >> Sergio Fernández
> >> >> >> Partner Technology Manager
> >> >> >> Redlink GmbH
> >> >> >> m: +43 6602747925
> >> >> >> e: [email protected]
> >> >> >> w: http://redlink.co
> >> >> >>
> >> >>
> >>
>
