Remaining rdflib-jsonld work:

- Content Negotiation
  So that e.g. @context: https://schema.org/ resolves correctly by following
  the Link: header

- Only access external resources if RDFLIB_CONFIG or
  rdflibconfig['allow_access_external_resources'] is set,
  instead of by default, which is what 6.0 currently does:

  "URLInputSource can be abused to retrieve arbitrary documents if used
  naïvely"
  https://github.com/RDFLib/rdflib/issues/1369

  - Should RDFLib cache @contexts with requests-level caching, using
    requests-cache, CacheControl, or something else?

  - Should RDFLib cache at least the contexts in the JSON-LD Recommended
    Context [so that @context: https://schema.org/ works out of the box]?
    https://github.com/w3c/json-ld-rc/blob/main/context.jsonld
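
A minimal sketch of the opt-in behaviour proposed above. Everything named here is an assumption for illustration: `load_context`, the `allow_external` flag, `ExternalAccessDenied` and the `BUNDLED_CONTEXTS` table do not exist in rdflib, and the bundled context body is a placeholder rather than the real schema.org context.

```python
import json
import urllib.request

# Hypothetical table of contexts shipped with the library (placeholder content).
BUNDLED_CONTEXTS = {
    "https://schema.org/": {"@context": {"@vocab": "http://schema.org/"}},
}


class ExternalAccessDenied(Exception):
    """Raised when a remote fetch is attempted without explicit opt-in."""


def load_context(url, allow_external=False, cache=BUNDLED_CONTEXTS):
    """Return a JSON-LD context, preferring the local cache over the network."""
    if url in cache:
        return cache[url]
    if not allow_external:
        raise ExternalAccessDenied(f"refusing to fetch {url!r} without opt-in")
    # Only reached when the caller has explicitly allowed network access.
    request = urllib.request.Request(
        url, headers={"Accept": "application/ld+json"}
    )
    with urllib.request.urlopen(request) as response:
        context = json.load(response)
    cache[url] = context  # remember the result for later lookups
    return context
```

With this shape, `load_context("https://schema.org/")` is answered from the bundled table, and any other URL raises unless `allow_external=True` is passed.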

On Sat, Jul 31, 2021 at 8:03 AM Wes Turner <wes.tur...@gmail.com> wrote:

>
>
> On Sat, Jul 31, 2021 at 6:45 AM Wes Turner <wes.tur...@gmail.com> wrote:
>
>>
>>
>> On Wed, Jul 28, 2021 at 4:49 AM Miel Vander Sande <
>> miel.vandersa...@meemoo.be> wrote:
>>
>>> Hi Nick,
>>>
>>> TBH, it's pretty much a function that converts a Dict or a JSON file in
>>> a streaming fashion:
>>> https://github.com/viaacode/construction-site/blob/main/construction_site/parse_functions.py.
>>> I think it's a stand-alone thing; I don't plan anything extra on that
>>> specifically, with maybe the exception of a cmd interface (hence the
>>> proposed refactoring of csv2rdf)
>>>
>>
>> Profiling / [comparative] benchmarks with e.g. Scalene [1][2] and/or
>> perfplot [3] (%timeit) [4][5] could be worthwhile.
>>
>> [1]
>> https://awesomeopensource.com/project/plasma-umass/scalene?categoryPage=26
>> [2] https://github.com/plasma-umass/scalene
>> [3] https://github.com/nschloe/perfplot
>> [4]
>> https://docs.python.org/3/library/timeit.html#timeit-command-line-interface
>> [5]
>> https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit
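
As a rough illustration of such a comparative benchmark, here is a sketch using only the standard-library `timeit` module. The two parser functions and the `benchmark` helper are hypothetical stand-ins made up for this example; they are not the JSON2RDF or csv2rdf code under discussion.

```python
import csv
import io
import timeit

ROWS = "a,b,c\n" + "1,2,3\n" * 1000


def parse_with_split(text):
    """Naive parser: split on newlines and commas (no quoting support)."""
    return [line.split(",") for line in text.splitlines()]


def parse_with_csv(text):
    """Standard-library csv.reader over an in-memory buffer."""
    return list(csv.reader(io.StringIO(text)))


def benchmark(funcs, data, repeat=3, number=10):
    """Return {function name: best seconds per call} using timeit.repeat."""
    results = {}
    for func in funcs:
        timings = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
        results[func.__name__] = min(timings) / number
    return results


if __name__ == "__main__":
    for name, seconds in benchmark([parse_with_split, parse_with_csv], ROWS).items():
        print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes; Scalene or perfplot would give a fuller picture across input sizes.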
>>
>
> From https://twitter.com/westurner/status/1274571490688630785 plus
> further references:
>
> Other methods for CSV + transforms => RDF?
>> - #rdflib csv2rdf
>>
>
>
>> - COW
>
> - url: https://github.com/CLARIAH/COW
> - desc: Integrated CSV to RDF converter, using CSVW and nanopublications
>
> - #CSVW  https://github.com/cldf/csvw/blob/master/README.md#see-also
>>
>
> Could https://github.com/cldf/csvw/ be modified to support (1)
> json-ld-streaming; and (2) alternate csv parsers?
>
> - @kidehen sponger / rdfm_yq_parse_csv()?
>
> -
> http://docs.openlinksw.com/virtuoso/rdfspongerprogrammerguide/#virtuosospongeroverviewcartarch
> -
> https://github.com/openlink/Virtuoso-RDFIzer-Mapper-Scripts/blob/master/rdf_mappers.sql
>
> - #tarql
>>
> - Web: https://tarql.github.io/
> - Src: https://github.com/tarql/tarql
> - ProgrammingLanguage: Java
>
> - #csv2rdf GH topic: https://github.com/topics/csv2rdf
>>
>
>
>
>> Adding columnar & dataset-level metadata *with URIs* is the value add
>> here, IMHO #LR
>
>
> "7 metadata header rows (column label, property URI path, DataType, unit,
> accuracy, precision, significant figures)"
>
> https://wrdrd.github.io/docs/consulting/linkedreproducibility#csv-csvw-and-metadata-rows
>
> Example Table A with 7 metadata header rows:
> https://wrdrd.github.io/docs/consulting/linkedreproducibility#id4
>
> The csv2rdf tool would need to optionally read this additional metadata
> from either additional header rows or an external 'header' file.
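
A minimal sketch of reading those seven metadata header rows with the standard-library `csv` module. The function name `read_with_metadata_rows`, the field names in `HEADER_FIELDS`, and the sample values (including the `ex:` property names) are assumptions made up for illustration; only the order of the seven rows comes from the description quoted above.

```python
import csv
import io

# Order of the seven metadata header rows quoted above.
HEADER_FIELDS = (
    "label", "property_uri", "datatype", "unit",
    "accuracy", "precision", "significant_figures",
)


def read_with_metadata_rows(fileobj, header_fields=HEADER_FIELDS):
    """Split a CSV stream into per-column metadata and the remaining data rows."""
    reader = csv.reader(fileobj)
    # Consume one row per metadata field before any data rows.
    header_rows = [next(reader) for _ in header_fields]
    # Transpose the header rows so that each column gets its own dict.
    columns = [dict(zip(header_fields, values)) for values in zip(*header_rows)]
    return columns, list(reader)


# Made-up sample table: two columns, seven header rows, two data rows.
SAMPLE = """\
time,distance
ex:elapsedTime,ex:distance
xsd:double,xsd:double
unit:SecondTime,unit:Meter
0.1,0.5
3,2
2,2
1.0,10.0
2.0,19.5
"""

columns, rows = read_with_metadata_rows(io.StringIO(SAMPLE))
```

Reading the same fields from an external 'header' file would only change where `header_rows` comes from; the transposition step stays the same.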
>
>
>> ijson [6] looks like it has some interesting features; iterative,
>> asyncio, push. How does the performance compare?
>>
>> [6] https://pypi.org/project/ijson/
>>
>>> I do plan to develop more components that assist scalable ETL,
>>> data-to-rdf like tasks. This includes a plugin for Apache Airflow
>>> ("provider"), which would be good as an RDFLib family repository.
>>>
>>
>> - The datasette and dogsheep projects have a bunch of *-to-sqlite utils
>> and an interface that a number of projects on PyPI have implemented:
>>   https://datasette.io/tools
>>   https://github.com/dogsheep
>>
>>   https://pypi.org/search/?q=to-sqlite
>>   - https://datasette.io/tools/csvs-to-sqlite
>>     -
>> https://github.com/simonw/csvs-to-sqlite/blob/a8a37a016790dc93270c74e32d0a5051bc5a0f4d/tests/test_csvs_to_sqlite.py#L417-L446
>>       - parse datetimes in CSVs
>>         - xsd:dateTime (and schema.org/Date, schema.org/dateCreated
>>           and schema.org/dateModified) specify that dates and times are
>>           written in ISO 8601 formats
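
A small sketch of what datetime handling could look like when converting CSV cells: classify each string as an XSD date, dateTime, or plain string using only the standard library. The helper name `to_typed_literal` is hypothetical, and `fromisoformat` accepts only a subset of ISO 8601 (which subset depends on the Python version), so this is an approximation rather than a full ISO 8601 parser.

```python
from datetime import date, datetime

XSD = "http://www.w3.org/2001/XMLSchema#"


def to_typed_literal(value):
    """Return (lexical form, XSD datatype IRI) for an ISO 8601 string.

    Falls back to xsd:string when the value is not a date or datetime.
    """
    # Try the narrower type first: datetime.fromisoformat also accepts
    # date-only strings, so checking it first would hide plain dates.
    try:
        return date.fromisoformat(value).isoformat(), XSD + "date"
    except ValueError:
        pass
    try:
        return datetime.fromisoformat(value).isoformat(), XSD + "dateTime"
    except ValueError:
        return value, XSD + "string"
```

The returned pair maps directly onto a typed RDF literal: the first element is the lexical form and the second is the datatype IRI.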
>>
>> What are the solutions for generating RDFS schema from CSVs and SQL
>> tables?
>>
>> - https://pypi.org/project/tablib/
>>   https://github.com/jazzband/tablib/blob/master/tests/test_tablib.py
>>   - doesn't do anything with datatypes FWICS
>>
>> - https://sqlite-utils.datasette.io/en/stable/cli.html#showing-the-schema
>>   https://github.com/simonw/sqlite-utils/blob/main/sqlite_utils/db.py
>>
>>   def suggest_column_types:
>>
>> https://github.com/simonw/sqlite-utils/blob/c7e8d72be9fe8fe0811f685a18eebc637662d41b/sqlite_utils/utils.py#L29-L58
>>
>> - https://en.wikipedia.org/wiki/Knowledge_extraction
>>
>> https://en.wikipedia.org/wiki/Knowledge_extraction#Relational_databases_to_RDF
>>
>> - https://pypi.org/project/rdb2rdf/
>>   https://github.com/nisavid/pyrdb2rdf/blob/master/rdb2rdf/stores.py
>>   > PyRDB2RDF provides RDFLib with an interface to relational databases
>> as RDF stores. The underlying data is accessed via SQLAlchemy. It is mapped
>> to RDF according to the specifications of RDB2RDF. The corresponding RDF
>> graph is represented as an RDFLib graph.
>>   >
>>   > Translating from relational data to RDF via direct mapping is
>> currently supported. Translating in the other direction and mapping with
>> R2RML are planned but not yet implemented.
>>
>> - https://pypi.org/project/rdfizer/
>>
>> - https://github.com/RDFLib/pyTARQL
>>   - Does this handle datetimes?
>>
>> - Generate JSONschema from JSON and SHACL from JSON-Schema:
>>
>> https://stackoverflow.com/questions/7341537/tool-to-generate-json-schema-from-json-data/30294535#30294535
>>   - https://pypi.org/project/genson/ has been recently updated
>>     - Src: https://github.com/wolverdude/genson/
>>   - https://github.com/mulesoft-labs/json-ld-schema
>>     https://github.com/mulesoft-labs/json-ld-schema#how-does-it-work
>>     > JSON-LD Schema defines a simple 'semantics' JSON-Schema vocabulary
>> (effectively a JSON-Schema meta-schema) that reuses the official JSON
>> Schema for JSON-LD to provide definitions for @context and @type
>> properties. These annotations can be used to provide JSON-LD context for a
>> JSON-Schema document. Provided this JSON-LD context, constraints over named
>> 'properties' in a JSON Schema document can be understood as constraints
>> over CURIES of JSON-LD documents following the context rules defined in the
>> JSON-LD specification.
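
To make the column-type question above concrete, here is a rough sketch that suggests an XSD datatype per column from string values. It is written from scratch for illustration and is not the sqlite-utils `suggest_column_types` implementation; the names `suggest_xsd_types` and `_all_parse` are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def _all_parse(values, caster):
    """True when every value can be converted by `caster`."""
    try:
        for value in values:
            caster(value)
    except ValueError:
        return False
    return True


def suggest_xsd_types(records):
    """Map each column name to the narrowest XSD datatype that fits all values."""
    columns = {}
    for record in records:
        for name, value in record.items():
            # Empty cells carry no type information, so skip them.
            if value not in ("", None):
                columns.setdefault(name, []).append(value)
    suggestions = {}
    for name, values in columns.items():
        if _all_parse(values, int):
            suggestions[name] = XSD + "integer"
        elif _all_parse(values, float):
            suggestions[name] = XSD + "double"
        else:
            suggestions[name] = XSD + "string"
    return suggestions
```

Columns that contain only empty cells are omitted from the result, which leaves the caller free to pick a default.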
>>
>> ## CSVW: CSV on the Web
>>
>> - Homepage: https://w3c.github.io/csvw/
>> - Standard: https://www.w3.org/TR/tabular-data-model/
>> - Standard: https://www.w3.org/TR/tabular-metadata/
>> - Standard: https://www.w3.org/TR/csv2json/
>> - Standard: https://www.w3.org/TR/csv2rdf/
>> - Namespace: http://www.w3.org/ns/csvw#
>> - xmlns: `@prefix csvw: <http://www.w3.org/ns/csvw#> .`
>> - @context: https://www.w3.org/ns/csvw.jsonld
>>
>> CSVW (*CSV on the Web*) is a set of relatively new standards
>> for representing :ref:`CSV` rows and columns
>> as :ref:`RDF` (and :ref:`JSON` / :ref:`JSON-LD`)
>> along with *metadata*.
>>
>> * URIs for datatypes (XSD)
>>
>
> FWIU, there is not yet a vocabulary for physical units like meters**2 in
> the JSON-LD Recommended Context:
> https://w3c.github.io/json-ld-rc/context.jsonld
> https://github.com/w3c/json-ld-rc
> https://github.com/w3c/json-ld-rc/blob/main/context.jsonld
>
> QUDT is one such vocabulary:
> ```turtle
> qudt-quantity:Time
>     rdf:type qudt:SpaceAndTimeQuantityKind ;
>     rdfs:label "Time"^^xsd:string ;
>     qudt:description "Time is a basic component of the measuring system
> used to sequence events, to compare the durations of events and the
> intervals between them, and to quantify the motions of
> objects."^^xsd:string ;
>     qudt:symbol "T"^^xsd:string ;
>     skos:exactMatch <http://dbpedia.org/resource/Time> .
>
> # ...
> unit:SecondTime
>       rdf:type qudt:SIBaseUnit , qudt:TimeUnit ;
>       rdfs:label "Second"^^xsd:string ;
>       qudt:abbreviation "s"^^xsd:string ;
>       qudt:code "1615"^^xsd:string ;
>       qudt:conversionMultiplier
>               "1"^^xsd:double ;
>       qudt:conversionOffset
>               "0.0"^^xsd:double ;
>       qudt:symbol "s"^^xsd:string ;
>       skos:exactMatch <http://dbpedia.org/resource/Second> .
> # ...
> ```
> http://www.qudt.org/qudt/owl/1.0.0/unit/Instances.html#SecondTime
>
> ... We must be able to say that the numbers in a column have a physical
> unit with URI; to specify columnar metadata so that downstream tools don't
> need to try to sniff and cast between datatypes and lossily drop units from
> strings in column names:
>
> - (_datatype_ _physical_unit_):
> - (float64, "unit:SecondTime",)
> - (float64, unit["SecondTime"],)
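
One way to carry such (datatype, unit) pairs alongside a CSV file is sketched below: a small dataclass per column, serialised into a CSVW-style metadata document. `name`, `datatype` and `propertyUrl` are real CSVW column properties; the `ex:unit` key, the `ColumnMeta` class, the `to_csvw_metadata` helper and the example.org URL are assumptions for illustration, since CSVW itself does not define a unit property.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    name: str          # column name as it appears in the CSV
    datatype: str      # CSVW built-in datatype name, e.g. "double"
    property_url: str  # URI identifying what the column means
    unit: str          # URI or CURIE of the physical unit


def to_csvw_metadata(csv_url, columns):
    """Build a CSVW-style metadata dict with a hypothetical unit annotation."""
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,
        "tableSchema": {
            "columns": [
                {
                    "name": column.name,
                    "datatype": column.datatype,
                    "propertyUrl": column.property_url,
                    "ex:unit": column.unit,  # hypothetical extension key
                }
                for column in columns
            ]
        },
    }


metadata = to_csvw_metadata(
    "filename.csv",
    [ColumnMeta("time", "double", "http://example.org/elapsedTime", "unit:SecondTime")],
)
metadata_json = json.dumps(metadata, indent=2)
```

Downstream tools could then read the datatype and unit from the metadata instead of sniffing them from column names.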
>
>
>
>> * URIs for columns (RDF)
>> * Document Metadata
>> * CSV -> JSON (-> JSON-LD -> RDF)
>> * CSV -> RDF
>>
>> Could there be a file naming convention for specifying the extra CSVW
>> header to apply_to or transform zero or more CSV files with?
>>
>> filename.csv
>> filename.csv.csvw
>> filename.csv.csvwheader.jsonld.json
>> filename.csv.csvw.jsonld.json
>>
> ```python
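> from pathlib import Path
>
> # read_csvw() and read_csv() are hypothetical reader functions
> uri = 'filename.csv'
> if Path(uri + '.csvw.jsonld.json').exists():  # note the leading dot
>     read_csvw(uri, *args, **kwargs)
> else:
>     read_csv(uri, *args, **kwargs)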
> ```
>
>>
>> https://www.w3.org/TR/tabular-data-primer/
>>
>>
>>> Best,
>>>
>>> Miel
>>>
>>> On Wed, Jul 28, 2021 at 06:09, Nicholas Car <
>>> nicholas....@surroundaustralia.com> wrote:
>>>
>>>> Hi Miel,
>>>>
>>>> Yes, all offers of contribution are of interest! The CSV 2 RDF stuff is
>>>> very old and many tools related to it, such as pyTARQL (
>>>> https://github.com/RDFLib/pyTARQL), are missing. Are you planning on
>>>> presenting JSON2RDF as a new plugin to RDFLib? That may be an option;
>>>> however, remember that another option is simply to present your tool's
>>>> repository within RDFLib's family of repositories (i.e. within
>>>> https://github.com/RDFLib). The choice will depend on how stable
>>>> the tool is and how you see its future development going.
>>>>
>>>> But perhaps you have other things in mind? Whatever the case, we'd love
>>>> to hear your plans.
>>>>
>>>> Cheers,
>>>>
>>>> Nick
>>>>
>>>> On Tue, Jul 27, 2021 at 5:56 PM Miel Vander Sande <
>>>> miel.vandersa...@meemoo.be> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> A little late to the party, but what a great effort this is! Congrats
>>>>> on the release and thank you; this library is essential to my work
>>>>> and it makes RDF usable in ways other libraries can't.
>>>>>
>>>>> Sidenote: I have a streaming direct json-to-rdf mapping implementation
>>>>> (port of https://github.com/AtomGraph/JSON2RDF) that I'd like to
>>>>> contribute, possibly in combination with a refactoring of
>>>>> https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.tools.html#rdflib.tools.csv2rdf.CSV2RDF.
>>>>> Would that be of interest?
>>>>>
>>>>
>> Does JSON2RDF [need to] implement the w3c json-ld-streaming spec [7]?
>> [7] https://w3c.github.io/json-ld-streaming/
>>
>> - https://w3c.github.io/json-ld-streaming/#streaming-document-form
>> - https://w3c.github.io/json-ld-streaming/#streaming-rdf-form
>>
>>
>>>
>>>>> Best,
>>>>>
>>>>> Miel
>>>>>
>>>>> On Tue, Jul 20, 2021 at 21:58, Natanael Arndt <arn...@gmail.com> wrote:
>>>>>
>>>>>> I've retweeted the tweet by jarven. But I don't use Reddit or Hacker
>>>>>> News; I think the semantic web mailing list would also be a good idea.
>>>>>>
>>>>>> If you'd like to post something in the channels, please do so.
>>>>>>
>>>>>> Natanael
>>>>>>
>>>>>> On July 20, 2021 at 20:14:09 CEST, Wes Turner <
>>>>>> wes.tur...@gmail.com> wrote:
>>>>>> >Congrats and thanks!
>>>>>> >
>>>>>> >From the release notes on the Release:
>>>>>> >https://github.com/RDFLib/rdflib/releases/tag/6.0.0
>>>>>> >
>>>>>> >```
>>>>>> >6.0.0 is a major stable release that drops support for Python 2 and
>>>>>> >Python 3 < 3.7. Type hinting is now present in much of the toolkit
>>>>>> >as a result.
>>>>>> >
>>>>>> >It includes the formerly independent JSON-LD parser/serializer,
>>>>>> >improvements to Namespaces that allow for IDE namespace prompting,
>>>>>> >simplified use of g.serialize() (turtle default, no need to decode())
>>>>>> >and many other updates to documentation, store backends and so on.
>>>>>> >
>>>>>> >Performance of the in-memory store has also improved since Python 3.6
>>>>>> >dictionary improvements.
>>>>>> >
>>>>>> >There are numerous supplementary improvements to the toolkit too,
>>>>>> >such as:
>>>>>> >
>>>>>> >- inclusion of Docker files for easier CI/CD
>>>>>> >- black config files for standardised code formatting
>>>>>> >- improved testing with mock SPARQL stores, rather than a reliance on
>>>>>> >DBPedia etc
>>>>>> >```
>>>>>> >
>>>>>> >Have there been ANN posts to e.g. Hacker News and /r/semanticweb?
>>>>>> >
>>>>>> >On Tue, Jul 20, 2021, 10:23 Florent Georges <fgeor...@gmail.com> wrote:
>>>>>> >
>>>>>> >> Congratulations, and thank you all for the hard work!
>>>>>> >>
>>>>>> >> --
>>>>>> >> Florent Georges
>>>>>> >> H2O Consulting
>>>>>> >> http://h2o.consulting/
>>>>>> >>
>>>>>> >> On Tue, Jul 20, 2021, 16:00 Nicholas Car <
>>>>>> >> nicholas....@surroundaustralia.com> wrote:
>>>>>> >>
>>>>>> >>> Hi all,
>>>>>> >>>
>>>>>> >>> Yes, 6.0.0 is out:
>>>>>> >>>
>>>>>> >>>    - https://pypi.org/project/rdflib/6.0.0/
>>>>>> >>>    - https://github.com/RDFLib/rdflib/releases/tag/6.0.0
>>>>>> >>>
>>>>>> >>> Please publicise this release: it has a lot of stuff since 5.0.0
>>>>>> >>> in April last year.
>>>>>> >>>
>>>>>> >>> Thank you very much to all of you who contributed, in particular
>>>>>> >>> my co-maintainers, Ashley & Natanael and Edmond, Iwan, Tom, Remi,
>>>>>> >>> Harold and all the PR and Issue creators. Thanks also to the
>>>>>> >>> institutions that provided time for their staff to contribute.
>>>>>> >>>
>>>>>> >>> If you see issues, please let the co-maintainers know straight
>>>>>> >>> away: we are keen to get a 6.0.1 release out shortly (within weeks
>>>>>> >>> to a month) to speed up the RDFLib release cycle.
>>>>>> >>>
>>>>>> >>> Cheers,
>>>>>> >>>
>>>>>> >>> Nick
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> kind regards
>>>>>> >>> Dr Nicholas Car
>>>>>> >>> Data Systems Architect
>>>>>> >>>
>>>>>> >>> SURROUND Australia Pty Ltd and
>>>>>> >>> SURROUND NZ Limited
>>>>>> >>>
>>>>>> >>> Address Level 9, Nishi Building,
>>>>>> >>> 2 Phillip Law Street
>>>>>> >>> New Acton Canberra 2601
>>>>>> >>> Mobile +61 477 560 177
>>>>>> >>> Email nicholas....@surroundaustralia.com
>>>>>> >>> Website https://www.surroundaustralia.com
>>>>>> >>>
>>>>>> >>> Enhancing Intelligence Within Organisations
>>>>>> >>> delivering evidence that connects decisions to outcomes
>>>>>> >>>
>>>>>> >>> Dr Nicholas Car
>>>>>> >>> Adjunct Senior Lecturer
>>>>>> >>>
>>>>>> >>> Research School of Computer Science
>>>>>> >>>
>>>>>> >>> The Australian National University,
>>>>>> >>> Canberra ACT Australia
>>>>>> >>> +61 477 560 177
>>>>>> >>> nicholas....@anu.edu.au
>>>>>> >>> https://cs.anu.edu.au/people/nicholas-car
>>>>>> >>> https://orcid.org/0000-0002-8742-7730
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> http://github.com/RDFLib
>>>>>> >>> ---
>>>>>> >>> You received this message because you are subscribed to the Google
>>>>>> >Groups
>>>>>> >>> "rdflib-dev" group.
>>>>>> >>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> >send an
>>>>>> >>> email to rdflib-dev+unsubscr...@googlegroups.com.
>>>>>> >>> To view this discussion on the web visit
>>>>>> >>>
>>>>>> >
>>>>>> https://groups.google.com/d/msgid/rdflib-dev/CAP7nqh19yjpwB8EoHVqs5QzKug_rSq1X%2BfFHfnFtOJBdZ1RwYg%40mail.gmail.com
>>>>>> >>>
>>>>>> ><
>>>>>> https://groups.google.com/d/msgid/rdflib-dev/CAP7nqh19yjpwB8EoHVqs5QzKug_rSq1X%2BfFHfnFtOJBdZ1RwYg%40mail.gmail.com?utm_medium=email&utm_source=footer
>>>>>> >
>>>>>> >>> .
>>>>>> >>>
>>>>>>
>>

