On Tuesday, 23 June 2015 at 14:06:38 UTC, Sönke Ludwig wrote:
>> As I understand it, there is a gap between what you can currently do with std.json (and indeed vibe.d's JSON) and what you can do with stdx.data.json. And the capability falls short of what can be done in other standard libraries such as the ones for Python.
>> So since we are going for a nuclear-power-station-included approach, does that not mean that we need to specify what this layer should do, and somebody should start to work on it?
> One thing, which I consider the most important missing building block, is Jacob's anticipated std.serialization module [1]*. Skipping the data representation layer and going straight for statically typed access to the data is the way to go in a language such as D, at least in most situations.
Thanks, Sönke. I appreciate your taking the time to reply, and I hope I represented my understanding of things correctly. I think things often get stuck in limbo because people don't know what's most useful, so a central list of "things that need to be done" in the D ecosystem might be nice, provided it doesn't become excessively structured and bureaucratic. (I ain't volunteering to maintain it, as I can't commit to it.)
The thing is, there are different use cases. For example, I pull data from Quandl: the metadata is standard and won't change in format often, but the data for a particular series will. If I pull volatility data, it will have different fields from price or economic data, and I don't know beforehand the total set of possibilities. This must be quite a common use case, and indeed I just hit another one recently with a poorly documented internal corporate database for securities.
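To make that concrete, this is roughly the dynamic-access style I have in mind. It's a minimal sketch using Phobos's std.json; the "quandl.json" file and the "dataset" key are invented for illustration, not Quandl's actual schema:

    import std.file : readText;
    import std.json;
    import std.stdio;

    void main()
    {
        // Parse a response whose exact fields aren't known in advance.
        auto doc = parseJSON(readText("quandl.json"));

        // Walk whatever fields happen to be present ("dataset" is a
        // made-up key for this sketch).
        foreach (name, value; doc["dataset"].object)
            writeln(name, " -> ", value);
    }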
Maybe it's fine to generate the static typing in response to reading the data, but then it ought to be easy to do so (ultimately). Because otherwise you hack something up in Python because it's just easier, and that hack job becomes the basis for something larger than you ever intended or wanted, and it's never worth rewriting given the other stuff you need.
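For comparison, here is a rough sketch of the statically typed route, using vibe.d's deserializeJson purely as a stand-in for whatever std.serialization ends up providing; the Series struct and its fields are invented for illustration:

    import vibe.data.json : deserializeJson;

    // Invented shape, for illustration only.
    struct Series
    {
        string name;
        double[] values;
    }

    void main()
    {
        auto s = deserializeJson!Series(`{"name":"vol","values":[12.1,13.4]}`);
        assert(s.name == "vol" && s.values.length == 2);
    }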
But even if you prefer static typing generated on the fly (which maybe becomes useful via introspection, a la Alexandrescu's talk), sometimes one will prefer dynamic typing, and since it's easy to offer in a way that doesn't destroy the elegance and coherence of the whole project, why not give people the option? It seems to me that Guido painted a target on Python by saying "it's fast enough, and you are usually I/O etc. bound", because the numerical computing people have different needs. So BLAS and the like may be part of that, but also having something like pandas - and the ability to get data in and out of it - would be an important part of making it easy and fun to use D for this purpose, and it's not so hard to do, just a fair bit of work. Not that it makes sense to undergo a death march to duplicate Python functionality, but there are some things that are relatively easy and have a high payoff - like John Colvin's pydmagic.
(The link here, which may not be so obvious, is that in a way pandas is a kind of replacement for a spreadsheet, and being able to just pull stuff in without minding your p's and q's to get a quick result lends itself to the kind of iterative exploration that keeps spreadsheets overused even today. And that's the link to JSON and (de)serialization.)
> Another part is a high-level layer on top of the stream parser that has existed for a while (albeit with room for improvement), but that I forgot to update the documentation for. I've now caught up on that and it can be found under [2] - see the read[...] and skip[...] functions.
Thank you for the link.
> Do you, or anyone else, have further ideas for higher-level functionality, or any concrete examples in other standard libraries?
Will think it through and try to come up with some simple
examples. Paging John Colvin and Russell Winder, too.
> * Or any other suitable replacement, if that doesn't work out for some reason. The vibe.data.serialization module to me is not a suitable candidate as it stands, because it lacks some features of Jacob's solution, such as proper handling of (duplicate/interior) references. But it's a perfect fit for my own class of problems, so I currently can't justify putting work into this either.
Is it worth you or someone else trying to articulate what it does well that is missing from stdx.data.json?