I would argue the other direction. DFDL is the "Data Format Description" Language, so you don't really get to pick what you mean by the word "format". We're stuck with the meaning the DFDL specification gives this word.
DFDL does not describe "data" generally. It doesn't describe, for example, the data as it resides in the memory of a software program. There are ways of describing that, such as programming languages and the means they provide for describing data structures. DFDL is about data in some external physical representation, independent of how a program represents the data in memory. The rules and regularities of this external physical representation are called the "format".

DFDL has nothing to do with whether the data format is standardized, ad hoc, one-off, complex, or simple. It is orthogonal to how "important", common, or frequently used the data or its format is.

Furthermore, DFDL doesn't describe the data. It describes only the format of the data. For example, the fact that a given number has a given representation is captured by DFDL. The fact that the number happens to be the latitude component of the geo-location of a thing is captured only by the name given to that data item in the schema, and the inclusion in that name of the letters "latitude". There's nothing in DFDL about latitudes, or about the fact that this data is a geo-location. DFDL is orthogonal to all such concepts having to do with the meaning or semantics of the data.

Finally, a DFDL implementation takes this format description and provides some services around the data - converting it to JSON or XML, or access to it as a JDOM tree via API, etc. But these are things enabled by DFDL. They are not what DFDL is "about". DFDL is about the format.

So I think your attempt to conceptually clarify things by generalizing DFDL to "general metadata describing data" is not really right as I see it.

________________________________
From: Costello, Roger L.
<[email protected]>
Sent: Tuesday, November 5, 2019 1:10 PM
To: [email protected] <[email protected]>
Subject: Assertion: Focusing on "data formats" is a red herring

[Definition] Red herring: something that misleads or distracts from the relevant or important issue.

The first thing that I talk about in my DFDL tutorial is that there are a huge number of data formats (CSV, iCalendar, vCard, etc.) and DFDL is all about describing data formats.

I’ve come to the belief that focusing on “data formats” is a red herring. I believe this is the correct focus:

Nearly every software program inputs and processes data. DFDL is about creating descriptions of data. DFDL processors are about using the descriptions to parse the data and then making the parsed data available in a useful form.

Notice there is no mention of “data formats”.

Do you agree that focusing on data formats is a red herring?
Do you agree that DFDL is about describing data?
Do you agree that DFDL processors are about using the descriptions to parse the data and then making the parsed data available in a useful form?

It just occurred to me that I am making an implicit assertion: data and data formats are different. I believe that data formats are a subset of data.

I think of a data format as a format that has been standardized, either by a standards organization (W3C, ISO, IETF, etc.) or by a corporation. Data, on the other hand, is any format: single-time-use formats, quickly cobbled together formats, 1-1 data exchange formats between a small group, and also standardized formats.

Do you agree that data is a broader, more general concept than data formats?
Do you agree that, with regard to DFDL, the focus should be on data, not on data formats?

/Roger
