Dear BaseX developers,

I noticed in example 3 under http://docs.basex.org/wiki/CSV_Module#Examples that csv:parse() with option { 'format': 'map' } returns a map of maps, with hardcoded row numbers:

    map {
      1: map { "City": "Newton", "Name": "John" },
      2: map { "City": "Oldtown", "Name": "Jack" }
    }

Because maps are unordered, representing something ordered like the rows of a CSV requires hardcoded row numbers in order to reassemble the rows in document order. I assume this was a necessary approach when the module was developed in the map-only world of XQuery 3.0. Now that 3.1 supports arrays, might an array of maps be a closer fit for CSV parsing?

    array {
      map { "City": "Newton", "Name": "John" },
      map { "City": "Oldtown", "Name": "Jack" }
    }

I'm also curious: do you know of any efforts to create an EXPath spec for CSV? Putting "spec" and "CSV" in the same sentence is dangerous, since CSV is a notoriously under-specified format: "The CSV file format is not standardized" (see https://en.wikipedia.org/wiki/Comma-separated_values). But perhaps there is a common enough need for CSV parsing that such a spec would benefit the community? I thought I'd start by asking here, since BaseX's seems to be the most developed (or only?) CSV module in XQuery.

Then there's the question of how to approach implementations of such a spec. While XQuery is probably capable of parsing and serializing small enough CSV files, CSV files do get large, and naive processing in XQuery would tend to run into memory issues (as I found with xqjson). This means implementations would tend to be written in a lower-level language. eXist, for example, uses Jackson for fn:parse-json(). I see Jackson has a CSV extension too: https://github.com/FasterXML/jackson-dataformat-csv.

Any thoughts on the suitability of XQuery for the task?

Joe