But, "datasource" is just not an existing word, right? Of course if we put
spelling mistakes into class names, that will decrease the chance of name
clashes big time, but... :)

On Sat, Feb 29, 2020 at 6:06 PM Siegfried Goeschl <
[email protected]> wrote:

> Well, it clashes with "javax.activation.DataSource" - can do & no
> definite opinion about it :)
>
> > On 29.02.2020, at 18:03, Daniel Dekany <[email protected]> wrote:
> >
> > I believe that should be DataSource (with capital S), as it's two words.
> >
> > Also, it's the name of a too widely used and known JDBC interface. So if
> > anyone can tell a similarly descriptive alternative...
> >
> > On Sat, Feb 29, 2020 at 5:03 PM Siegfried Goeschl <
> > [email protected]> wrote:
> >
> >> Hi Daniel,
> >>
> >> I'm an enterprise developer - bad habits die hard :-)
> >>
> >> So I closed the following tickets and merged the branches
> >>
> >> 1) FREEMARKER-129 freemarker-generator: Merge "freemarker-cli" into
> >> "freemarker-generator"
> >> 2) FREEMARKER-134 freemarker-generator: Rename "Document" to
> >> "Datasource"
> >> 3) FREEMARKER-135 freemarker-generator-cli: Support user-supplied names
> >> for datasources
> >>
> >> Thanks in advance,
> >>
> >> Siegfried Goeschl
> >>
> >>
> >>> On 29.02.2020, at 12:19, Daniel Dekany <[email protected]>
> wrote:
> >>>
> >>> Yeah, and of course, you can merge that branch. You can even work on
> >>> the master directly after all.
> >>>
> >>> On Sat, Feb 29, 2020 at 12:17 PM Daniel Dekany <
> [email protected]>
> >>> wrote:
> >>>
> >>>> But I do recognize the cattle use case (several "faceless" files with a
> >>>> common format/schema). My idea is just to push that complexity onto the
> >>>> data source. The "data source" concept shields the rest of the
> >>>> application from the details of how the data is stored or retrieved. So
> >>>> a data source might load a bunch of log files from a directory and
> >>>> present them as a single big table, or as a list of tables, etc. So I
> >>>> want to deal with the cattle use case, but the question is which part of
> >>>> the architecture will deal with this complication; in other words, how
> >>>> you box things. My initial bet is to stuff that complication into the
> >>>> "data source" implementation(s), because data sources are inherently
> >>>> varied. Some return a table-like thing, some have multiple named tables
> >>>> (worksheets in Excel), some return a tree of nodes (XML), etc. So some
> >>>> might return a list of lists of log records, or just a single list of
> >>>> log records (put together from daily log files). That way cattle don't
> >>>> add to the conceptual complexity. Now, you might be aware of cases where
> >>>> the cattle concept must be more exposed than this, and then we can't box
> >>>> things like this. But this is what I tried to express.
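A minimal sketch of that boxing idea, in Python purely for illustration (freemarker-generator itself is Java). The function name `load_csv_sources_as_table` and the sample data are made up here; none of this is existing API. It only shows one data source hiding several same-schema files behind a single table.

```python
import csv
import io


def load_csv_sources_as_table(sources):
    """Merge several CSV texts that share a schema into one list of row dicts.

    `sources` maps a hypothetical file name to its CSV text; a real data
    source would read the files from a directory instead.
    """
    table = []
    for name in sorted(sources):  # stable order, e.g. by daily file name
        reader = csv.DictReader(io.StringIO(sources[name]))
        for row in reader:
            table.append(dict(row))
    return table


# Two "faceless" daily log files with the same columns (invented sample data).
daily_logs = {
    "access-2020-02-28.csv": "path,status\n/a,200\n/b,500\n",
    "access-2020-02-29.csv": "path,status\n/a,200\n",
}
table = load_csv_sources_as_table(daily_logs)
print(len(table))
```

The template would then only ever see `table`; whether it came from one file or many stays inside the data source.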
> >>>>
> >>>> Regarding "output generators", and how that applies to the command
> >>>> line: I think it's important that the common core between Maven and the
> >>>> command line is as fat as possible. Ideally, they are just two syntaxes
> >>>> for setting up the same thing. Mostly, at least. So, if you specify a
> >>>> template file to the CLI application in a way that causes it to process
> >>>> that template to generate a single output, then you have just defined an
> >>>> "output generator" (even if it wasn't explicitly called that on the
> >>>> command line). If you specify 3 csv files to the CLI application in a
> >>>> way that causes it to generate 3 output files, then you have just
> >>>> defined 3 "output generators" (there's at least one template specified
> >>>> there too, but that wasn't an "output generator" itself, it was just an
> >>>> attribute of the 3 output generators). If you specify 1 template and 3
> >>>> csv files in a way that yields 4 output files (1 for the template, 3 for
> >>>> the csv-s), then you have defined 4 output generators. If you have a
> >>>> data source that loads a list of 3 entities (say, 3 csv files, so it's a
> >>>> list of tables), and you have 2 templates, and you tell the CLI to
> >>>> execute each template for each item in said data source, then you have
> >>>> just defined 6 "output generators".
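The counting above can be sketched as follows. This is only an illustration in Python under my own assumptions: the `OutputGenerator` class, its fields, and the file names are hypothetical and do not exist in freemarker-generator.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class OutputGenerator:
    """One unit of work: a template plus the data item it is run for."""
    template: str
    data_item: str  # empty string when the template needs no data file


def per_item_generators(templates, items):
    """Run each template for each item: len(templates) * len(items) generators."""
    return [OutputGenerator(t, i) for t, i in product(templates, items)]


csv_files = ["x.csv", "y.csv", "z.csv"]

# A template given on its own defines exactly one output generator.
single = [OutputGenerator("report.ftl", "")]

# 1 standalone template plus 3 csv files sharing a transform template: 4.
mixed = single + per_item_generators(["transform.ftl"], csv_files)

# 2 templates executed for each of 3 items of a data source: 6.
generators = per_item_generators(["a.ftl", "b.ftl"], csv_files)
print(len(single), len(mixed), len(generators))
```

The point is that "how many outputs" is not a global mode; it falls out of how many generators the invocation defines.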
> >>>>
> >>>> On Fri, Feb 28, 2020 at 11:08 AM Siegfried Goeschl <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> Hi Daniel,
> >>>>>
> >>>>> That all depends on your mental model and work you do, expectations,
> >>>>> experience :-)
> >>>>>
> >>>>>
> >>>>> __Document Handling__
> >>>>>
> >>>>> *"But I think actually we have no good use case for list of documents
> >>>>> that's passed at once to a single template run, so, we can just ignore
> >>>>> that complication"*
> >>>>>
> >>>>> In my case that's not a complication but my daily business - I'm
> >>>>> regularly wading through access logs - yesterday probably a couple of
> >>>>> hundred access logs across two staging sites to help track down some
> >>>>> strange API gateway issues :-)
> >>>>>
> >>>>> My gut feeling is (borrowing from
> >>>>>
> >>>>>
> >>
> https://medium.com/@Joachim8675309/devops-concepts-pets-vs-cattle-2380b5aab313
> >>>>> )
> >>>>>
> >>>>> 1. You have a few lovely named documents / templates - `pets`
> >>>>> 2. You have tons of anonymous documents / templates to process -
> >>>>> `cattle`
> >>>>> 3. The "grey area" comes into play when mixing `pets & cattle`
> >>>>>
> >>>>> `freemarker-cli` was built with 2) in mind and I want to cover 1)
> >>>>> since it is equally important and common.
> >>>>>
> >>>>>
> >>>>> __Template And Document Processing Modes__
> >>>>>
> >>>>> IMHO it is important to answer the following question: "How many
> >>>>> outputs do you get when rendering 2 templates and 3 datasources? Two,
> >>>>> three or six?"
> >>>>>
> >>>>> Your answer is influenced by your mental model / experience
> >>>>>
> >>>>> * When wading through tons of CSV files, access logs, etc. the answer
> >>>>> is "2"
> >>>>> * When doing source code generation the obvious answer is "6"
> >>>>> * Can't imagine a use case which results in "3" but I'm pretty sure we
> >>>>> will encounter one
> >>>>> __Template and document mode probably shouldn't exist__
> >>>>>
> >>>>> That's hard for me to fully understand - I definitely lack your
> >>>>> insights & experience writing such tools :-)
> >>>>>
> >>>>> Defining the `Output Generator` is the underlying model for the Maven
> >>>>> plugin (and probably FMPP).
> >>>>>
> >>>>> I'm not sure if this applies to command lines, at least not in the way
> >>>>> I use them (or would like to use them)
> >>>>>
> >>>>>
> >>>>> Thanks in advance,
> >>>>>
> >>>>> Siegfried Goeschl
> >>>>>
> >>>>> PS: Can/shall I merge the PR to bring in `freemarker-cli`?
> >>>>>
> >>>>>
> >>>>> On 28 Feb 2020, at 9:14, Daniel Dekany wrote:
> >>>>>
> >>>>>> Yeah, "data source" is surely too popular a name, but for a reason.
> >>>>>> Anyone has other ideas?
> >>>>>>
> >>>>>> As for naming data sources and such: one thing I was wondering about
> >>>>>> back then is how to deal with a list of documents given to a template,
> >>>>>> versus exactly 1 document given to a template. But I think actually we
> >>>>>> have no good use case for list of documents that's passed at once to a
> >>>>>> single template run, so, we can just ignore that complication. A
> >>>>>> document has a name, and that's always just a single document, not a
> >>>>>> collection, as far as the template is concerned. (We can have multiple
> >>>>>> documents per run, but those normally yield separate output
> >>>>>> generators, so it's still only one document per template.) However, we
> >>>>>> can have data source types (document types in the old terminology)
> >>>>>> that collect together multiple data files. Then that complexity is
> >>>>>> encapsulated in the data source type, and doesn't complicate the
> >>>>>> overall architecture. That's another case where a data source is not
> >>>>>> just a file. Like maybe there's a data source type that loads all the
> >>>>>> CSV-s from a directory into a single big table (I had such a case), or
> >>>>>> even into a list of tables. Or, as I mentioned already, a data source
> >>>>>> is maybe an SQL query on a JDBC data source (and there we get the
> >>>>>> first term clash... JDBC also calls them data sources).
> >>>>>>
> >>>>>> Template and document mode probably shouldn't exist from the user's
> >>>>>> perspective either, at least not as a global option that must apply to
> >>>>>> everything in a run. Users could just give the files that define the
> >>>>>> "output generators"; some of them will be templates, some of them data
> >>>>>> files, in which case a template needs to be associated with them (and
> >>>>>> there can be a couple of ways of doing that). And then again, there
> >>>>>> are the cases where you want to create one output generator per entity
> >>>>>> from some data source.
> >>>>>>
> >>>>>> On Fri, Feb 28, 2020 at 8:23 AM Siegfried Goeschl <
> >>>>>> [email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Daniel,
> >>>>>>>
> >>>>>>> See my comments below - and thanks for your patience and input :-)
> >>>>>>>
> >>>>>>> *Renaming Document To DataSource*
> >>>>>>>
> >>>>>>> Yes, makes sense. I tried to avoid it since I'm using
> >>>>>>> javax.activation and its DataSource.
> >>>>>>>
> >>>>>>> *Template And Document Mode*
> >>>>>>>
> >>>>>>> Agreed - I think it is a valuable abstraction for the user but it is
> >>>>>>> not an implementation concept :-)
> >>>>>>>
> >>>>>>> *Document Without Symbolic Names*
> >>>>>>>
> >>>>>>> Also agreed and it is going to change but I have not made up my mind
> >>>>>>> yet about what exactly to implement.
> >>>>>>>
> >>>>>>> Thanks in advance,
> >>>>>>>
> >>>>>>> Siegfried Goeschl
> >>>>>>>
> >>>>>>> On 28 Feb 2020, at 1:05, Daniel Dekany wrote:
> >>>>>>>
> >>>>>>> A few quick thoughts on that:
> >>>>>>>
> >>>>>>> - We should replace the "document" term with something more
> >>>>>>> descriptive. It doesn't tell that it's some kind of input. Also, most
> >>>>>>> of these inputs aren't something that people typically call
> >>>>>>> documents. Like a csv file, or a database table, which is not even a
> >>>>>>> file (OK, we don't support such a thing at the moment). I think maybe
> >>>>>>> "data source" is a safe enough term. (It also rhymes with data model.)
> >>>>>>> - You have separate "template" and "document" "modes" that apply to a
> >>>>>>> whole run. I think such specialization won't be helpful. We could
> >>>>>>> just say, on the conceptual level at least, that we need a set of
> >>>>>>> "output generators". An output generator is an object (in the API)
> >>>>>>> that specifies a template, a data-model (where the data-model is
> >>>>>>> possibly populated with "documents"), and an output "sink" (a file
> >>>>>>> path, or stdout), and can generate the output itself. A practical way
> >>>>>>> of defining the output generators in a CLI application is via a bunch
> >>>>>>> of files, each defining an output generator. Some of those files may
> >>>>>>> be templates (which you can even detect from the file extension), or
> >>>>>>> data files that we currently call "documents". They could freely mix
> >>>>>>> inside the same run. I have also met a use case where you have a
> >>>>>>> single table (single "document"), and each record in it yields an
> >>>>>>> output file. That can also be described in some file format, or
> >>>>>>> really in any other way, like directly in a command line argument,
> >>>>>>> via API, etc.
> >>>>>>> - You have multiple documents without an associated symbolic name in
> >>>>>>> some examples. Templates then can't identify those in a maintainable
> >>>>>>> way. The actual file name is often not a good identifier: it can
> >>>>>>> change over time, and you might not even have good control over it,
> >>>>>>> like when you receive it as a parameter from somewhere else, or
> >>>>>>> someone moves/renames the files that you need to read. An index is
> >>>>>>> also not very good, but I have written about that earlier.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Feb 26, 2020 at 9:33 AM Siegfried Goeschl <
> >>>>>>> [email protected]> wrote:
> >>>>>>>
> >>>>>>> Hi folks,
> >>>>>>>
> >>>>>>> still wrapping my head around it, but I assembled some thoughts here -
> >>>>>>> https://gist.github.com/sgoeschl/b09b343a761b31a6c790d882167ff449
> >>>>>>>
> >>>>>>> Thanks in advance,
> >>>>>>>
> >>>>>>> Siegfried Goeschl
> >>>>>>>
> >>>>>>>
> >>>>>>> On 23 Feb 2020, at 23:14, Daniel Dekany <[email protected]>
> wrote:
> >>>>>>>
> >>>>>>> What you are describing is more like the angle that FMPP took
> >>>>>>> initially, where templates drive things: they generate the output for
> >>>>>>> themselves (even multiple output files if they wish). By default the
> >>>>>>> output file's name (and relative path) is deduced from the template
> >>>>>>> name. There was also a global data-model, built in a configuration
> >>>>>>> file (or equally, built via command line arguments, or both mixed),
> >>>>>>> from which templates get whatever data they are interested in. Take a
> >>>>>>> look at the figures here: http://fmpp.sourceforge.net/qtour.html.
> >>>>>>> Later, this concept was generalized a bit more, because you could add
> >>>>>>> XML files at the same place where you have the templates, and then
> >>>>>>> you could associate transform templates with the XML files (based on
> >>>>>>> path pattern and/or the XML document element). Now that's like what
> >>>>>>> freemarker-generator had initially (data files drive output, and the
> >>>>>>> template is there to transform it).
> >>>>>>>
> >>>>>>> So I think the generic mental model would look like this:
> >>>>>>>
> >>>>>>> 1. You've got files that drive the process; let's call them
> >>>>>>> *generator files* for now. Usually, each generator file yields an
> >>>>>>> output file (but maybe even multiple output files, as you might have
> >>>>>>> seen in the last figure). These generator files can be of many types,
> >>>>>>> like XML, JSON, XLSX (as in the original freemarker-generator), and
> >>>>>>> even templates (as is the norm in FMPP). If the file is not a
> >>>>>>> template, then you've got a set of transformer templates (-t CLI
> >>>>>>> option) in a separate directory, which can be associated with the
> >>>>>>> generator files based on name patterns, and even based on content
> >>>>>>> (schema usually). If the generator file is a template (so that's a
> >>>>>>> positional @Parameter CLI argument that happens to be an *.ftl, and
> >>>>>>> is not a template file specified after the "-t" option), then you
> >>>>>>> just Template.process(...) it, and it prints what the output will be.
> >>>>>>> 2. You also have a set of variables, the global data-model, that
> >>>>>>> contains commonly useful stuff, like what you now call parameters
> >>>>>>> (CLI -Pname=value), but also maybe data loaded from JSON, XML, etc.
> >>>>>>> Those data files aren't "generator files". Templates just use them if
> >>>>>>> they need them. An important thing here is to reuse the same
> >>>>>>> mechanism to read and parse those data files that was used in
> >>>>>>> templates when transforming generator files. So we need a common
> >>>>>>> format for specifying how to load data files. That's maybe just FTL
> >>>>>>> that #assigns to the variables, or maybe a more declarative format.
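Point 1 could be sketched roughly like this. It is a Python illustration only; `pick_template`, the rule list, and the template names are hypothetical, and content-based (schema) matching is left out.

```python
from fnmatch import fnmatchcase


def pick_template(generator_file, rules):
    """Return the template to run for a generator file.

    A generator file that is itself a template (*.ftl) is processed
    directly; otherwise the first rule whose name pattern matches wins.
    `rules` is an ordered list of (pattern, transformer_template) pairs.
    """
    if generator_file.endswith(".ftl"):
        return generator_file
    for pattern, template in rules:
        if fnmatchcase(generator_file, pattern):
            return template
    raise LookupError(f"no transformer template matches {generator_file!r}")


# Hypothetical association rules, in priority order.
rules = [
    ("*-access-log.csv", "access-report.ftl"),
    ("*.xml", "xml-to-html.ftl"),
]
print(pick_template("somewhere/foo-access-log.csv", rules))
print(pick_template("index.ftl", rules))
```

Failing loudly when nothing matches seems safer than silently skipping a generator file, but that is a design choice, not something decided in this thread.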
> >>>>>>>
> >>>>>>> What I have described in the original post here was a less generic
> >>>>>>> form of this, as I tried to stay true to the original approach. I
> >>>>>>> thought the proposal would be drastic enough as it is... :) There,
> >>>>>>> the "main" document is the "generator file" from point 1, the "-t"
> >>>>>>> template is the transform template for the "main" document, and the
> >>>>>>> other named documents ("users", "groups") are a poor man's shared
> >>>>>>> data-model from point 2 (together with -PName=value).
> >>>>>>>
> >>>>>>> There's a further, somewhat confusing thing to get right with the
> >>>>>>> list-of-documents (`DocumentList`, `NamedDocumentLists`) thing though.
> >>>>>>> In the model above, as per point 1, if you list multiple data files,
> >>>>>>> each will generate a separate output file. So, if you need to take in
> >>>>>>> a list of files to transform it to a single output file (or at least
> >>>>>>> with a single transform template execution), then you have to be
> >>>>>>> explicit about that, as that's not the default behavior anymore. But
> >>>>>>> it's still absolutely possible. Imagine that a "list of XLSX-es" is
> >>>>>>> itself like a file format. You need some CLI (and Maven config, etc.)
> >>>>>>> syntax to express that, but that shouldn't be a big deal.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sun, Feb 23, 2020 at 9:43 PM Siegfried Goeschl <
> >>>>>>> [email protected]> wrote:
> >>>>>>>
> >>>>>>> Hi Daniel,
> >>>>>>>
> >>>>>>> Good timing - I was looking at a similar problem from a different
> >>>>>>> angle yesterday (see below)
> >>>>>>>
> >>>>>>> Don't have enough time to answer your email in detail now - will do
> >>>>>>> that tomorrow evening
> >>>>>>>
> >>>>>>> Thanks in advance,
> >>>>>>>
> >>>>>>> Siegfried Goeschl
> >>>>>>>
> >>>>>>>
> >>>>>>> ===. START
> >>>>>>> # FreeMarker CLI Improvement
> >>>>>>> ## Support Of Multiple Template Files
> >>>>>>> Currently we support the following combinations
> >>>>>>>
> >>>>>>> * Single template and no data files
> >>>>>>> * Single template and one or more data files
> >>>>>>>
> >>>>>>> But we can not support the following use case which is quite typical
> >>>>>>> in the cloud
> >>>>>>>
> >>>>>>> __Convert multiple templates with a single data file, e.g. copying a
> >>>>>>> directory of configuration files using a JSON configuration file__
> >>>>>>>
> >>>>>>> ## Implementation notes
> >>>>>>> * When we copy a directory we can remove the `ftl` extension on the
> >>>>>>> fly
> >>>>>>> * We might need an `exclude` filter for the copy operation
> >>>>>>> * Initially resolve to a list of template files and process one after
> >>>>>>> another
> >>>>>>> * Need to calculate the output file location and extension
> >>>>>>> * We need to rename the existing command line parameters (see below)
> >>>>>>> * Do we need multiple include and exclude filters?
> >>>>>>> * Do we need file versus directory filters?
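A rough sketch of the first few notes (include/exclude filtering, dropping the `ftl` extension, and calculating the output location). It is written in Python only to illustrate the idea; `plan_outputs` and the sample paths are hypothetical and say nothing about how freemarker-cli will actually implement this.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def plan_outputs(template_paths, template_root, output_root,
                 include="*.ftl", exclude=None):
    """Map each selected template to its output path.

    The path relative to `template_root` is kept below `output_root`,
    and the trailing ".ftl" extension is removed on the fly.
    """
    plan = {}
    for path in template_paths:
        p = PurePosixPath(path)
        if not fnmatchcase(p.name, include):
            continue  # not selected by the include pattern
        if exclude and fnmatchcase(p.name, exclude):
            continue  # explicitly excluded
        relative = p.relative_to(template_root)
        plan[path] = str(PurePosixPath(output_root) / relative.with_suffix(""))
    return plan


plan = plan_outputs(
    ["ext/config/nginx.conf.ftl",
     "ext/config/app/env.properties.ftl",
     "ext/config/README.md"],
    template_root="ext/config",
    output_root="/config",
)
for source, target in plan.items():
    print(source, "->", target)
```

Here the filters match on the file name only; whether directory-level filters are also needed is exactly the open question in the last bullet.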
> >>>>>>>
> >>>>>>> ### Command Line Options
> >>>>>>> ```
> >>>>>>> --input-encoding : Encoding of the documents
> >>>>>>> --output-encoding : Encoding of the rendered template
> >>>>>>> --template-encoding : Encoding of the template
> >>>>>>> --output : Output file or directory
> >>>>>>> --include-document : Include pattern for documents
> >>>>>>> --exclude-document : Exclude pattern for documents
> >>>>>>> --include-template : Include pattern for templates
> >>>>>>> --exclude-template : Exclude pattern for templates
> >>>>>>> ```
> >>>>>>>
> >>>>>>> ### Command Line Examples
> >>>>>>> ```text
> >>>>>>> # Copy all FTL templates found in "ext/config" to the "/config"
> >>>>>>> # directory using the data from "config.json"
> >>>>>>>
> >>>>>>> freemarker-cli -t ./ext/config --include-template *.ftl -o /config config.json
> >>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config config.json
> >>>>>>>
> >>>>>>> # Basically the same using a named document "configuration"
> >>>>>>> # It might make sense to expose "conf" directly in the FreeMarker data model
> >>>>>>> # It might make sense to allow URIs for loading documents
> >>>>>>>
> >>>>>>> freemarker-cli -t ./ext/config/*.ftl -o /config -d configuration=config.json
> >>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config --document configuration=config.json
> >>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config --document configuration=file:///config.json
> >>>>>>>
> >>>>>>> # Basically the same using an environment variable as named document
> >>>>>>>
> >>>>>>> freemarker-cli -t ./ext/config --include-template *.ftl -o /config -d configuration=env:///CONFIGURATION
> >>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config --document configuration=env:///CONFIGURATION
> >>>>>>> ```
> >>>>>>> === END
> >>>>>>>
> >>>>>>> On 23.02.2020, at 16:37, Daniel Dekany <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Input documents are a fundamental concept in freemarker-generator, so
> >>>>>>> we should think about that more, and probably refine/rework how it's
> >>>>>>> done.
> >>>>>>>
> >>>>>>> Currently it works like this, with the CLI at least.
> >>>>>>>
> >>>>>>> freemarker-cli
> >>>>>>> -t access-report.ftl
> >>>>>>> somewhere/foo-access-log.csv
> >>>>>>>
> >>>>>>> Then in access-report.ftl you have to do something like this:
> >>>>>>>
> >>>>>>> <#assign doc = Documents.get(0)>
> >>>>>>> ... process doc here
> >>>>>>>
> >>>>>>> (The more idiomatic Documents[0] won't work. Actually, that led to a
> >>>>>>> funny chain of coincidences: it returned the string "D", then
> >>>>>>> CSVTool.parse(...) happily parsed that to a table with the single
> >>>>>>> column "D" and 0 rows, and as there were 0 rows, the template didn't
> >>>>>>> run into an error because row.myExpectedColumn refers to a missing
> >>>>>>> column either, so the process finished with success. (: Pretty
> >>>>>>> unlucky for sure. The root was unintentionally breaking a FreeMarker
> >>>>>>> idiom though; eventually we will have to work on those too, but
> >>>>>>> that's a different topic.)
> >>>>>>>
> >>>>>>> However, actually multiple input documents can be passed in:
> >>>>>>>
> >>>>>>> freemarker-cli
> >>>>>>> -t access-report.ftl
> >>>>>>> somewhere/foo-access-log.csv
> >>>>>>> somewhere/bar-access-log.csv
> >>>>>>>
> >>>>>>> The above template will still work, though then you ignore all but
> >>>>>>> the first document. So if you expect any number of input documents,
> >>>>>>> you will probably have to do this:
> >>>>>>>
> >>>>>>> <#list Documents.list as doc>
> >>>>>>> ... process doc here
> >>>>>>> </#list>
> >>>>>>>
> >>>>>>> (The more idiomatic <#list Documents as doc> won't work; but again,
> >>>>>>> those we will work out in a different thread.)
> >>>>>>>
> >>>>>>>
> >>>>>>> So, what would be better, in my opinion? I start out from what I
> >>>>>>> think are the common use cases, in decreasing order of frequency. The
> >>>>>>> goal is to make those less error prone for the users, and simpler to
> >>>>>>> express.
> >>>>>>>
> >>>>>>> USE CASE 1
> >>>>>>>
> >>>>>>> You have exactly 1 input document, which is therefore simply "the"
> >>>>>>> document in the mind of the user. This is probably the typical use
> >>>>>>> case, or at least the use case users typically start out from when
> >>>>>>> starting the work.
> >>>>>>>
> >>>>>>> freemarker-cli
> >>>>>>> -t access-report.ftl
> >>>>>>> somewhere/foo-access-log.csv
> >>>>>>>
> >>>>>>> Then `Documents.get(0)` is not very fitting. Most importantly it's
> >>>>>>> error prone, because if the user passed in more than 1 document (it
> >>>>>>> can even happen totally accidentally, like if the user was lazy and
> >>>>>>> used a wildcard that the shell expanded), the template will silently
> >>>>>>> ignore the rest of the documents, and the single document processed
> >>>>>>> will be practically picked at random. The user might not notice that
> >>>>>>> and submit a bad report or such.
> >>>>>>>
> >>>>>>> I think that in this use case the document should simply be referred
> >>>>>>> to as `Document` in the template. When you have multiple documents
> >>>>>>> there, referring to `Document` should be an error, saying that the
> >>>>>>> template was made to process a single document only.
> >>>>>>>
> >>>>>>>
> >>>>>>> USE CASE 2
> >>>>>>>
> >>>>>>> You have multiple input documents, but each has a different role
> >>>>>>> (different schema, maybe different file type). Like, you pass in
> >>>>>>> users.csv and groups.csv. Each has a different schema, and so you
> >>>>>>> want to access them differently, but in the same template.
> >>>>>>>
> >>>>>>> freemarker-cli
> >>>>>>> [...]
> >>>>>>> --named-document users somewhere/foo-users.csv
> >>>>>>> --named-document groups somewhere/foo-groups.csv
> >>>>>>>
> >>>>>>> Then in the template you could refer to them as
> >>>>>>> `NamedDocuments.users` and `NamedDocuments.groups`.
> >>>>>>>
> >>>>>>> Use Cases 1 and 2 can be unified into a coherent concept, where
> >>>>>>> `Document` is just a shorthand for `NamedDocuments.main`. It's called
> >>>>>>> "main" because that's "the" document the template is about, but then
> >>>>>>> you had to add some helper documents, with symbolic names
> >>>>>>> representing their role.
> >>>>>>>
> >>>>>>> freemarker-cli
> >>>>>>> -t access-report.ftl
> >>>>>>> --document-name=main somewhere/foo-access-log.csv
> >>>>>>> --document-name=users somewhere/foo-users.csv
> >>>>>>> --document-name=groups somewhere/foo-groups.csv
> >>>>>>>
> >>>>>>> Here, `Document` still works in the template, and it refers to
> >>>>>>> `somewhere/foo-access-log.csv`. (While omitting --document-name=main
> >>>>>>> above would be cleaner, I couldn't figure out how to do that with
> >>>>>>> Picocli. Anyway, for now the point is the concept, which is not
> >>>>>>> specific to the CLI.)
> >>>>>>>
> >>>>>>> USE CASE 3
> >>>>>>>
> >>>>>>> Here you have several of the same kind of documents. That has a more
> >>>>>>> generic sub-use-case, when you have explicitly named documents (like
> >>>>>>> "users" above), and for some you expect multiple input files.
> >>>>>>>
> >>>>>>> freemarker-cli
> >>>>>>> -t access-report.ftl
> >>>>>>> --document-name=main somewhere/foo-access-log.csv
> >>>>>>> somewhere/bar-access-log.csv
> >>>>>>> --document-name=users somewhere/foo-users.csv
> >>>>>>> somewhere/bar-users.csv
> >>>>>>> --document-name=groups somewhere/global-groups.csv
> >>>>>>>
> >>>>>>> The template must be written with this use case in mind, as now it
> >>>>>>> has to #list some of the documents. (I think in practice you hardly
> >>>>>>> ever want to get a document by hard coded index. Either you don't
> >>>>>>> know how many documents you have, so you can't use hard coded
> >>>>>>> indexes, or you do, and each index has a specific meaning, but then
> >>>>>>> you should name the documents instead, as using indexes is error
> >>>>>>> prone, and hard to read.)
> >>>>>>> Accessing that list of documents in the template could maybe be done
> >>>>>>> like this:
> >>>>>>> - For the "main" documents: `DocumentList`
> >>>>>>> - For explicitly named documents, like "users":
> >>>>>>> `NamedDocumentLists.users`
> >>>>>>>
> >>>>>>> SUMMING UP
> >>>>>>>
> >>>>>>> To unify all 3 use cases into a coherent concept:
> >>>>>>> - `NamedDocumentLists.<name>` is the most generic form, and while you
> >>>>>>> can achieve everything with it, using it requires your template to
> >>>>>>> handle the most generic case too. So, I think it would be rarely used.
> >>>>>>> - `DocumentList` is just a shorthand for `NamedDocumentLists.main`.
> >>>>>>> It's used if you only have one kind of document (single format and
> >>>>>>> schema), but potentially multiple of them.
> >>>>>>> - `NamedDocuments.<name>` expresses that you expect exactly 1
> >>>>>>> document of the given name.
> >>>>>>> - `Document` is just a shorthand for `NamedDocuments.main`. This is
> >>>>>>> for the most natural/frequent use case.
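The four accessors could be modelled roughly as below. This is a Python sketch under my own assumptions, not the proposed FreeMarker data model: the `DocumentStore` class and its method names are hypothetical stand-ins for `NamedDocumentLists`, `DocumentList`, `NamedDocuments` and `Document`.

```python
class DocumentStore:
    """Holds documents grouped by symbolic name; "main" is the default name."""

    def __init__(self, named_lists):
        self._lists = {name: list(docs) for name, docs in named_lists.items()}

    def named_document_list(self, name):
        """Most generic form: zero or more documents under a name."""
        return list(self._lists.get(name, []))

    def named_document(self, name):
        """Expresses the intent that exactly 1 document has this name."""
        docs = self._lists.get(name, [])
        if len(docs) != 1:
            raise ValueError(
                f"expected exactly 1 document named {name!r}, got {len(docs)}")
        return docs[0]

    @property
    def document_list(self):
        """Shorthand for named_document_list("main")."""
        return self.named_document_list("main")

    @property
    def document(self):
        """Shorthand for named_document("main")."""
        return self.named_document("main")


store = DocumentStore({
    "main": ["foo-access-log.csv", "bar-access-log.csv"],
    "groups": ["global-groups.csv"],
})
print(store.named_document("groups"))
print(store.document_list)
```

With two "main" documents, reading `store.document` raises instead of silently picking one, which is the error-catching behaviour argued for in Use Case 1.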
> >>>>>>>
> >>>>>>> That's 4 possible ways of accessing your documents, which is a
> >>>>>>> trade-off for the sake of these:
> >>>>>>> - Catching CLI (or Maven, etc.) input where the template output will
> >>>>>>> likely be wrong. That's only possible if the user can communicate
> >>>>>>> their intent in the template.
> >>>>>>> - Users don't need to deal with concepts that are irrelevant in their
> >>>>>>> concrete use case. Just start with the trivial, `Document`, and later,
> >>>>>>> if the need arises, generalize to named documents, document lists, or
> >>>>>>> both.
> >>>>>>>
> >>>>>>> What do you guys think?
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>> Daniel Dekany
> >>>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>> Daniel Dekany
> >>
> >>
> >
> > --
> > Best regards,
> > Daniel Dekany
>
>

-- 
Best regards,
Daniel Dekany
