Hi Daniel,

Please see my comments below.

Thanks in advance,

Siegfried Goeschl

> On 05.03.2020, at 22:36, Daniel Dekany <[email protected]> wrote:
>
>> Regarding the "global mode" and "output generators files" - I'm sorry, but
>> I'm not getting it
>
> I'm not getting what doesn't go through. Can you explain? The CLI suggested
> that you got "global mode" (a single --mode switch per run).

[SG] I think the confusion stems from different levels of abstraction (see next chapter): while I try to get the command line invocation right, you seem to think along a more technical implementation level. Please have a look at https://gist.github.com/sgoeschl/b09b343a761b31a6c790d882167ff449 - I think such a "mode" might be needed but is not strictly relevant in the beginning. But it is an important concept for the implementation ...

>> Do you think of defining explicit "output generator file" containing
>> `datasources`, `templates` and `outputs` - yes that could be done but
>> does not feel like an interactive command line tool any longer
>
> I think what the CLI exposes and how should be a secondary detail at this
> phase, as the CLI is (or should be) just a front end that wraps the common
> core (generator.base). The CLI, the Maven task, Gradle task, etc. should
> probably just be thin wrappers around the common core. Do we agree on that?
> So, these concepts are "core" concepts, and probably govern the API of
> generator.base. That was my intent here, to hammer out these core
> concepts.
>
> Also the "output generator file" is usually just a data file, or just a
> template. It's just the file that causes some output to be generated. So,
> usually it doesn't *explicitly* contain all that information (though you
> might as well introduce a file type that does). But it still defines an
> output generator, because you will have a template, a data-model, and an
> output file name.

[SG] If you think about the internal representation I fully agree - I personally see something like a list of "Transformation" objects being executed, each containing the template, datasources and output.

>> I think you are leaning towards a 1.0 release while I favour 0.x.y to
>> have room to make mistakes / experiments
>
> The version number doesn't tell much to me, so what's your intent/strategy
> with these 0.x.y releases you plan to do? Like, if you release 0.1.0, then
> will you feel inconvenient to change things *radically* after that? That
> can be a problem, if the goal is iterating without bounds. On the other
> hand, if you don't feel inconvenient about that at all, I don't really see
> why a user would use it. But, if it's clearly indicated that everything can
> change, and you think it's useful to release that way, I don't want to be
> in your way.

[SG] What represents backward compatibility of CLI or Maven plugin? What can change?

* I don't want to change the command line parameters (CLI) and generator file layout (Maven) in a breaking way
* I want to avoid releasing things like "name:group" versus "group:name" when we have not settled on a decision
* What I still want to do in the near future is to change the public implementation classes since I do not assume that someone is using them for the time being

>> perfect is the enemy of good
>
> I just think the overall concept/architecture should be iterated out first.
> Polish, and adding all kind of bells, even fixing bugs, is a different matter.

[SG] +1

> On Thu, Mar 5, 2020 at 9:36 PM Siegfried Goeschl <[email protected]> wrote:
>
>> Hi Daniel,
>>
>> The introduction of named `Datasource` allows to simplify / streamline a
>> few things
>>
>> * I have a meaningful user-supplied name
>> * I can pass additional configuration information as already implemented
>>   with `charset` and `contenttype` and this would also allow configuring a
>>   `CSV Datasource`, e.g.
>>   `users=./data/users.csv#format=default&header=true&delimeter=TAB`, which
>>   can be readily parsed
>> * Currently the names of datasources are taken from their relative
>>   file name - might make sense to drop that but I need to contemplate :-)
>>
>> Regarding the "global mode" and "output generators files" - I'm sorry,
>> but I'm not getting it
>>
>> * I refined the
>>   https://gist.github.com/sgoeschl/b09b343a761b31a6c790d882167ff449 to
>>   make my points more clearly
>> * Do you think of defining explicit "output generator file" containing
>>   `datasources`, `templates` and `outputs` - yes that could be done but
>>   does not feel like an interactive command line tool any longer
>>
>> Regarding "more idiomatic FTL usage"
>>
>> * Yes, I need to dive into custom template models or whatever it is
>>   called :-)
>>
>> Something we need to iron out is a release policy
>>
>> * Currently we have little agreement how the CLI should look like or
>>   behave
>> * I think you are leaning towards a 1.0 release while I favour 0.x.y to
>>   have room to make mistakes / experiments
>> * I personally see the possibility that we don't get a release out -
>>   "perfect is the enemy of good"
>>
>> How would you like to handle the problem - can we agree on a minimal
>> feature set worthy of a release?
>>
>> Thanks in advance,
>>
>> Siegfried Goeschl
>>
>> On 1 Mar 2020, at 11:33, Daniel Dekany wrote:
>>
>>>> Actually not recommended but we have named data sources for less than
>>>> 24 hours
>>>
>>> Sorry, not sure what that means. Anyway, my "vote" is let's not give
>>> automatic names if that's not recommended to utilize. I mean, in case we
>>> happen to agree on that, why leave it there. Especially if automatically
>>> chosen names can clash with explicitly given ones, that would be a
>>> trouble. (I'm not sure right now if they can... the path we use as the
>>> name can be relative? Then it realistically can.)
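
The named datasource syntax discussed earlier in this thread, `name[:group]=location#key=value&...`, could be taken apart along the following lines. This is only a minimal sketch and not the parser used by freemarker-generator: the class `DatasourceSpec` and its field names are hypothetical, a missing name is rejected on the assumption that every datasource is named, and no URL decoding is attempted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical value object for one "name[:group]=location#key=value&..." argument. */
final class DatasourceSpec {
    final String name;
    final String group;                 // empty when no group was given
    final String location;              // file path or URI, without the fragment
    final Map<String, String> options;  // fragment parameters in declaration order

    private DatasourceSpec(String name, String group, String location, Map<String, String> options) {
        this.name = name;
        this.group = group;
        this.location = location;
        this.options = options;
    }

    static DatasourceSpec parse(String arg) {
        // Split at the first '=' only, because the fragment contains more '=' characters.
        int eq = arg.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("Expected name=location but got: " + arg);
        }
        String key = arg.substring(0, eq);
        String rest = arg.substring(eq + 1);

        // The key is "name" or "name:group" (the order used in the thread's examples).
        int colon = key.indexOf(':');
        String name = colon < 0 ? key : key.substring(0, colon);
        String group = colon < 0 ? "" : key.substring(colon + 1);
        if (name.isEmpty()) {
            throw new IllegalArgumentException("A datasource needs a name: " + arg);
        }

        // Everything after '#' is treated as "key=value" pairs separated by '&'.
        int hash = rest.indexOf('#');
        String location = hash < 0 ? rest : rest.substring(0, hash);
        Map<String, String> options = new LinkedHashMap<>();
        if (hash >= 0) {
            for (String pair : rest.substring(hash + 1).split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int p = pair.indexOf('=');
                if (p < 0) {
                    options.put(pair, "");
                } else {
                    options.put(pair.substring(0, p), pair.substring(p + 1));
                }
            }
        }
        return new DatasourceSpec(name, group, location, options);
    }
}
```

Splitting only at the first `=` keeps the option pairs intact, but it also means that an unnamed location which itself contains `=` would be misread, which is one more argument for settling the naming rules before a release.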
>>>
>>>> This is a command line tool where we have little idea what the user
>>>> will do or abuse
>>>
>>> No matter how much/little we know, we firmly put our bets by releasing
>>> something. So if some feature is certainly not right, that's enough to
>>> not have it, I think.
>>>
>>>> How does a "data loader" know that it is responsible to load a file
>>>>
>>>> What should a "CSV data loader" do - parse it into a list of
>>>> records or stream one by one?
>>>
>>> I think I was misunderstood here. It's not about some kind of auto-magic.
>>> It's about where do you specify what to load and how, and in what format
>>> do you specify that. Of course, you must specify the data source
>>> (basically an URI for now as I saw), the rough format (CSV), and the
>>> format options (separator character, etc.), and other
>>> freemarker-generator loading options (like which CSV columns are numbers,
>>> which are dates, with what format, what counts as null, etc.).
>>>
>>> What was confusing in what I said much earlier is probably that you don't
>>> need a global "--mode". That just means that you can have multiple
>>> "modes" in the same run, not that you need some big auto-magic. And that
>>> they aren't really "modes" then... I think it's just natural that you can
>>> have different kinds of "output generator" files in the same run. Why
>>> force the assumption that you don't, especially considering that they
>>> might want to access common data (which you don't want to load again and
>>> again, for each run of the different --mode-s you need). Of course, as
>>> you might select files with wildcards (or by specifying a whole
>>> directory, or with some Maven matcher), you just can't directly associate
>>> the data loader options to the individual data sources. Instead you can
>>> say elsewhere that *.csv inside this explicit "group", or with this file
>>> name pattern, is to be loaded like this. That's what you might have
>>> perceived as auto-magic. It's just mass-producing data loaders for
>>> "cattle" files.
>>>
>>>> How to handle the case if you have multiple potential data loaders for
>>>> a single file?
>>>
>>> As per above, that's just two data loaders referring to the same data
>>> source, so, nothing special.
>>>
>>> As of the current state of things, this is how I'm supposed to load a
>>> CSV, in the template itself (if I'm not outdated/mistaken):
>>>
>>>   <#assign cvsFormat = CSVTool.formats.DEFAULT.withHeader()>
>>>   <#assign foos = CSVTool.parse(Datasources.get("foos"), cvsFormat).records>
>>>   <#assign bars = CSVTool.parse(Datasources.get("barb"), cvsFormat).records>
>>>
>>> It will be worth exploring how to make these look more "idiomatic" FTL
>>> (given this is an "official" FM product now, I think, we should show how
>>> it's done), and nicer in general. Point for now is, that's basically two
>>> data-loaders interwoven with the template there. Because they are
>>> interwoven like that, you can't reuse what they loaded for another
>>> template execution.
>>>
>>>> That comes down to personal preferences, e.g. chown uses
>>>> "owner[:group] "
>>>
>>> Yeah, but XML namespaces, Java, C, etc. all use
>>> <parent><operator><child>, so, I think, that clicks for more of our
>>> potential users. So let's bet on what clicks for more users.
>>>
>>> Besides, I challenged the very idea that we need both groups and
>>> names. :) Saying that it's simpler and less opinionated (more flexible)
>>> to have just multiple names (like tags). What's the end of that?
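
The "multiple names, like tags" alternative raised in this message can be pictured as a registry in which a data source carries any number of names and the caller states whether it expects exactly one hit or a list. The sketch below is an illustration only: `TaggedDatasources`, `getOne` and `getAll` are hypothetical names, and data sources are represented by plain location strings rather than by any real freemarker-generator type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical registry: each data source location carries a set of names ("tags"). */
final class TaggedDatasources {
    private final Map<String, Set<String>> namesByLocation = new LinkedHashMap<>();

    /** Registers a location under one or more names; repeated calls add further names. */
    void add(String location, String... names) {
        namesByLocation
                .computeIfAbsent(location, k -> new LinkedHashSet<>())
                .addAll(Arrays.asList(names));
    }

    /** Returns every location tagged with the name, in registration order. */
    List<String> getAll(String name) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : namesByLocation.entrySet()) {
            if (e.getValue().contains(name)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Returns the single location tagged with the name, failing on zero or several hits. */
    String getOne(String name) {
        List<String> hits = getAll(name);
        if (hits.size() != 1) {
            throw new IllegalStateException(
                    "Expected exactly one data source named '" + name + "' but found " + hits.size());
        }
        return hits.get(0);
    }
}
```

With this shape there is no separate "group" concept: `getAll("logs")` plays the role of a group lookup and `getOne("bar")` the role of a name lookup, and the template decides which of the two it means.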
>>>
>>> On Sun, Mar 1, 2020 at 9:47 AM Siegfried Goeschl <[email protected]> wrote:
>>>
>>>> Hi Daniel,
>>>>
>>>> Please see my comments below
>>>>
>>>> Thanks in advance,
>>>>
>>>> Siegfried Goeschl
>>>>
>>>>> On 29.02.2020, at 21:02, Daniel Dekany <[email protected]> wrote:
>>>>>
>>>>>> I try to provide a useful name even when the content is coming from
>>>>>> an URL
>>>>>
>>>>> When is it recommended to rely on that though? Because utilizing that
>>>>> means that renaming a data source file can break the process, even if
>>>>> you call freemarker-cli with the up to date file name. And whether that
>>>>> happens depends on what you (or an other random colleague!) have dug
>>>>> inside the templates. So I guess we better just don't support this.
>>>>> Less code and less things to document too.
>>>>
>>>> Actually not recommended but we have named data sources for less than
>>>> 24 hours
>>>>
>>>>>> I think we have a different understanding what a "Document" /
>>>>>> "Datasource / DataSource" should do
>>>>>
>>>>> Thing is, eventually (most certainly pre-1.0, as it influences
>>>>> architecture), certain needs will have to be addressed, somehow. Then
>>>>> we will see what "things" we really need. For now I thought we need
>>>>> "things" that are much more than paths, and encapsulate the "how to
>>>>> load the data" aspect. I called them data sources, but maybe we should
>>>>> call them "data loaders" to free up data sources for the more
>>>>> primitive thing. Some needs/doubts to address, *later*: Is it really
>>>>> the best approach for users to load/parse data sources
>>>>> programmatically (that code is written in FTL, inside the templates)?
>>>>> Also, is the template the right place for doing that, because, when
>>>>> multiple templates (or just multiple template *runs* of the same
>>>>> template, each generating a different output file) need common data,
>>>>> they shouldn't load it again and again. Also, different topic, can we
>>>>> handle the case "transparently" enough when the data is not coming
>>>>> from a file?
>>>>
>>>> This is a command line tool where we have little idea what the user
>>>> will do or abuse
>>>>
>>>> * How does a "data loader" know that it is responsible to load a file
>>>> * What should a "CSV data loader" do - parse it into a list of
>>>>   records or stream one by one?
>>>> * How to handle the case if you have multiple potential data loaders
>>>>   for a single file?
>>>>
>>>> I'm leaning towards building blocks where the user controls the work
>>>> to be done even if it requires one to two extra lines of FTL code
>>>>
>>>>>> The joy of programming - I did not intend to use "name:group"
>>>>>> together with wildcards :-)
>>>>>
>>>>> For a CLI tool, I guess we agree that it should work. So maybe, like
>>>>> this (here logs and foos meant to be "groups"):
>>>>>   --data-source logs file1.log file2.log fileN.log --data-source foos
>>>>>   foo1.csv foo2.csv fooN.csv --data-source bar bar.xlsx
>>>>>
>>>>> It so happens that here you don't really have a good control about
>>>>> the number of files associated to the name, so, maybe yet another
>>>>> reason to not differentiate names and groups.
>>>>>
>>>>>> I disagree here - I think using a name would be used more often. I
>>>>>> added the "group" as an afterthought since some grouping could be
>>>>>> useful
>>>>>
>>>>> We do agree in that. What I said is that the *syntax* should be so
>>>>> that the group comes first. It's still optional.
>>>>> Like this:
>>>>>   --data-source group:name /somewhere
>>>>>   --data-source name /somewhere
>>>>
>>>> That comes down to personal preferences, e.g. chown uses
>>>> "owner[:group] "
>>>>
>>>>> On Sat, Feb 29, 2020 at 7:34 PM Siegfried Goeschl <[email protected]> wrote:
>>>>>
>>>>>> Hi Daniel,
>>>>>>
>>>>>> See my comments below
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Siegfried Goeschl
>>>>>>
>>>>>>> On 29.02.2020, at 19:08, Daniel Dekany <[email protected]> wrote:
>>>>>>>
>>>>>>> FREEMARKER-135 freemarker-generator-cli: Support user-supplied names
>>>>>>> for datasources
>>>>>>>
>>>>>>> So, I can do this to have both a name and a group associated to a
>>>>>>> data source:
>>>>>>>   --datasource someName:someGroup=somewhere/something
>>>>>>
>>>>>> Correct
>>>>>>
>>>>>>> Or if I only want a name, but not a group (or an "" group actually -
>>>>>>> bug?), then:
>>>>>>>   --datasource someName=somewhere/something
>>>>>>
>>>>>> Correct
>>>>>>
>>>>>>> Or if only a group but not a name (or a "" name actually) then:
>>>>>>>   --datasource :someGroup=somewhere/something
>>>>>>
>>>>>> Mhmm, that would be unintended functionality from my side - current
>>>>>> approach is that every "Document" / "Datasource / DataSource" is named
>>>>>>
>>>>>>> A name must identify exactly 1 data source, while a group identifies
>>>>>>> a list of data sources.
>>>>>>
>>>>>> No, every "Document" / "Datasource / DataSource" has a name currently
>>>>>> but uniqueness is not enforced. Only if you want to get a "Document" /
>>>>>> "Datasource / DataSource" by its exact name do I check for exactly one
>>>>>> search hit and throw an exception otherwise.
>>>>>> I try to provide a useful name even when the content is coming from
>>>>>> an URL or STDIN (and I will probably add environment variables as
>>>>>> "Document" / "Datasource / DataSource", e.g. configuration in the
>>>>>> cloud as JSON content passed as environment variable)
>>>>>>
>>>>>>> Does this idea, that a data source can be part of a group and is
>>>>>>> then also possibly identifiable by a name, come from a use case? I
>>>>>>> mean, it's possibly important somewhere, but if so, then it's
>>>>>>> strange that you can put something into only a single group. If we
>>>>>>> need this kind of thing, then perhaps you should be just allowed to
>>>>>>> associate the data source with a list of names (kind of like
>>>>>>> tagging), and then when the template wants to get something by
>>>>>>> name, it will tell there if it expects exactly one or a list of
>>>>>>> data sources. Then you don't need to introduce two terms in the
>>>>>>> documentation either (names and groups). Again, if we want this at
>>>>>>> all, instead of just going with a data source that itself gives a
>>>>>>> list. (And if not, how will we handle a data source that loads from
>>>>>>> a non-file source?)
>>>>>>
>>>>>> I actually thought of implementing tagging but considered a "group"
>>>>>> sufficient.
>>>>>>
>>>>>> * If you don't define anything everything goes into the "default" group
>>>>>> * For individual documents you can define a name and an optional group
>>>>>>
>>>>>> I think we have a different understanding what a "Document" /
>>>>>> "Datasource / DataSource" should do
>>>>>>
>>>>>> * It is dumb
>>>>>> * It is lazy since data is only loaded on demand
>>>>>> * There is no automagic like "oh, this is a JSON file, so let's go to
>>>>>>   the JSON tool and create a map readily accessible in the data model"
>>>>>>
>>>>>>> Note that the current command line syntax doesn't work well with
>>>>>>> shell wildcard expansion. Like this:
>>>>>>>   --datasource :someGroup=logs/*.log
>>>>>>> will try to expand ":someGroup=logs/*.log", and because it finds
>>>>>>> nothing (and because the rules of sh and the like are a mess), you
>>>>>>> will get the parameter value as is, without * expanded.
>>>>>>
>>>>>> The joy of programming - I did not intend to use "name:group"
>>>>>> together with wildcards :-)
>>>>>>
>>>>>>> Also, I think the syntax with colon should be flipped, because in
>>>>>>> other places foo:bar usually means that foo is the bigger unit (the
>>>>>>> container), and bar is the smaller unit (the child).
>>>>>>
>>>>>> I disagree here - I think using a name would be used more often.
>>>>>> I added the "group" as an afterthought since some grouping could be
>>>>>> useful
>>>>>>
>>>>>>> On Sat, Feb 29, 2020 at 5:03 PM Siegfried Goeschl <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>> I'm an enterprise developer - bad habits die hard :-)
>>>>>>>>
>>>>>>>> So I closed the following tickets and merged the branches
>>>>>>>>
>>>>>>>> 1) FREEMARKER-129 freemarker-generator: Merge "freemarker-cli" into
>>>>>>>>    "freemarker-generator"
>>>>>>>> 2) FREEMARKER-134 freemarker-generator: Rename "Document" to
>>>>>>>>    "Datasource"
>>>>>>>> 3) FREEMARKER-135 freemarker-generator-cli: Support user-supplied
>>>>>>>>    names for datasources
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> Siegfried Goeschl
>>>>>>>>
>>>>>>>>> On 29.02.2020, at 12:19, Daniel Dekany <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Yeah, and of course, you can merge that branch. You can even work
>>>>>>>>> on the master directly after all.
>>>>>>>>>
>>>>>>>>> On Sat, Feb 29, 2020 at 12:17 PM Daniel Dekany <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> But, I do recognize the cattle use case (several "faceless" files
>>>>>>>>>> with common format/schema). Only, my idea is to push that
>>>>>>>>>> complexity on the data source. The "data source" concept shields
>>>>>>>>>> the rest of the application from the details of how the data is
>>>>>>>>>> stored or retrieved. So, a data source might load a bunch of log
>>>>>>>>>> files from a directory, and present them as a single big table, or
>>>>>>>>>> like a list of tables, etc. So I want to deal with the cattle use
>>>>>>>>>> case, but the question is what part of the architecture will deal
>>>>>>>>>> with this complication, in other words, how do you box things.
>>>>>>>>>> Why my initial bet is to stuff that complication into the "data
>>>>>>>>>> source" implementation(s) is that data sources are inherently
>>>>>>>>>> varied. Some return a table-like thing, some have multiple named
>>>>>>>>>> tables (worksheets in Excel), some return a tree of nodes (XML),
>>>>>>>>>> etc. So then, some might return a list-of-list-of log records, or
>>>>>>>>>> just a single list of log-records (put together from daily log
>>>>>>>>>> files). That way cattle don't add to conceptual complexity. Now,
>>>>>>>>>> you might be aware of cases where the cattle concept must be more
>>>>>>>>>> exposed than this, and then we can't box things like this. But
>>>>>>>>>> this is what I tried to express.
>>>>>>>>>>
>>>>>>>>>> Regarding "output generators", and how that applies on the command
>>>>>>>>>> line. I think it's important that the common core between Maven
>>>>>>>>>> and command-line is as fat as possible. Ideally, they are just two
>>>>>>>>>> syntaxes to set up the same thing. Mostly at least. So, if you
>>>>>>>>>> specify a template file to the CLI application, in a way so that
>>>>>>>>>> it causes it to process that template to generate a single output,
>>>>>>>>>> then there you have just defined an "output generator" (even if it
>>>>>>>>>> wasn't explicitly called like that in the command line). If you
>>>>>>>>>> specify 3 csv files to the CLI application, in a way so that it
>>>>>>>>>> causes it to generate 3 output files, then you have just defined 3
>>>>>>>>>> "output generators" there (there's at least one template specified
>>>>>>>>>> there too, but that wasn't an "output generator" itself, it was
>>>>>>>>>> just an attribute of the 3 output generators).
>>>>>>>>>> If you specify 1 template, and 3 csv files, in a way so that it
>>>>>>>>>> will yield 4 output files (1 for the template, 3 for the csv-s),
>>>>>>>>>> then you have defined 4 output generators there. If you have a
>>>>>>>>>> data source that loads a list of 3 entities (say, 3 csv files, so
>>>>>>>>>> it's a list of tables then), and you have 2 templates, and you
>>>>>>>>>> tell the CLI to execute each template for each item in said data
>>>>>>>>>> source, then you have just defined 6 "output generators".
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 28, 2020 at 11:08 AM Siegfried Goeschl <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>
>>>>>>>>>>> That all depends on your mental model and work you do,
>>>>>>>>>>> expectations, experience :-)
>>>>>>>>>>>
>>>>>>>>>>> __Document Handling__
>>>>>>>>>>>
>>>>>>>>>>> *"But I think actually we have no good use case for list of
>>>>>>>>>>> documents that's passed at once to a single template run, so, we
>>>>>>>>>>> can just ignore that complication"*
>>>>>>>>>>>
>>>>>>>>>>> In my case that's not a complication but my daily business - I'm
>>>>>>>>>>> regularly wading through access logs - yesterday probably a
>>>>>>>>>>> couple of hundred access logs across two staging sites to help
>>>>>>>>>>> tracking some strange API gateway issues :-)
>>>>>>>>>>>
>>>>>>>>>>> My gut feeling is (borrowing from
>>>>>>>>>>> https://medium.com/@Joachim8675309/devops-concepts-pets-vs-cattle-2380b5aab313)
>>>>>>>>>>>
>>>>>>>>>>> 1. You have a few lovely named documents / templates - `pets`
>>>>>>>>>>> 2. You have tons of anonymous documents / templates to process -
>>>>>>>>>>>    `cattle`
>>>>>>>>>>> 3. The "grey area" comes into play when mixing `pets & cattle`
>>>>>>>>>>>
>>>>>>>>>>> `freemarker-cli` was built with 2) in mind and I want to cover 1)
>>>>>>>>>>> since it is equally important and common.
>>>>>>>>>>>
>>>>>>>>>>> __Template And Document Processing Modes__
>>>>>>>>>>>
>>>>>>>>>>> IMHO it is important to answer the following question: "How many
>>>>>>>>>>> outputs do you get when rendering 2 templates and 3 datasources?
>>>>>>>>>>> Two, Three or Six?"
>>>>>>>>>>>
>>>>>>>>>>> Your answer is influenced by your mental model / experience
>>>>>>>>>>>
>>>>>>>>>>> * When wading through tons of CSV files, access logs, etc. the
>>>>>>>>>>>   answer is "2"
>>>>>>>>>>> * When doing source code generation the obvious answer is "6"
>>>>>>>>>>> * Can't imagine a use case which results in "3" but I'm pretty
>>>>>>>>>>>   sure we will encounter one
>>>>>>>>>>>
>>>>>>>>>>> __Template and document mode probably shouldn't exist__
>>>>>>>>>>>
>>>>>>>>>>> That's hard for me to fully understand - I definitely lack your
>>>>>>>>>>> insights & experience writing such tools :-)
>>>>>>>>>>>
>>>>>>>>>>> Defining the `Output Generator` is the underlying model for the
>>>>>>>>>>> Maven plugin (and probably FMPP).
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure if this applies for command lines, at least not in
>>>>>>>>>>> the way I use them (or would like to use them)
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>
>>>>>>>>>>> Siegfried Goeschl
>>>>>>>>>>>
>>>>>>>>>>> PS: Can/shall I merge the PR to bring in `freemarker-cli`?
>>>>>>>>>>>
>>>>>>>>>>> On 28 Feb 2020, at 9:14, Daniel Dekany wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yeah, "data source" is surely a too popular name, but for a
>>>>>>>>>>>> reason. Anyone has other ideas?
>>>>>>>>>>>>
>>>>>>>>>>>> As of naming data sources and such.
>>>>>>>>>>>> One thing I was wondering about back then is how to deal with a
>>>>>>>>>>>> list of documents given to a template, versus exactly 1 document
>>>>>>>>>>>> given to a template. But I think actually we have no good use
>>>>>>>>>>>> case for list of documents that's passed at once to a single
>>>>>>>>>>>> template run, so, we can just ignore that complication. A
>>>>>>>>>>>> document has a name, and that's always just a single document,
>>>>>>>>>>>> not a collection, as far as the template is concerned. (We can
>>>>>>>>>>>> have multiple documents per run, but those normally yield
>>>>>>>>>>>> separate output generators, so it's still only one document per
>>>>>>>>>>>> template.) However, we can have data source types (document types
>>>>>>>>>>>> with old terminology) that collect together multiple data files.
>>>>>>>>>>>> So then that complexity is encapsulated into the data source
>>>>>>>>>>>> type, and doesn't complicate the overall architecture. That's
>>>>>>>>>>>> another case when a data source is not just a file. Like maybe
>>>>>>>>>>>> there's a data source type that loads all the CSV-s from a
>>>>>>>>>>>> directory, into a single big table (I had such a case), or even
>>>>>>>>>>>> into a list of tables. Or, as I mentioned already, a data source
>>>>>>>>>>>> is maybe an SQL query on a JDBC data source (and we got the first
>>>>>>>>>>>> term clash... JDBC also calls them data sources).
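
As an illustration of a data source type that presents several CSV files as one big table, the sketch below merges the raw lines of several files that share a header. It is deliberately naive and makes its own assumptions: `MergedCsv` is a hypothetical name, each file is passed in as an already-read list of lines, the first line of each file is taken to be its header, and no real CSV parsing (quoting, embedded line breaks) is done.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: presents several CSV files with the same header as one table. */
final class MergedCsv {

    /**
     * Concatenates the data rows of all files under a single header line.
     * Each inner list holds the raw lines of one file, header first.
     */
    static List<String> merge(List<List<String>> files) {
        List<String> merged = new ArrayList<>();
        String header = null;
        for (List<String> lines : files) {
            if (lines.isEmpty()) {
                continue; // nothing to contribute
            }
            if (header == null) {
                header = lines.get(0);
                merged.add(header);
            } else if (!header.equals(lines.get(0))) {
                // "Cattle" files are only interchangeable if they share a schema.
                throw new IllegalArgumentException("Header mismatch: " + lines.get(0));
            }
            merged.addAll(lines.subList(1, lines.size()));
        }
        return merged;
    }
}
```

The template then sees a single table and does not need to know how many files were behind it, which is the encapsulation argued for above.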
>>>>>>>>>>>>
>>>>>>>>>>>> Template and document mode probably shouldn't exist from user
>>>>>>>>>>>> perspective either, at least not as a global option that must
>>>>>>>>>>>> apply to everything in a run. They could just give the files that
>>>>>>>>>>>> define the "output generators", and some of them will be
>>>>>>>>>>>> templates, some of them are data files, in which case a template
>>>>>>>>>>>> needs to be associated with them (and there can be a couple of
>>>>>>>>>>>> ways of doing that). And then again, there are the cases where
>>>>>>>>>>>> you want to create one output generator per entity from some data
>>>>>>>>>>>> source.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 28, 2020 at 8:23 AM Siegfried Goeschl <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>
>>>>>>>>>>>>> See my comments below - and thanks for your patience and input :-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Renaming Document To DataSource*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, makes sense. I tried to avoid it since I'm using
>>>>>>>>>>>>> javax.activation and its DataSource.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Template And Document Mode*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Agreed - I think it is a valuable abstraction for the user but it
>>>>>>>>>>>>> is not an implementation concept :-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Document Without Symbolic Names*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also agreed and it is going to change but I have not settled my
>>>>>>>>>>>>> mind yet what exactly to implement.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Siegfried Goeschl
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 28 Feb 2020, at 1:05, Daniel Dekany wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> A few quick thoughts on that:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - We should replace the "document" term with something more
>>>>>>>>>>>>>   telling. It doesn't tell that it's some kind of input. Also,
>>>>>>>>>>>>>   most of these inputs aren't something that people typically
>>>>>>>>>>>>>   call documents. Like a csv file, or a database table, which is
>>>>>>>>>>>>>   not even a file (OK we don't support such thing at the
>>>>>>>>>>>>>   moment). I think, maybe "data source" is a safe enough term.
>>>>>>>>>>>>>   (It also rhymes with data model.)
>>>>>>>>>>>>> - You have separate "template" and "document" "mode", that
>>>>>>>>>>>>>   applies to a whole run. I think such specialization won't be
>>>>>>>>>>>>>   helpful. We could just say, on the conceptual level at least,
>>>>>>>>>>>>>   that we need a set of "output generators". An output generator
>>>>>>>>>>>>>   is an object (in the API) that specifies a template, a
>>>>>>>>>>>>>   data-model (where the data-model is possibly populated with
>>>>>>>>>>>>>   "documents"), and an output "sink" (a file path, or stdout),
>>>>>>>>>>>>>   and can generate the output itself. A practical way of defining
>>>>>>>>>>>>>   the output generators in a CLI application is via a bunch of
>>>>>>>>>>>>>   files, each defining an output generator. Some of those files
>>>>>>>>>>>>>   may be templates (that you can even detect from the file
>>>>>>>>>>>>>   extension), or data files that we currently call "documents".
>>>>>>>>>>>>>   They could freely mix inside the same run. I have also met a
>>>>>>>>>>>>>   use case where you have a single table (single "document"), and
>>>>>>>>>>>>>   each record in it yields an output file. That can also be
>>>>>>>>>>>>>   described in some file format, or really in any other way, like
>>>>>>>>>>>>>   directly in a command line argument, via API, etc.
>>>>>>>>>>>>> - You have multiple documents without associated symbolical name
>>>>>>>>>>>>>   in some examples. Templates can't identify those then in a well
>>>>>>>>>>>>>   maintainable way. The actual file name is often not a good
>>>>>>>>>>>>>   identifier, can change over time, and you might not even have
>>>>>>>>>>>>>   good control over it, like you already receive it as a
>>>>>>>>>>>>>   parameter from somewhere else, or someone moves/renames those
>>>>>>>>>>>>>   files that you need to read. Index is also not very good, but I
>>>>>>>>>>>>>   have written about that earlier.
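
The "output generator" described in these bullets (a template, a data-model and an output sink) can be modelled as a small value object, with the different ways of invoking the tool reduced to different ways of producing a list of such objects. The sketch below is only an assumption-laden illustration: `OutputGenerator`, `OutputGenerators`, `perRecord`, `perPair` and the sink naming scheme are hypothetical, and nothing here renders a template.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical value object: one template, one data-model, one output sink. */
final class OutputGenerator {
    final String template;
    final Map<String, Object> dataModel;
    final String sink; // a file path here; could equally stand for stdout

    OutputGenerator(String template, Map<String, Object> dataModel, String sink) {
        this.template = template;
        this.dataModel = dataModel;
        this.sink = sink;
    }
}

/** Hypothetical factory methods that turn user input into output generators. */
final class OutputGenerators {

    /** One generator per record of a single table: each record yields its own output file. */
    static List<OutputGenerator> perRecord(String template,
                                           List<Map<String, Object>> records,
                                           String sinkPrefix) {
        List<OutputGenerator> result = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            Map<String, Object> model =
                    Collections.<String, Object>singletonMap("record", records.get(i));
            result.add(new OutputGenerator(template, model, sinkPrefix + (i + 1) + ".out"));
        }
        return result;
    }

    /** One generator per (template, data source) pair, e.g. 2 templates x 3 sources = 6. */
    static List<OutputGenerator> perPair(List<String> templates, Map<String, Object> datasources) {
        List<OutputGenerator> result = new ArrayList<>();
        for (String template : templates) {
            for (Map.Entry<String, Object> ds : datasources.entrySet()) {
                Map<String, Object> model =
                        Collections.<String, Object>singletonMap(ds.getKey(), ds.getValue());
                result.add(new OutputGenerator(template, model, template + "." + ds.getKey() + ".out"));
            }
        }
        return result;
    }
}
```

Seen this way, a global "mode" is just the choice of one factory for the whole run, whereas mixing generator files freely amounts to concatenating the lists produced by several factories.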
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 26, 2020 at 9:33 AM Siegfried Goeschl <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi folks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> still wrapping my head around it but assembled some thoughts
>>>>>>>>>>>>> here -
>>>>>>>>>>>>> https://gist.github.com/sgoeschl/b09b343a761b31a6c790d882167ff449
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Siegfried Goeschl
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 23 Feb 2020, at 23:14, Daniel Dekany <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> What you are describing is more like the angle that FMPP took
>>>>>>>>>>>>> initially, where templates drive things, they generate the
>>>>>>>>>>>>> output for themselves (even multiple output files if they wish).
>>>>>>>>>>>>> By default the output file's name (and relative path) is deduced
>>>>>>>>>>>>> from the template name. There was also a global data-model, built
>>>>>>>>>>>>> in a configuration file (or equally, built via command line
>>>>>>>>>>>>> arguments, or both mixed), from which templates get whatever
>>>>>>>>>>>>> data they are interested in. Take a look at the figures here:
>>>>>>>>>>>>> http://fmpp.sourceforge.net/qtour.html. Later, this concept was
>>>>>>>>>>>>> generalized a bit more, because you could add XML files at the
>>>>>>>>>>>>> same place where you have the templates, and then you could
>>>>>>>>>>>>> associate transform templates to the XML files (based on path
>>>>>>>>>>>>> pattern and/or the XML document element).
>>>>>>>>>>>>> Now that's like what freemarker-generator had initially (data
>>>>>>>>>>>>> files drive output, and the template is there to transform it).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I think the generic mental model would look like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. You got files that drive the process, let's call them
>>>>>>>>>>>>>    *generator files* for now. Usually, each generator file yields
>>>>>>>>>>>>>    an output file (but maybe even multiple output files, as you
>>>>>>>>>>>>>    might have seen in the last figure). These generator files can
>>>>>>>>>>>>>    be of many types, like XML, JSON, XLSX (as in the original
>>>>>>>>>>>>>    freemarker-generator), and even templates (as is the norm in
>>>>>>>>>>>>>    FMPP). If the file is not a template, then you got a set of
>>>>>>>>>>>>>    transformer templates (-t CLI option) in a separate directory,
>>>>>>>>>>>>>    which can be associated with the generator files based on name
>>>>>>>>>>>>>    patterns, and even based on content (schema usually). If the
>>>>>>>>>>>>>    generator file is a template (so that's a positional
>>>>>>>>>>>>>    @Parameter CLI argument that happens to be an *.ftl, and is
>>>>>>>>>>>>>    not a template file specified after the "-t" option), then you
>>>>>>>>>>>>>    just Template.process(...) it, and it prints what the output
>>>>>>>>>>>>>    will be.
>>>>>>>>>>>>> 2. You also have a set of variables, the global data-model, that
>>>>>>>>>>>>>    contains commonly useful stuff, like what you now call
>>>>>>>>>>>>>    parameters (CLI -Pname=value), but also maybe data loaded from
>>>>>>>>>>>>>    JSON, XML, etc. Those data files aren't "generator files".
>>>>>>>>>>>>>    Templates just use them if they need them. An important thing
>>>>>>>>>>>>>    here is to reuse the same mechanism to read and parse those
>>>>>>>>>>>>>    data files, which was used in templates when transforming
>>>>>>>>>>>>>    generator files. So we need a common format for specifying how
>>>>>>>>>>>>>    to load data files. That's maybe just FTL that #assigns to the
>>>>>>>>>>>>>    variables, or maybe a more declarative format.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I have described in the original post here was a less
>>>>>>>>>>>>> generic form of this, as I tried to stay true to the original
>>>>>>>>>>>>> approach. I thought the proposal would be drastic enough as it
>>>>>>>>>>>>> is... :) There, the "main" document is the "generator file" from
>>>>>>>>>>>>> point 1, the "-t" template is the transform template for the
>>>>>>>>>>>>> "main" document, and the other named documents ("users",
>>>>>>>>>>>>> "groups") are a poor man's shared data-model from point 2
>>>>>>>>>>>>> (together with -PName=value).
>>>>>>>>>>>>>
>>>>>>>>>>>>> There's a further, somewhat confusing thing to get right with
>>>>>>>>>>>>> the list-of-documents (`DocumentList`, `NamedDocumentLists`)
>>>>>>>>>>>>> thing though. In the model above, as per point 1, if you list
>>>>>>>>>>>>> multiple data files, each will generate a separate output file.
>>>>>>>>>>>>> So, if you need to take in a list of files to transform it to a
>>>>>>>>>>>>> single output file (or at least with a single transform template
>>>>>>>>>>>>> execution), then you have to be explicit about that, as that's
>>>>>>>>>>>>> not the default behavior anymore. But it's still absolutely
>>>>>>>>>>>>> possible. Imagine it as if a "list of XLSX-es" were itself a
>>>>>>>>>>>>> file format. You need some CLI (and Maven config, etc.) syntax
>>>>>>>>>>>>> to express that, but that shouldn't be a big deal.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 23, 2020 at 9:43 PM Siegfried Goeschl <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good timing - I was looking at a similar problem from a
>>>>>>>>>>>>> different angle yesterday (see below)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Don't have enough time to answer your email in detail now -
>>>>>>>>>>>>> will do that tomorrow evening
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Siegfried Goeschl
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> === START
>>>>>>>>>>>>> # FreeMarker CLI Improvement
>>>>>>>>>>>>> ## Support Of Multiple Template Files
>>>>>>>>>>>>> Currently we support the following combinations
>>>>>>>>>>>>>
>>>>>>>>>>>>> * Single template and no data files
>>>>>>>>>>>>> * Single template and one or more data files
>>>>>>>>>>>>>
>>>>>>>>>>>>> But we can not support the following use case which is quite
>>>>>>>>>>>>> typical in the cloud
>>>>>>>>>>>>>
>>>>>>>>>>>>> __Convert multiple templates with a single data file, e.g.
>>>>>>>>>>>>> copying a directory of configuration files using a JSON
>>>>>>>>>>>>> configuration file__
>>>>>>>>>>>>>
>>>>>>>>>>>>> ## Implementation notes
>>>>>>>>>>>>> * When we copy a directory we can remove the `ftl` extension on
>>>>>>>>>>>>>   the fly
>>>>>>>>>>>>> * We might need an `exclude` filter for the copy operation
>>>>>>>>>>>>> * Initially resolve to a list of template files and process one
>>>>>>>>>>>>>   after another
>>>>>>>>>>>>> * Need to calculate the output file location and extension
>>>>>>>>>>>>> * We need to rename the existing command line parameters (see
>>>>>>>>>>>>>   below)
>>>>>>>>>>>>> * Do we need multiple include and exclude filters?
>>>>>>>>>>>>> * Do we need file versus directory filters?
>>>>>>>>>>>>>
>>>>>>>>>>>>> ### Command Line Options
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> --input-encoding : Encoding of the documents
>>>>>>>>>>>>> --output-encoding : Encoding of the rendered template
>>>>>>>>>>>>> --template-encoding : Encoding of the template
>>>>>>>>>>>>> --output : Output file or directory
>>>>>>>>>>>>> --include-document : Include pattern for documents
>>>>>>>>>>>>> --exclude-document : Exclude pattern for documents
>>>>>>>>>>>>> --include-template : Include pattern for templates
>>>>>>>>>>>>> --exclude-template : Exclude pattern for templates
>>>>>>>>>>>>> ```
>>>>>>>>>>>>>
>>>>>>>>>>>>> ### Command Line Examples
>>>>>>>>>>>>> ```text
>>>>>>>>>>>>> # Copy all FTL templates found in "ext/config" to the "/config"
>>>>>>>>>>>>> # directory using the data from "config.json"
>>>>>>>>>>>>>
>>>>>>>>>>>>> freemarker-cli -t ./ext/config --include-template *.ftl -o /config config.json
>>>>>>>>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config config.json
>>>>>>>>>>>>>
>>>>>>>>>>>>> # Basically the same using a named document "configuration"
>>>>>>>>>>>>> # It might make sense to expose "conf" directly in the FreeMarker data model
>>>>>>>>>>>>> # It might make sense to allow URIs for loading documents
>>>>>>>>>>>>>
>>>>>>>>>>>>> freemarker-cli -t ./ext/config/*.ftl -o /config -d configuration=config.json
>>>>>>>>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config --document configuration=config.json
>>>>>>>>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config --document configuration=file:///config.json
>>>>>>>>>>>>>
>>>>>>>>>>>>> # Basically the same using an environment variable as named document
>>>>>>>>>>>>>
>>>>>>>>>>>>> freemarker-cli -t ./ext/config --include-template *.ftl -o /config -d configuration=env:///CONFIGURATION
>>>>>>>>>>>>> freemarker-cli --template ./ext/config --include-template *.ftl --output /config --document configuration=env:///CONFIGURATION
>>>>>>>>>>>>> ```
>>>>>>>>>>>>> === END
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 23.02.2020, at 16:37, Daniel Dekany <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Input documents are a fundamental concept in
>>>>>>>>>>>>> freemarker-generator, so we should think about that more, and
>>>>>>>>>>>>> probably refine/rework how it's done.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Currently it works like this, with CLI at least.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     freemarker-cli
>>>>>>>>>>>>>       -t access-report.ftl
>>>>>>>>>>>>>       somewhere/foo-access-log.csv
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then in access-report.ftl you have to do something like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     <#assign doc = Documents.get(0)>
>>>>>>>>>>>>>     ... process doc here
>>>>>>>>>>>>>
>>>>>>>>>>>>> (The more idiomatic Documents[0] won't work. Actually, that led
>>>>>>>>>>>>> to a funny chain of coincidences: It returned the string "D",
>>>>>>>>>>>>> then CSVTool.parse(...)
>>>>>>>>>>>>> happily parsed that to a table with the single column "D", and
>>>>>>>>>>>>> 0 rows, and as there were 0 rows, the template didn't run into
>>>>>>>>>>>>> an error because of row.myExpectedColumn referring to a missing
>>>>>>>>>>>>> column either, so the process finished with success. (: Pretty
>>>>>>>>>>>>> unlucky for sure. The root cause was unintentionally breaking a
>>>>>>>>>>>>> FreeMarker idiom though; eventually we will have to work on
>>>>>>>>>>>>> those too, but, different topic.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, actually multiple input documents can be passed in:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     freemarker-cli
>>>>>>>>>>>>>       -t access-report.ftl
>>>>>>>>>>>>>       somewhere/foo-access-log.csv
>>>>>>>>>>>>>       somewhere/bar-access-log.csv
>>>>>>>>>>>>>
>>>>>>>>>>>>> The above template will still work, though then you ignore all
>>>>>>>>>>>>> but the first document. So if you expect any number of input
>>>>>>>>>>>>> documents, you probably will have to do this:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     <#list Documents.list as doc>
>>>>>>>>>>>>>       ... process doc here
>>>>>>>>>>>>>     </#list>
>>>>>>>>>>>>>
>>>>>>>>>>>>> (The more idiomatic <#list Documents as doc> won't work; but
>>>>>>>>>>>>> again, those we will work out in a different thread.)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, what would be better, in my opinion. I start out from what
>>>>>>>>>>>>> I think are the common use cases, in decreasing order of
>>>>>>>>>>>>> frequency.
>>>>>>>>>>>>> Goal is to make those less error prone for the users, and
>>>>>>>>>>>>> simpler to express.
>>>>>>>>>>>>>
>>>>>>>>>>>>> USE CASE 1
>>>>>>>>>>>>>
>>>>>>>>>>>>> You have exactly 1 input document, which is therefore simply
>>>>>>>>>>>>> "the" document in the mind of the user. This is probably the
>>>>>>>>>>>>> typical use case, or at least the use case users typically
>>>>>>>>>>>>> start out from when starting the work.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     freemarker-cli
>>>>>>>>>>>>>       -t access-report.ftl
>>>>>>>>>>>>>       somewhere/foo-access-log.csv
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then `Documents.get(0)` is not very fitting. Most importantly
>>>>>>>>>>>>> it's error prone, because if the user passed in more than 1
>>>>>>>>>>>>> document (can even happen totally accidentally, like if the
>>>>>>>>>>>>> user was lazy and used a wildcard that the shell expanded), the
>>>>>>>>>>>>> template will silently ignore the rest of the documents, and
>>>>>>>>>>>>> the single document processed will be practically picked
>>>>>>>>>>>>> randomly. The user might not notice that and submit a bad
>>>>>>>>>>>>> report or such.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think that in this use case the document should be simply
>>>>>>>>>>>>> referred to as `Document` in the template. When you have
>>>>>>>>>>>>> multiple documents there, referring to `Document` should be an
>>>>>>>>>>>>> error, saying that the template was made to process a single
>>>>>>>>>>>>> document only.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> USE CASE 2
>>>>>>>>>>>>>
>>>>>>>>>>>>> You have multiple input documents, but each has a different
>>>>>>>>>>>>> role (different schema, maybe different file type). Like, you
>>>>>>>>>>>>> pass in users.csv and groups.csv. Each has a different schema,
>>>>>>>>>>>>> and so you want to access them differently, but in the same
>>>>>>>>>>>>> template.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     freemarker-cli
>>>>>>>>>>>>>       [...]
>>>>>>>>>>>>>       --named-document users somewhere/foo-users.csv
>>>>>>>>>>>>>       --named-document groups somewhere/foo-groups.csv
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then in the template you could refer to them as:
>>>>>>>>>>>>> `NamedDocuments.users`, and `NamedDocuments.groups`.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Use Case 1 and 2 can be unified into a coherent concept, where
>>>>>>>>>>>>> `Document` is just a shorthand for `NamedDocuments.main`. It's
>>>>>>>>>>>>> called "main" because that's "the" document the template is
>>>>>>>>>>>>> about, but then you have to add some helper documents, with
>>>>>>>>>>>>> symbolic names representing their role.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     freemarker-cli
>>>>>>>>>>>>>       -t access-report.ftl
>>>>>>>>>>>>>       --document-name=main somewhere/foo-access-log.csv
>>>>>>>>>>>>>       --document-name=users somewhere/foo-users.csv
>>>>>>>>>>>>>       --document-name=groups somewhere/foo-groups.csv
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here, `Document` still works in the template, and it refers to
>>>>>>>>>>>>> `somewhere/foo-access-log.csv`. (While omitting
>>>>>>>>>>>>> --document-name=main above would be cleaner, I couldn't figure
>>>>>>>>>>>>> out how to do that with Picocli.
>>>>>>>>>>>>> Anyway, for now the point is the concept, which is not specific
>>>>>>>>>>>>> to CLI.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> USE CASE 3
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here you have several of the same kind of documents. That has a
>>>>>>>>>>>>> more generic sub-use-case, when you have explicitly named
>>>>>>>>>>>>> documents (like "users" above), and for some you expect
>>>>>>>>>>>>> multiple input files.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     freemarker-cli
>>>>>>>>>>>>>       -t access-report.ftl
>>>>>>>>>>>>>       --document-name=main somewhere/foo-access-log.csv
>>>>>>>>>>>>>         somewhere/bar-access-log.csv
>>>>>>>>>>>>>       --document-name=users somewhere/foo-users.csv
>>>>>>>>>>>>>         somewhere/bar-users.csv
>>>>>>>>>>>>>       --document-name=groups somewhere/global-groups.csv
>>>>>>>>>>>>>
>>>>>>>>>>>>> The template must be written with this use case in mind, as now
>>>>>>>>>>>>> it has to #list some of the documents. (I think in practice you
>>>>>>>>>>>>> hardly ever want to get a document by hard coded index. Either
>>>>>>>>>>>>> you don't know how many documents you have, so you can't use
>>>>>>>>>>>>> hard coded indexes, or you do, and each index has a specific
>>>>>>>>>>>>> meaning, but then you should name the documents instead, as
>>>>>>>>>>>>> using indexes is error prone, and hard to read.)
>>>>>>>>>>>>> Accessing that list of documents in the template, maybe could
>>>>>>>>>>>>> be done like this:
>>>>>>>>>>>>> - For the "main" documents: `DocumentList`
>>>>>>>>>>>>> - For explicitly named documents, like "users":
>>>>>>>>>>>>>   `NamedDocumentLists.users`
>>>>>>>>>>>>>
>>>>>>>>>>>>> SUMMING UP
>>>>>>>>>>>>>
>>>>>>>>>>>>> To unify all 3 use cases into a coherent concept:
>>>>>>>>>>>>> - `NamedDocumentLists.<name>` is the most generic form, and
>>>>>>>>>>>>>   while you can achieve everything with it, using it requires
>>>>>>>>>>>>>   your template to handle the most generic case too. So, I
>>>>>>>>>>>>>   think it would be rarely used.
>>>>>>>>>>>>> - `DocumentList` is just a shorthand for
>>>>>>>>>>>>>   `NamedDocumentLists.main`. It's used if you only have one
>>>>>>>>>>>>>   kind of documents (single format and schema), but potentially
>>>>>>>>>>>>>   multiple of them.
>>>>>>>>>>>>> - `NamedDocuments.<name>` expresses that you expect exactly 1
>>>>>>>>>>>>>   document of the given name.
>>>>>>>>>>>>> - `Document` is just a shorthand for `NamedDocuments.main`.
>>>>>>>>>>>>>   This is for the most natural/frequent use case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's 4 possible ways of accessing your documents, which is a
>>>>>>>>>>>>> trade-off for the sake of these:
>>>>>>>>>>>>> - Catching CLI (or Maven, etc.) input where the template output
>>>>>>>>>>>>>   likely will be wrong.
>>>>>>>>>>>>>   That's only possible if the user can communicate their intent
>>>>>>>>>>>>>   in the template.
>>>>>>>>>>>>> - Users don't need to deal with concepts that are irrelevant in
>>>>>>>>>>>>>   their concrete use case. Just start with the trivial,
>>>>>>>>>>>>>   `Document`, and later if the need arises, generalize to named
>>>>>>>>>>>>>   documents, document lists, or both.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you guys think?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best regards,
>>>>>>>>>> Daniel Dekany
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Daniel Dekany
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Daniel Dekany
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Daniel Dekany
>>>>
>>>>
>>>>
>>>
>>> --
>>> Best regards,
>>> Daniel Dekany
>>
>
>
> --
> Best regards,
> Daniel Dekany
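
For reference, the four access forms from the "SUMMING UP" part of the quoted proposal can be sketched roughly as below. This is only an illustration under assumptions: it is written in Python for brevity (freemarker-generator itself is Java), and the names `NamedDocuments`, `document`, `document_list` and `DocumentAccessError` are hypothetical, not part of any existing API. The point it tries to show is that the strict single-document form fails loudly when more than one file was passed, instead of silently picking one.

```python
class DocumentAccessError(Exception):
    """Raised when the template's expectation does not match the input."""


class NamedDocuments:
    """Hypothetical model of the proposed accessors (not a real API)."""

    def __init__(self, documents_by_name):
        # name -> list of document sources (plain paths, for illustration)
        self._by_name = {
            name: list(docs) for name, docs in documents_by_name.items()
        }

    def document_list(self, name="main"):
        # Generic form (like NamedDocumentLists.<name>; with the default
        # name it plays the role of DocumentList): zero or more documents.
        return list(self._by_name.get(name, []))

    def document(self, name="main"):
        # Strict form (like NamedDocuments.<name>; with the default name
        # it plays the role of Document): exactly one document expected.
        docs = self._by_name.get(name, [])
        if len(docs) != 1:
            raise DocumentAccessError(
                f"expected exactly 1 document named {name!r}, got {len(docs)}"
            )
        return docs[0]


if __name__ == "__main__":
    docs = NamedDocuments({
        "main": ["somewhere/foo-access-log.csv",
                 "somewhere/bar-access-log.csv"],
        "groups": ["somewhere/global-groups.csv"],
    })
    print(docs.document("groups"))   # single named document: fine
    print(docs.document_list())      # all "main" documents
    try:
        docs.document()              # two "main" documents: rejected
    except DocumentAccessError as e:
        print("error:", e)
```

With this shape, a template written for USE CASE 1 would ask for the strict form and get an error as soon as a shell wildcard expands to several files, while a template written for USE CASE 3 would ask for the list form explicitly.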
