Re: Current state of "freemarker-generator"

Siegfried Goeschl Tue, 07 Jul 2020 06:26:32 -0700

Hi folks,

I assembled the open discussions & TODOs - see below


Thanks in advance,

Siegfried Goeschl


1. Open Discussions
=============================================================================


1.1 Naming of CLI file
-----------------------------------------------------------------------------

It is currently named "freemarker-cli" but it was suggested to call it "freemarker-generator" - I'm happy with "freemarker-cli" since it reflects the Maven project layout



1.2 Complex Transformations using the CLI
-----------------------------------------------------------------------------

That's a large topic (seed templates, seed data sources, shared data sources, aggregation versus generation) - currently low on my priority list because


* Transforming multiple templates and template directories is supported

* Personally I would not use a command-line tool for source code generation but use a plugin (to avoid extra dependencies to be installed)


Questions being discussed are

* Is this an important feature?
* Can it be implemented later (if required) without breaking stuff?
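
To make the two directions concrete, here is a minimal Python sketch (not freemarker-cli code). It uses Python's `string.Template` as a stand-in for FreeMarker, and the function names `template_driven` and `data_driven` are made up for illustration. Template-driven generation yields one output per template; data-driven generation yields one output per data unit, all rendered by a common template, which is how the distinction is described further down this thread.

```python
from string import Template


def template_driven(templates, data_model):
    """One output per template; every template sees the same data model."""
    return {
        name: Template(text).substitute(data_model)
        for name, text in templates.items()
    }


def data_driven(template_text, data_units, suffix=".txt"):
    """One output per data unit (the "seed"); all units share one template."""
    template = Template(template_text)
    return {
        unit["name"] + suffix: template.substitute(unit)
        for unit in data_units
    }


if __name__ == "__main__":
    print(template_driven({"README": "Project $artifact"}, {"artifact": "demo"}))
    print(data_driven("class $name {}", [{"name": "User"}, {"name": "Order"}], ".java"))
```

Mixing both in one run would then simply mean collecting the results of both functions into the same set of output files.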


1.3 Site Generation
-----------------------------------------------------------------------------

Currently Maven & Markdown are used but it is suggested to migrate to XDocBook to match the existing FreeMarker documentation.


* Daniel is having a look at it


1.4 Data Loaders
-----------------------------------------------------------------------------

Currently it is possible to materialize certain file types directly in the model using the "-m" or "--data-model" parameter, which parses the given content based on the content type

```
freemarker-cli --data-model post=https://jsonplaceholder.typicode.com/posts/2 -i 'post title is:${post.title}'; echo
```

It was suggested to change to something like

```
-s 'post=json("https://jsonplaceholder.typicode.com/posts/2")'
```

which adds more flexibility
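
A rough Python sketch of how such a call-style argument could be taken apart. This is not how freemarker-cli parses its options: the function name `parse_loader_spec` is invented here, and the sketch assumes the arguments are JSON-compatible literals, which is narrower than real FTL method-call syntax.

```python
import json
import re

# Hypothetical shape: name=loader("location", {optional options})
SPEC_PATTERN = re.compile(r"(?P<name>\w+)=(?P<loader>\w+)\((?P<args>.*)\)", re.DOTALL)


def parse_loader_spec(spec):
    """Split 'name=loader("location", {...})' into its parts."""
    match = SPEC_PATTERN.fullmatch(spec.strip())
    if match is None:
        raise ValueError(f"not a loader call: {spec!r}")
    # Assumption: the arguments are JSON literals, so wrapping them in
    # brackets turns the argument list into a JSON array.
    args = json.loads("[" + match.group("args") + "]")
    if not args or not isinstance(args[0], str):
        raise ValueError("first argument must be the location string")
    options = args[1] if len(args) > 1 else {}
    return {
        "name": match.group("name"),
        "loader": match.group("loader"),
        "location": args[0],
        "options": options,
    }


if __name__ == "__main__":
    print(parse_loader_spec('post=json("https://jsonplaceholder.typicode.com/posts/2")'))
```

The point of the call-style form is visible in the return value: the loader name and its options are separate from the location instead of being encoded into it.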


1.5 Named URIs
-----------------------------------------------------------------------------

Currently Named URIs are supported to provide a name and additional metadata (mimetype, encoding) - the current implementation is not complete but it should look like


```
-s 'myTable=foo/bar.csv#format=EXCEL&delimiter=TAB'
```

Current suggestion is to switch to something like

```
-s 'myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})'
```
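
For comparison, the fragment-based form can already be decomposed with standard URL tooling. The Python sketch below is only an illustration (the helper name `parse_named_uri` is an assumption, not part of freemarker-cli): it splits off the name, separates the fragment, and reads the fragment as `key=value` pairs. All values arrive as plain strings.

```python
from urllib.parse import parse_qsl, urldefrag


def parse_named_uri(spec):
    """Split 'name=location#key=value&key=value' into its parts."""
    # Assumption: a name is always given, so the first '=' ends the name.
    name, separator, uri = spec.partition("=")
    if not separator:
        raise ValueError(f"expected name=uri, got: {spec!r}")
    location, fragment = urldefrag(uri)
    # parse_qsl also decodes %-escapes, e.g. 'delimiter=%3B' becomes ';'
    metadata = dict(parse_qsl(fragment))
    return {"name": name, "location": location, "metadata": metadata}


if __name__ == "__main__":
    print(parse_named_uri("myTable=foo/bar.csv#format=EXCEL&delimiter=TAB"))
```

A delimiter such as `;` has to be written as `%3B` in this form, which is one of the usability costs of packing parameters into the URI.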

1.6 Exposing DataSources
-----------------------------------------------------------------------------

Currently a class `DataSources` is used which keeps a list of data sources and allows lookup by name (mixture of list and map) - we have ongoing discussions


* Does a DataSource have a name when not directly specified by the user?
* Shall we enforce unique data source names?


1.7 Making Usage More FreeMarker Friendly
-----------------------------------------------------------------------------

In FREEMARKER-148 I wrapped the `DataSources` into a `BeanModel` allowing better FTL support, e.g.


* `DataSources[0]` instead of `DataSources.get(0)`
* `DataSources["user"]` instead of `DataSources.get("user")`

Care must be taken regarding name collisions of data sources and exposed methods of `DataSources`
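
The list/map mixture and the collision problem can be sketched in a few lines of Python. This is an analogy, not the Java `DataSources` class or FreeMarker's `BeanModel`: the method names `get`, `names` and `size` are assumptions, and rejecting duplicates and colliding names is just one possible answer to the open questions in 1.6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    uri: str


class DataSources:
    """List-like and map-like access to data sources (illustrative only)."""

    def __init__(self, sources):
        self._sources = list(sources)
        # Names of the container's own public methods: get, names, size
        reserved = {attr for attr in dir(type(self)) if not attr.startswith("_")}
        seen = set()
        for source in self._sources:
            if source.name in reserved:
                raise ValueError(f"name collides with a method: {source.name}")
            if source.name in seen:
                raise ValueError(f"duplicate data source name: {source.name}")
            seen.add(source.name)

    def __getitem__(self, key):
        # An int selects by position, a str selects by name
        if isinstance(key, int):
            return self._sources[key]
        return self.get(key)

    def get(self, name):
        for source in self._sources:
            if source.name == name:
                return source
        raise KeyError(name)

    def names(self):
        return [source.name for source in self._sources]

    def size(self):
        return len(self._sources)
```

With this sketch a data source called `size` is refused at construction time, because a lookup by that name would be ambiguous with the `size` method.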



1.8 Overlap Of CLI And Maven Plugin
-----------------------------------------------------------------------------

Both artifacts reside in the same Maven build but currently the `MavenPlugin` is completely independent

* This might change in the future but I spend little time on the Maven plugin



1.9 Support Of Backward Compatibility
-----------------------------------------------------------------------------

I lean towards a 0.X.Y release and not guaranteeing backward compatibility - little field testing and no real users.



1.10 Support Of Groups
-----------------------------------------------------------------------------

Named URIs allow assigning an optional group which can be used to look up data sources


```
-data-source 'myTable:data=foo/bar.csv'
```

* I implemented it since I thought it would be a nice feature but I am not so sure about it any longer :-)
* It was suggested to swap name and group, e.g. `-data-source 'data:myTable=foo/bar.csv'`
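
Both orderings are equally easy to parse, as the following Python sketch suggests. It is only an illustration of the `name:group=uri` idea: the function names and the `group_first` switch are hypothetical, and it assumes a name is always present before the first `=`.

```python
def parse_data_source_arg(arg, group_first=False):
    """Split 'name:group=uri' (or 'group:name=uri' if group_first is set)."""
    head, separator, uri = arg.partition("=")
    if not separator:
        raise ValueError(f"expected name[:group]=uri, got: {arg!r}")
    first, _, second = head.partition(":")
    if not second:
        # No group given at all
        return first, None, uri
    if group_first:
        return second, first, uri
    return first, second, uri


def find_by_group(parsed_args, group):
    """Return all (name, group, uri) entries belonging to the given group."""
    return [entry for entry in parsed_args if entry[1] == group]


if __name__ == "__main__":
    entries = [
        parse_data_source_arg("myTable:data=foo/bar.csv"),
        parse_data_source_arg("config=foo/config.json"),
    ]
    print(find_by_group(entries, "data"))
```

So the choice between the two orderings is about readability for users rather than implementation effort.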



2. TODOs
=============================================================================


2.1 Discuss And Prioritise The Open Topics
-----------------------------------------------------------------------------

Discuss and define what needs to be done for the first public release.


2.2 Get The Release Process Going
-----------------------------------------------------------------------------

The FreeMarker project is using `Ant` for the release process (see https://freemarker.apache.org/committer-howto.html) but that does not work out-of-the-box with the `freemarker-generator` Maven project.



2.3 Tool Access in FreeMarker Model
-----------------------------------------------------------------------------

All tools are directly exposed which adds a lot of noise - might a map be a better idea?


```
tools["csv"]
tools["gson"]
```

instead of

```
${CSVTool}
${GsonTool}
```


2.4 DataSource Access in FreeMarker Model
-----------------------------------------------------------------------------

As discussed above a `DataSources` instance is exposed - maybe it should be called `dataSources` since it might be only a dumb list/map?




On 5 Jul 2020, at 18:38, Daniel Dekany wrote:

On Sun, Jul 5, 2020 at 4:13 PM Siegfried Goeschl <
[email protected]> wrote:
Hi Daniel,
No problem with the XDocBook if you feel strongly about it :-) Will the
Maven site also be published somehow?

Regarding releasing "freemarker-generator"

* At the end of the day it is a command line tool people might use
sporadically - having said that the Maven plugin is a slightly different
story
* As far as I'm concerned my use-cases (data-centric processing & cloud
configuration stuff) are reasonably working
* I guess we can spend another 12 months discussing assumptions,
architectural decisions and implementation details (while appreciating your
knowledge and insights)
Let's not do that, but it's not what is happening either, as far as I see.
There were like 3(ish) issues of the more fundamental nature (not details)
that I brought up. Actually, re-raised them now after a few months, because
I see no closure on your side. I mean, you agree but can't invest into it,
or you don't think these serve real demands, or... what is it?
I also said that as far as I'm concerned, we can do a release, if users are
properly informed about what we do not promise.
* As long as there is no release & real users out there those discussions
are mostly an intellectual exercise
In general, yes, but it depends on the concrete cases. For example,
data-file-seeded generation. The Maven plugin donators did that at work.
FMPP users did both directions too. Those are real users. That we don't
address that can have some valid reasons, but not having real/proven value
is not one of them. Or at least I guess you will agree with that.
* There is a danger that we never get a release out as it happened with
the Maven plugin
The maven plugin was abandoned by the donators right after the donation.
Nobody cared, or had the time they may have wanted to have, so nobody did
anything with it. (That's really the #1 issue with OS development, when
there's no company with paid developers behind it.)
I think the code base went a long way already so we should clearly define
what's ABSOLUTELY missing to get it out the door.
Sure, let's see what others find (I have already told my observations).
Please note that there are a couple of additional tasks actually to create
a user base

* No idea how to get the Apache release procedures working (signing,
staging, etc ..)
It's documented here, but we (or at least I) will help:
https://freemarker.apache.org/committer-howto.html
Whatever you will find missing from there, we should add.
* Need to look into Brew and Linux distributions
And Windows, if you have access... tests fail on that for example. (Some
charset or line ending mess... didn't investigate yet.)
* Getting some public awareness - blogs, presentations, ApacheCon anyone?
Tweet is where we have some people... and of course menu item on
freemarker.apache.org.

Thanks in advance,
Siegfried Goeschl
On 05.07.2020, at 15:18, Daniel Dekany <[email protected]> wrote:
Find answers inline.

On Sun, Jul 5, 2020 at 9:31 AM Siegfried Goeschl <
[email protected] <mailto:[email protected]>> wrote:
Hi Daniel,

thanks for your feedback ...

# 1. Site Generation

Maven is not so bad - I uploaded the generated site to
http://www.senilesoftwareengineer.org/freemarker-generator/ - I
personally have no problem using the FreeMarker documentation approach but
decided not to follow it up for the time being due to time constraints.
So if I convert the markdown to XDocBook, and set up site generation for
it, and thereafter you will edit XDocBook (I recommend XXE for that), will
that be OK with you?

# 2. Design/Architectural Doubts
## 2.1 Parsed Data

Currently we can load files/URLs as "DataSource" (which need to be
processed in FTL) or we can use "-m / --data-model" to parse and expose
the content directly (see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/data-models.html
).
(Yeah, that's where I said that it looks like a quick ad-hoc solution,
and
as such is quite limited.)

Regarding a less programmatic way - Named URIs are used (see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/named-uris.html
)
to pass additional metadata but the implementation requires some re-work
to pass arbitrary metadata in the fragment.

E.g. what should be possible is

```
freemarker-cli -t some.ftl -m 'user=somepath/user.csv#delimiter=TAB&format=DEFAULT&charset=UTF-16&header=false'
${CSVtool.parse(DataSources["user"])}
```
The idea is to be able to pass all metadata being used to parse the data
source - that would be also helpful for defining a SQL data source.
Yes. But I guess we should break out from the preconception that a data
source must be an URL, and also that therefore we have to encode all the
parsing and selection parameters into the anchor as name-value pairs (which
also implies that you might have % escapes). Because we sometimes need List
and Map<String, Object>-like data structures as well, etc. (Also, using the
anchor is kind of an abuse of the URL syntax, and &name=value is query
parameter syntax, not anchor syntax. So this is maybe confusing for users,
like they will keep writing ? instead of #.)
My idea is that a Data Source should be the name of a Data Loader (like,
"csv", "excel", "jdbc", etc.), plus its parameters. To specify that, we
should use FTL method call syntax, because that syntax the users have to
learn anyway, and it's still relatively expressive (and can be re-used in
actual templates - see that later):
-s 'myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})'
If we have this (which has basically the same meaning as it has
currently):
-s myRawTable=foo/bar.csv
that could be just a shorthand for:
-s myRawTable='raw("foo/bar.csv")'

So "raw" data is what you pass to "tools" for parsing in the current
version.
If you prefer to do parsing in the templates, then you can reuse what you
learned about -s, as it's pretty much the same:
<#assign myTable = DataLoaders.csv(DataSources.myRawTable, {"format":
"EXCEL", "delimiter": ";" })>
Also, if you decide to parse data inside the template, doing it will be
very similar as in the -s switch.
## Data versus Template Driven Generation
Currently a variety of transformations are supported on the command line
(see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/transformation.html
)

* single template
* multiple templates
* directory of templates

Regarding "template versus data driven" generation - CLI does not
support data-driven generation where one output file is generated per
data source.
Yes. That's exactly what I feel is a major mistake on the long run. At
least we should be able to add it later, but I guess, to ensure that, we
pretty much have to implement it. So what will happen with this, what's
your standpoint?
Furthermore, I don't think that we should have a "mode" for this, which
applies for a whole run. I think I'm repeating myself from some months ago,
but, to recap, there are simply "seeds" that cause generating an output
file. That "seed" can be a template, or a data file, or part of loaded data
(like each row in a table is a "seed"). And I believe it's worth it to
allow mixing these in the same run, because then freemarker-generator will
be good enough fit for more use-cases. So for example, you have some JSON
files in your project, each will generate a java file (typical source code
generation), but you also have templates for pom.xml, etc., that will
generate its own output.

## 2.2 Convergence of Maven Plugin & CLI
It is on my list but does not have high priority since it should not
affect Maven plugin usage (regarding backward compatibility).
But that would be a completely new plugin, I assume.

# 3. Maturity / Backward Compatibility
The project is not mature - we have no real users and little field
testing - in short I don't want to guarantee backward-compatibility for
now because some of my decisions might turn out to be plain wrong (or
even stupid) :-)

According to https://semver.org we can release 0.x.y and follow
https://semver.org/#spec-item-4 - not so nice but pragmatic.
# 4. General Considerations
There is a danger we never get "freemarker-generator" out of the door
("perfect is the enemy of good") therefore the whole PMC should look
into the questions/answers
* "freemarker-generator" needs community usage to become really useful -
the unreleased stuff does not add any value
* Is it acceptable to release a 0.X.Z where we are not backward-compatible
until 1.0.0?
With the clear warning that we won't maintain this "branch" (a series of
backward compatible releases), etc. we can release. That, however,
discourages usage. And that's really the problem, not that we can't
release, because we can publish stuff in whatever development stage
(assuming users will be well informed). But if we are to hope for any
significant user base, I believe we should be able to promise some
stability, and maintenance of the BC branch. That's why I'm saying that we
should get the core architecture about right, and the project scope. (I'm
not talking about missing features that can be added later, nor about bugs
and other rough edges.)
* Are there any deal breakers which need to be fixed before we start with
a first public release?
As far as I'm concerned (but I'm just one of the PMC members), if you have
strong feelings for doing a release, with the proper warnings, sure, I will
collaborate. The deal breaker is more for the users, who get no good BC
promise or maintenance.

Thanks in advance,
Siegfried Goeschl

On 4 Jul 2020, at 1:10, Daniel Dekany wrote:
It should be in https://freemarker.apache.org/generator/. You will need to
commit/push into https://github.com/apache/freemarker-site/tree/asf-site
for that. (See
https://freemarker.apache.org/committer-howto.html#updating-homepage)

We need to generate the web pages somehow though. I saw you try to do that
with Maven "site", but personally, I can't imagine that it can be tweaked
to generate reasonable output. Not sure how others see this. Maven "site"
is basically the Maven model dumped into HTML pages, and "reports" like
even the log of the Rat run etc. Dozens of menus and sub-pages for details
users mostly don't care about. The actual content users do care about is
stuffed under "About", and from there it's not properly navigable or
searchable, as apparently it's assumed to be a single page. I know back
then you didn't want to use the tool that's used both for the main web site
and for the FreeMarker documentation, but I think that's the most
economical solution currently (and then we also have common styling, and
common place to fix whatever technical issues), so please reconsider that.
(I can help setting that up, of course.)

Now, some more difficult-to-address problematic things. My problem is that
these just weren't concluded. Discussion just died. To be concluded can
even mean that other PMC members say we should step over these, even if I
disagree. But these should be understood, and considered. So, these are:

  - Design/architectural doubts that are probably not realistic to address
    much later:
     - Currently, data sources are just URL-s (locations), and templates
       are meant to call tool API-s to parse them. As I said back then, one
       consequence is that then, you can't put parsed data into the
       data-model shared by all templates (i.e., via -m/--data-model).
       Because, you have no convenient/concise way to load the actual
       (parsed) data, instead you have to "program it" in FTL, because the
       assumption was that you only want to do that in templates anyway.
       Furthermore, as I saw it just now, --data-model actually supports
       some ad-hoc way of parsing the data pointed by a data-source-like
       URL, as JSON, YAML, maybe some more. Here's an example of that:
       `-m config=env:///DB_CONFIG#mimetype=application/json`. So there is
       a need actually. But compared to what you can do in templates, it's
       totally different, and of course very restrictive. I was also
       looking for a less "programmatic" way of loading data because even
       doing it in templates is not very friendly as it is. (For ultimate
       flexibility you might need program logic for sure, but certainly not
       just to grab a worksheet from an Excel file.) We also should have a
       POC for loading from a less file centric data source, like from a DB
       with a SQL query, to see if the concept works.
     - Not sure what's with template-driven VS data driven output
       generation. Data driven is when you generate one output file per
       some data unit (like per JSON file, or per DB record), and all is
       transformed by a common template. How would that look currently in
       freemarker-cli?
  - The Maven plugin provides very different functionality than the CLI.
    The original concept was that they are just two interfaces to the same
    product. That was also brought up, and I don't think it was addressed
    much, like, why you disagree, if you do, or what's going on. Anyway,
    regarding what we have now. The naming implies that, as both are
    "FreeMarker Generator", just one is "CLI", and the other is "Maven".
    But as far as I see, they are two different "products" really,
    focusing on different use cases. So, if it stays like this, then some
    of them have to be renamed at least. Or I don't know what others think
    we can do. (Then it's still somewhat weird to release them together.)

(I have also run through the CLI documentation, and found a lot of
less
fundamental things to address, but I don't want to overload the
thread.)

Also, if the project is released mostly as it is now, what will we
promise
regarding backward compatibility? What do we communicate about the
matureness of the project?

On Wed, Jul 1, 2020 at 8:29 AM Siegfried Goeschl <
[email protected]> wrote:
Hi folks,
I'm pretty much finished with the stuff I would like to have for the first
public release but there is still a lot of work ahead
* FREEMARKER-148 [freemarker-cli] Make usage of "DataSources" more
"Freemarker" like
* FREEMARKER-150 [freemarker-cli] Setup "freemarker-generator" web site
* Setup the release process (Daniel, I guess I need some help here)
* An iteration of reviews and discussions

Some thoughts along the line

* I would like to finish FREEMARKER-148 this week
* Daniel, any suggestion where to host the public website?

Thanks in advance,

Siegfried Goeschl
--
Best regards,
Daniel Dekany
