Hi folks,

I assembled the open discussions & TODOs - see below

Thanks in advance,

Siegfried Goeschl


1. Open Discussions
=============================================================================


1.1 Naming of CLI file
-----------------------------------------------------------------------------

It is currently named "freemarker-cli" but it was suggested to call it "freemarker-generator" - I'm happy with "freemarker-cli" since it reflects the Maven project layout


1.2 Complex Transformations using the CLI
-----------------------------------------------------------------------------

That's a large topic (seed templates, seed datasources, shared datasources, aggregation versus generation) - currently low on my priority list because

* Transforming multiple templates and template directories is supported
* Personally I would not use a command-line tool for source code generation but use a plugin (to avoid extra dependencies to be installed)

Questions being discussed are

* Is this an important feature?
* Can it be implemented later (if required) without breaking stuff?


1.3 Site Generation
-----------------------------------------------------------------------------

Currently Maven & Markdown are used but it is suggested to migrate to XDocBook to match the existing FreeMarker documentation.

* Daniel is having a look at it


1.4 Data Loaders
-----------------------------------------------------------------------------

Currently it is possible to materialize certain file types directly in the model using the "-m" or "--data-model" parameter, which parses the given content based on the content type

```
freemarker-cli --data-model post=https://jsonplaceholder.typicode.com/posts/2 -i 'post title is: ${post.title}'; echo
```

It was suggested to change to something like

```
-s 'post=json("https://jsonplaceholder.typicode.com/posts/2")'
```

which adds more flexibility


1.5 Named URIs
-----------------------------------------------------------------------------

Currently Named URIs are supported to provide a name and additional metadata (mimetype, encoding) - the current implementation is not complete but it should look like

```
-s 'myTable=foo/bar.csv#format=EXCEL&delimiter=TAB'
```

Current suggestion is to switch to something like

```
-s 'myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})'
```
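
A minimal sketch of how the fragment-based form could be split into its parts, written in Python purely for illustration (the project itself is Java). The function name `parse_named_uri` and the rule that a name must look like a simple identifier are assumptions, not the current implementation:

```python
import re
from urllib.parse import parse_qsl

# Assumption: a name is a simple identifier directly followed by "=".
_NAMED = re.compile(r"([A-Za-z_][\w.-]*)=(.*)", re.S)


def parse_named_uri(spec):
    """Split 'name=uri#key=value&key=value' into (name, uri, params)."""
    # Cut off the fragment first so "=" inside it is not mistaken for the name separator.
    head, _, fragment = spec.partition("#")
    match = _NAMED.fullmatch(head)
    if match:
        name, uri = match.groups()
    else:
        name, uri = None, head  # unnamed data source
    # The fragment re-uses query-string syntax, so it can be parsed the same way.
    params = dict(parse_qsl(fragment, keep_blank_values=True))
    return name, uri, params


name, uri, params = parse_named_uri("myTable=foo/bar.csv#format=EXCEL&delimiter=TAB")
```

Note that every value arrives as a string here, which is one of the limitations the function-call style suggestion tries to avoid.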

1.6 Exposing DataSources
-----------------------------------------------------------------------------

Currently a class `DataSources` is used which keeps a list of data sources and allows lookup by name (mixture of list and map) - we have ongoing discussions

* Does a DataSource have a name when not directly specified by the user?
* Shall we enforce unique data source names?
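
To make the two questions concrete, here is a small Python sketch of a "list plus map" container (illustration only, the real `DataSources` is a Java class). It assumes that unnamed data sources simply have no name and are reachable by index only, and that uniqueness is a switch; both are assumptions for discussion, not decisions:

```python
class DataSources:
    """Ordered collection of data sources with optional names."""

    def __init__(self, enforce_unique_names=True):
        self._entries = []   # (name, uri) pairs in insertion order
        self._by_name = {}   # only explicitly named entries
        self._enforce_unique_names = enforce_unique_names

    def add(self, uri, name=None):
        if name is not None:
            if name in self._by_name and self._enforce_unique_names:
                raise ValueError(f"duplicate data source name: {name}")
            # Without enforcement, the first entry registered under a name wins.
            self._by_name.setdefault(name, uri)
        self._entries.append((name, uri))

    def __getitem__(self, key):
        # Integer keys address by position, string keys by name.
        if isinstance(key, int):
            return self._entries[key][1]
        return self._by_name[key]

    def __len__(self):
        return len(self._entries)


sources = DataSources()
sources.add("foo/bar.csv", name="user")
sources.add("foo/unnamed.json")  # no name: reachable by index only
```

With `enforce_unique_names=False` the lookup by name silently ignores later duplicates, which is exactly the kind of surprise the uniqueness question is about.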


1.7 Making Usage More FreeMarker Friendly
-----------------------------------------------------------------------------

In FREEMARKER-148 I wrapped the `DataSources` into a `BeanModel` allowing better FTL support, e.g.

* `DataSources[0]` instead of `DataSources.get(0)`
* `DataSources["user"]` instead of `DataSources.get("user")`

Care must be taken regarding name collisions of data sources and exposed methods of `DataSources`


1.8 Overlap Of CLI And Maven Plugin
-----------------------------------------------------------------------------

Both artifacts reside in the same Maven build but currently the `Maven Plugin` is completely independent

* This might change in the future but I spend little time on the Maven plugin


1.9 Support Of Backward Compatibility
-----------------------------------------------------------------------------

I lean towards a 0.X.Y release and not guaranteeing backward compatibility - little field testing and no real users.


1.10 Support Of Groups
-----------------------------------------------------------------------------

Named URIs allow assigning an optional group which can be used to look up data sources

```
-data-source 'myTable:data=foo/bar.csv'
```

* I implemented it since I thought it would be a nice feature but not so sure about it any longer :-)
* It was suggested to swap name and group, e.g. `-data-source 'data:myTable=foo/bar.csv'`
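
For illustration, a Python sketch of the current `name:group=uri` order and a lookup by group. The helper names and the dictionary layout are made up for this example; swapping name and group would only change the order in which the key is unpacked:

```python
def parse_data_source(spec):
    """Parse 'name:group=uri' (the group is optional) into a dict."""
    key, sep, uri = spec.partition("=")
    if not sep:
        raise ValueError(f"expected 'name[:group]=uri' but got: {spec}")
    name, _, group = key.partition(":")
    return {"name": name, "group": group or None, "uri": uri}


def find_by_group(sources, group):
    """Return all parsed data sources that belong to the given group."""
    return [source for source in sources if source["group"] == group]


sources = [
    parse_data_source("myTable:data=foo/bar.csv"),
    parse_data_source("other=foo/other.csv"),
]
```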


2. TODOs
=============================================================================


2.1 Discuss And Prioritise The Open Topics
-----------------------------------------------------------------------------

Discuss and define what needs to be done for the first public release.


2.2 Get The Release Process Going
-----------------------------------------------------------------------------

The FreeMarker project is using `Ant` for the release process (see https://freemarker.apache.org/committer-howto.html) but that does not work out-of-the-box with the `freemarker-generator` Maven project.


2.3 Tool Access in FreeMarker Model
-----------------------------------------------------------------------------

All tools are directly exposed which adds a lot of noise - might a map be a better idea?

```
tools["csv"]
tools["gson"]
```

instead of

```
${CSVTool}
${GsonTool}
```


2.4 DataSource Access in FreeMarker Model
-----------------------------------------------------------------------------

As discussed above a `DataSources` instance is exposed - maybe it should be called `dataSources` since it might be only a dumb list/map?



On 5 Jul 2020, at 18:38, Daniel Dekany wrote:

On Sun, Jul 5, 2020 at 4:13 PM Siegfried Goeschl <
siegfried.goes...@gmail.com> wrote:

Hi Daniel,

No problem with the XDocBook if you feel strongly about it :-) Will the
Maven site also be published somehow?

Regarding releasing "freemarker-generator"

* At the end of the day it is a command line tool people might use
sporadically - having said that the Maven plugin is a slightly different
story
* As far as I'm concerned my use-cases (data-centric processing & cloud
configuration stuff) are reasonably working
* I guess we can spend another 12 months discussing assumptions,
architectural decisions and implementation details (while appreciating your
knowledge and insights)


Let's not do that, but it's not what is happening either, as far as I see. There were like 3(ish) issues of the more fundamental nature (not details) that I brought up. Actually, re-raised them now after a few months, because I see no closure on your side. I mean, you agree but can't invest into it,
or you don't think these serve real demands, or... what is it?

I also said that as far as I'm concerned, we can do a release, if users are
properly informed about what we do not promise.


* As long as there is no release & real users out there those discussions
are mostly an intellectual exercise


In general, yes, but it depends on the concrete cases. For example,
data-file-seeded generation. The Maven plugin donators did that at work. FMPP users did both directions too. Those are real users. That we don't address that can have some valid reasons, but not having real/proven value
is not one of them. Or at least I guess you will agree with that.


* There is a danger that we never get a release out as it happened with
the Maven plugin


The maven plugin was abandoned by the donators right after the donation.
Nobody cared, or had the time they might have wanted to have, so nobody did
anything with it. (That's really the #1 issue with OS development, when
there's no company with paid developers behind it.)


I think the code base went a long way already so we should clearly define
what's ABSOLUTELY missing to get it out the door.


Sure, let's see what others find (I have already told my observations).


Please note that there are a couple of additional tasks actually to create
a user base

* No idea how to get the Apache release procedures working (signing,
staging, etc ..)


It's documented here, but we (or at least I) will help:
https://freemarker.apache.org/committer-howto.html
Whatever you will find missing from there, we should add.


* Need to look into Brew and Linux distributions


And Windows, if you have access... tests fail on that for example. (Some
charset or line ending mess... didn't investigate yet.)


* Getting some public awareness - blogs, presentations, ApacheCon anyone?


Tweet is where we have some people... and of course menu item on
freemarker.apache.org.

Thanks in advance,

Siegfried Goeschl


On 05.07.2020, at 15:18, Daniel Dekany <daniel.dek...@gmail.com> wrote:

Find answers inline.

On Sun, Jul 5, 2020 at 9:31 AM Siegfried Goeschl <
siegfried.goes...@gmail.com> wrote:

Hi Daniel,

thanks for your feedback ...

# 1. Site Generation

Maven is not so bad - I uploaded the generated site to
http://www.senilesoftwareengineer.org/freemarker-generator/ - I
personally have no problem using the FreeMarker documentation approach but
decided not to follow it up for the time being due to time constraints.


So if I convert the markdown to XDocBook, and set up site generation for
it, and thereafter you will edit XDocBook (I recommend XXE for that), will
that be OK with you?

# 2. Design/Architectural Doubts

## 2.1 Parsed Data

Currently we can load files/URLs as "DataSource" (which need to be
processed in FTL) or we can use "-m / --data-model" to parse and expose
the content directly (see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/data-models.html).


(Yeah, that's where I said that it looks like a quick ad-hoc solution,
and as such is quite limited.)

Regarding a less programmatic way - Named URIs are used (see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/named-uris.html)
to pass additional metadata but the implementation requires some re-work
to pass arbitrary metadata in the fragment.

E.g. what should be possible is

```
freemarker-cli -t some.ftl -m user=somepath/user.csv#delimiter=TAB&format=DEFAULT&charset=UTF-16&header=false
${CSVTool.parse(DataSources["user"])}
```

The idea is to be able to pass all metadata being used to parse the data
source - that would be also helpful for defining a SQL data source.


Yes. But I guess we should break out from the preconception that a data
source must be a URL, and also that therefore we have to encode all the
parsing and selection parameters into the anchor as name-value pairs (which
also implies that you might have % escapes). Because we sometimes need List
and Map<String, Object>-like data structures as well, etc. (Also, using the
anchor is kind of an abuse of the URL syntax, and &name=value is query
parameter syntax, not anchor syntax. So this is maybe confusing for users,
like they will keep writing ? instead of #.)

My idea is that a Data Source should be the name of a Data Loader (like, "csv", "excel", "jdbc", etc.), plus its parameters. To specify that, we should use FTL method call syntax, because that syntax the users have to learn anyway, and it's still relatively expressive (and can be re-used in
actual templates - see that later):
-s 'myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})'

If we have this (which has basically the same meaning as it has
currently):
-s  myRawTable=foo/bar.csv
that could be just a shorthand for:
-s myRawTable='raw("foo/bar.csv")'

So "raw" data is what you pass to "tools" for parsing in the current
version.
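
A rough Python sketch of how such an `-s` value could be interpreted, including the `raw(...)` shorthand. It is only an illustration under two assumptions: the function `parse_source_option` is made up for this example, and the arguments are parsed as JSON, which happens to work for the examples above but is not the full FTL literal syntax:

```python
import json
import re

# loaderName(arguments...), e.g. csv("foo/bar.csv", {"format": "EXCEL"})
_CALL = re.compile(r"([A-Za-z_]\w*)\((.*)\)", re.S)


def parse_source_option(option):
    """Turn 'name=loader(args...)' or 'name=path' into (name, loader, args)."""
    name, sep, expr = option.partition("=")
    if not sep:
        raise ValueError(f"expected 'name=...' but got: {option}")
    match = _CALL.fullmatch(expr.strip())
    if match is None:
        # A bare location is just a shorthand for raw("location").
        return name, "raw", [expr]
    loader, arg_text = match.groups()
    # Assumption: the argument list is valid JSON once wrapped in brackets.
    args = json.loads("[" + arg_text + "]")
    return name, loader, args


table = parse_source_option('myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})')
short = parse_source_option("myRawTable=foo/bar.csv")
explicit = parse_source_option('myRawTable=raw("foo/bar.csv")')
```

Unlike name-value pairs in a URL fragment, the parameters keep their structure (lists, maps, numbers) and need no % escapes.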

If you prefer to do the parsing in the templates, then you can reuse what you
learned about -s, as it's pretty much the same:
<#assign myTable = DataLoaders.csv(DataSources.myRawTable, { "format": "EXCEL", "delimiter": ";" })>

Also, if you decide to parse data inside the template, doing it will be
very similar as in the -s switch.


## Data versus Template Driven Generation

Currently a variety of transformations are supported on the command line
(see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/transformation.html)

* single template
* multiple templates
* directory of templates

Regarding "template versus data driven" generation - CLI does not
support data-driven generation where one output file is generated per
data source.


Yes. That's exactly what I feel is a major mistake in the long run. At
least we should be able to add it later, but I guess, to ensure that, we
pretty much have to implement it. So what will happen with this, what's
your standpoint?

Furthermore, I don't think that we should have a "mode" for this, which
applies for a whole run. I think I'm repeating myself from some months ago,
but, to recap, there are simply "seeds" that cause generating an output
file. That "seed" can be a template, or a data file, or part of loaded data
(like each row in a table is a "seed"). And I believe it's worth it to
allow mixing these in the same run, because then freemarker-generator will
be a good enough fit for more use-cases. So for example, you have some JSON
files in your project, each will generate a java file (typical source code
generation), but you also have templates for pom.xml, etc., that will
generate their own output.
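
To illustrate the "seed" idea, a small Python sketch that plans the outputs of one mixed run. Everything in it is hypothetical (the function, the file names, and the convention that a template seed drops its ".ftl" suffix); it only shows that template seeds and data seeds can end up in the same plan:

```python
from pathlib import PurePosixPath


def plan_outputs(template_seeds, data_seeds, data_template, data_ext):
    """Return (output, template, data) triples for a single run."""
    plan = []
    for template in template_seeds:
        # A template seed renders itself; strip ".ftl" to get the output name.
        output = template[:-4] if template.endswith(".ftl") else template
        plan.append((output, template, None))
    for data in data_seeds:
        # A data seed is rendered by a common template, one output per data file.
        output = str(PurePosixPath(data).with_suffix(data_ext))
        plan.append((output, data_template, data))
    return plan


plan = plan_outputs(
    template_seeds=["pom.xml.ftl"],
    data_seeds=["model/User.json", "model/Order.json"],
    data_template="java-class.ftl",
    data_ext=".java",
)
```

Row-level seeds (one output per table row) would fit the same shape, with the row instead of the file as the third element.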

## 2.2 Convergence of Maven Plugin & CLI

It is on my list but does not have high priority since it should not
affect Maven plugin usage (regarding backward compatibility).


But that would be a completely new plugin, I assume.

# 3. Maturity / Backward Compatibility

The project is not mature - we have no real users and little field
testing - in short I don't want to guarantee backward-compatibility for now because some of my decisions might turn out to be plain wrong (or
even stupid) :-)

According to https://semver.org we can release 0.x.y and follow
https://semver.org/#spec-item-4 - not so nice but pragmatic.



# 4. General Considerations

There is a danger we never get "freemarker-generator" out of the door ("perfect is the enemy of good") therefore the whole PMC should look
into the questions/answers

* "freemarker-generator" needs community usage to become really useful -
the unreleased stuff does not add any value

* Is it acceptable to release a 0.X.Z where we are not backward-compatible
until 1.0.0?


With the clear warning that we won't maintain this "branch" (a series of
backward compatible releases), etc. we can release. That, however,
discourages usage. And that's really the problem, not that we can't
release, because we can publish stuff in whatever development stage
(assuming users will be well informed). But if we are to hope for any
significant user base, I believe we should be able to promise some
stability, and maintenance of the BC branch. That's why I'm saying that we
should get the core architecture about right, and the project scope. (I'm
not talking about missing features that can be added later, nor about bugs
and other rough edges.)


* Are there any deal breakers which need to be fixed before we start with
a first public release?


As far as I'm concerned (but I'm just one of the PMC members), if you have
strong feelings for doing a release, with the proper warnings, sure, I will
collaborate. The deal breaker is more for the users, who get no good BC
promise or maintenance.

Thanks in advance,

Siegfried Goeschl

On 4 Jul 2020, at 1:10, Daniel Dekany wrote:

It should be in https://freemarker.apache.org/generator/. You will need to
commit/push into https://github.com/apache/freemarker-site/tree/asf-site
for that. (See
https://freemarker.apache.org/committer-howto.html#updating-homepage)

We need to generate the web pages somehow though. I saw you try to do that
with Maven "site", but personally, I can't imagine that it can be tweaked
to generate reasonable output. Not sure how others see this. Maven "site"
is basically the Maven model dumped into HTML pages, and "reports" like
even the log of the Rat run etc. Dozens of menus and sub-pages for details
users mostly don't care about. The actual content users do care about is
stuffed under "About", and from there it's not properly navigable or
searchable, as apparently it's assumed to be a single page. I know back
then you didn't want to use the tool that's used both for the main web site
and for the FreeMarker documentation, but I think that's the most
economical solution currently (and then we also have common styling, and
common place to fix whatever technical issues), so please reconsider that.
(I can help setting that up, of course.)

Now, some more difficult-to-address problematic things. My problem is that
these just weren't concluded. Discussion just died. To be concluded can
even mean that other PMC members say we should step over these, even if I
disagree. But these should be understood, and considered. So, these are:

  - Design/architectural doubts that are probably not realistic to address
    much later:
     - Currently, data sources are just URL-s (locations), and templates
       are meant to call tool API-s to parse them. As I said back then, one
       consequence is that then, you can't put parsed data into the
       data-model shared by all templates (i.e., via -m/--data-model).
       Because, you have no convenient/concise way to load the actual
       (parsed) data, instead you have to "program it" in FTL, because the
       assumption was that you only want to do that in templates anyway.
       Furthermore, as I saw it just now, --data-model actually supports
       some ad-hoc way of parsing the data pointed by a data-source-like
       URL, as JSON, YAML, maybe some more. Here's an example of that:
       `-m config=env:///DB_CONFIG#mimetype=application/json`. So there is
       a need actually. But compared to what you can do in templates, it's
       totally different, and of course very restrictive. I was also
       looking for a less "programmatic" way of loading data because even
       doing it in templates is not very friendly as it is. (For ultimate
       flexibility you might need program logic for sure, but certainly not
       just to grab a worksheet from an Excel file.) We also should have a
       POC for loading from a less file centric data source, like from a DB
       with a SQL query, to see if the concept works.
     - Not sure what's with template-driven VS data driven output
       generation. Data driven is when you generate one output file per
       some data unit (like per JSON file, or per DB record), and all is
       transformed by a common template. How would that look currently in
       freemarker-cli?
  - The Maven plugin provides very different functionality than the CLI.
    The original concept was that they are just two interfaces to the same
    product. That was also brought up, and I don't think it was addressed
    much, like, why you disagree, if you do, or what's going on. Anyway,
    regarding what we have now. The naming implies that, as both are
    "FreeMarker Generator", just one is "CLI", and the other is "Maven".
    But as far as I see, they are two different "products" really, focusing
    on different use cases. So, if it stays like this, then some of them
    have to be renamed at least. Or I don't know what others think we can
    do. (Then it's still somewhat weird to release them together.)

(I have also run through the CLI documentation, and found a lot of less
fundamental things to address, but I don't want to overload the thread.)

Also, if the project is released mostly as it is now, what will we promise
regarding backward compatibility? What do we communicate about the
matureness of the project?

On Wed, Jul 1, 2020 at 8:29 AM Siegfried Goeschl <
siegfried.goes...@gmail.com> wrote:

Hi folks,

I'm pretty much finished with the stuff I would like to have for the first
public release but there is still a lot of work ahead

* FREEMARKER-148 [freemarker-cli] Make usage of "DataSources" more "Freemarker" like
* FREEMARKER-150 [freemarker-cli] Setup "freemarker-generator" web site
* Setup the release process (Daniel, I guess I need some help here)
* An iteration of reviews and discussions

Some thoughts along these lines

* I would like to finish FREEMARKER-148 this week
* Daniel, any suggestion where to host the public website?

Thanks in advance,

Siegfried Goeschl



--
Best regards,
Daniel Dekany


