Hi folks,
I assembled the open discussions & TODOs - see below
Thanks in advance,
Siegfried Goeschl
1. Open Discussions
=============================================================================
1.1 Naming of CLI file
-----------------------------------------------------------------------------
It is currently named "freemarker-cli" but it was suggested to call it
"freemarker-generator" - I'm happy with "freemarker-cli" since it
reflects the Maven project layout
1.2 Complex Transformations using the CLI
-----------------------------------------------------------------------------
That's a large topic (seed templates, seed datasources, shared
datasources, aggregation versus generation) - currently low on my
priority list because
* Transforming multiple templates and template directories is supported
* Personally I would not use a command-line tool for source code
generation but use a plugin (to avoid extra dependencies to be
installed)
Questions being discussed are
* Is this an important feature?
* Can it be implemented later (if required) without breaking stuff?
1.3 Site Generation
-----------------------------------------------------------------------------
Currently Maven & Markdown are used but it was suggested to migrate to
XDocBook to match the existing FreeMarker documentation.
* Daniel is having a look at it
1.4 Data Loaders
-----------------------------------------------------------------------------
Currently it is possible to materialize certain file types directly in
the model using the "-m" or "--data-model" parameter, which parses the
given content based on the content type
```
freemarker-cli --data-model post=https://jsonplaceholder.typicode.com/posts/2 \
  -i 'post title is: ${post.title}'; echo
```
It was suggested to change to something like
```
-s 'post=json("https://jsonplaceholder.typicode.com/posts/2")'
```
which adds more flexibility
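A minimal sketch of what "parses the given content based on the content
type" could look like, assuming a plain lookup from MIME type to parser.
The `PARSERS` table and the `materialize` helper are hypothetical names
for illustration only and are not part of freemarker-cli:

```python
import csv
import io
import json

# Hypothetical registry: MIME type -> function turning raw text into a model value.
PARSERS = {
    "application/json": json.loads,
    "text/csv": lambda text: list(csv.DictReader(io.StringIO(text))),
}


def materialize(content, mimetype):
    """Parse raw content into a data-model value, failing for unknown types."""
    try:
        parser = PARSERS[mimetype]
    except KeyError:
        raise ValueError(f"no parser registered for {mimetype!r}") from None
    return parser(content)


# Made-up sample payload, standing in for whatever the URL returns.
post = materialize('{"title": "example"}', "application/json")
print(post["title"])
```

A fixed table like this is exactly the limitation being discussed: every
new format or parsing option needs another entry.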
1.5 Named URIs
-----------------------------------------------------------------------------
Currently Named URIs are supported to provide a name and additional
metadata (mimetype, encoding) - the current implementation is not
complete but it should look like
```
-s 'myTable=foo/bar.csv#format=EXCEL&delimiter=TAB'
```
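To make the fragment-based form concrete, here is a rough sketch of how
such a Named URI could be split into name, location and parameters. The
grammar (an optional identifier followed by `=`) and the function name
`parse_named_uri` are assumptions for illustration, not the actual
implementation:

```python
import re
from urllib.parse import parse_qsl

# Assumed grammar: optional "name=" prefix, then the URI itself.
NAMED_URI = re.compile(r"^(?:(?P<name>[A-Za-z_]\w*)=)?(?P<uri>.+)$")


def parse_named_uri(spec):
    """Split 'name=location#key=value&key=value' into its three parts."""
    match = NAMED_URI.match(spec)
    if match is None:
        raise ValueError(f"not a named URI: {spec!r}")
    # Everything after the first '#' is treated as the parameter list.
    location, _, fragment = match["uri"].partition("#")
    return match["name"], location, dict(parse_qsl(fragment))


name, location, params = parse_named_uri(
    "myTable=foo/bar.csv#format=EXCEL&delimiter=TAB"
)
print(name, location, params)
```

Note that all values come out as strings, so lists or nested maps cannot
be expressed this way without inventing further conventions.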
Current suggestion is to switch to something like
```
-s 'myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})'
```
1.6 Exposing DataSources
-----------------------------------------------------------------------------
Currently a class `DataSources` is used which keeps a list of data
sources and allows lookup by name (mixture of list and map) - we have
ongoing discussions
* Does a DataSource have a name when not directly specified by the user?
* Shall we enforce unique data source names?
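One possible answer to both questions, sketched in Python rather than
Java for brevity: unnamed data sources stay reachable by index only, and
explicitly given names must be unique. This is an illustration of the
"mixture of list and map" idea under those assumptions, not the existing
`DataSources` class:

```python
class DataSources:
    """Ordered collection of data sources with optional, unique names."""

    def __init__(self):
        self._items = []    # every data source, in the order it was added
        self._by_name = {}  # only the explicitly named ones

    def add(self, uri, name=None):
        if name is not None:
            if name in self._by_name:
                raise ValueError(f"duplicate data source name: {name!r}")
            self._by_name[name] = uri
        self._items.append(uri)

    def __getitem__(self, key):
        # Integer keys behave like a list, string keys like a map.
        if isinstance(key, int):
            return self._items[key]
        return self._by_name[key]

    def __len__(self):
        return len(self._items)


sources = DataSources()
sources.add("foo/bar.csv", name="user")
sources.add("foo/anonymous.json")  # no name: index access only
print(sources[0], sources["user"], len(sources))
```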
1.7 Making Usage More FreeMarker Friendly
-----------------------------------------------------------------------------
In FREEMARKER-148 I wrapped the `DataSources` into a `BeanModel`
allowing better FTL support, e.g.
* `DataSources[0]` instead of `DataSources.get(0)`
* `DataSources["user"]` instead of `DataSources.get("user")`
Care must be taken regarding name collisions of data sources and exposed
methods of `DataSources`
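Since FTL resolves `x.name` and `x["name"]` the same way, a data source
called e.g. `get` would compete with a method of the same name. A simple
guard could reject such names up front. The method names below are
assumed for illustration; the real set exposed by the wrapper may differ:

```python
# Hypothetical method names exposed by the wrapper; the real set may differ.
EXPOSED_METHODS = frozenset({"get", "find", "size", "isEmpty"})


def check_data_source_name(name):
    """Return the name unchanged, or fail if it shadows an exposed method."""
    if name in EXPOSED_METHODS:
        raise ValueError(f"data source name {name!r} collides with a method")
    return name


print(check_data_source_name("user"))
```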
1.8 Overlap Of CLI And Maven Plugin
-----------------------------------------------------------------------------
Both artifacts reside in the same Maven build but currently the `Maven
Plugin` is completely independent
* This might change in the future but I spend little time on the Maven
plugin
1.9 Support Of Backward Compatibility
-----------------------------------------------------------------------------
I lean towards a 0.X.Y release without guaranteeing backward
compatibility - little field testing and no real users.
1.10 Support Of Groups
-----------------------------------------------------------------------------
Named URIs allow assigning an optional group which can be used to
look up data sources
```
-data-source 'myTable:data=foo/bar.csv'
```
* I implemented it since I thought it would be a nice feature but I'm
not so sure about it any longer :-)
* It was suggested to swap name and group, e.g.
`-data-source 'data:myTable=foo/bar.csv'`
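The two orderings can be compared with a small sketch. The helper names
and the `name_first` switch are made up for illustration; only the
`name:group=uri` and `group:name=uri` shapes come from the discussion:

```python
from collections import defaultdict


def parse_grouped_source(spec, name_first=True):
    """Split 'name:group=uri' (or 'group:name=uri') into (name, group, uri)."""
    key, sep, uri = spec.partition("=")
    if not sep or not key:
        raise ValueError(f"expected '<name>[:<group>]=<uri>', got {spec!r}")
    first, colon, second = key.partition(":")
    if not colon:
        return first, None, uri  # a name only, no group assigned
    name, group = (first, second) if name_first else (second, first)
    return name, group, uri


def index_by_group(specs, name_first=True):
    """Collect (name, uri) pairs per group to support lookup by group."""
    groups = defaultdict(list)
    for spec in specs:
        name, group, uri = parse_grouped_source(spec, name_first)
        groups[group].append((name, uri))
    return dict(groups)


print(parse_grouped_source("myTable:data=foo/bar.csv"))
print(parse_grouped_source("data:myTable=foo/bar.csv", name_first=False))
```

Both calls yield the same triple, so the swap only affects what users
type, not what can be expressed.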
2. TODOs
=============================================================================
2.1 Discuss And Prioritise The Open Topics
-----------------------------------------------------------------------------
Discuss and define what needs to be done for the first public release.
2.2 Get The Release Process Going
-----------------------------------------------------------------------------
The FreeMarker project uses `Ant` for the release process (see
https://freemarker.apache.org/committer-howto.html) but that does not
work out-of-the-box with the `freemarker-generator` Maven project.
2.3 Tool Access in FreeMarker Model
-----------------------------------------------------------------------------
All tools are directly exposed which adds a lot of noise - might a
map be a better idea?
```
tools["csv"]
tools["gson"]
```
instead of
```
${CSVTool}
${GsonTool}
```
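The difference between the two layouts, sketched as plain dictionaries
standing in for the FreeMarker data model. The key derivation (strip a
`Tool` suffix, lowercase the rest) is an assumption chosen to reproduce
the `csv` and `gson` keys above:

```python
def flat_model(tools):
    """Every tool becomes its own top-level variable, e.g. ${CSVTool}."""
    return dict(tools)


def nested_model(tools):
    """All tools live under a single 'tools' entry, e.g. tools["csv"]."""

    def short_key(tool_name):
        # Assumed naming rule: "CSVTool" -> "csv", "GsonTool" -> "gson".
        suffix = "Tool"
        base = tool_name[: -len(suffix)] if tool_name.endswith(suffix) else tool_name
        return base.lower()

    return {"tools": {short_key(name): tool for name, tool in tools.items()}}


tools = {"CSVTool": object(), "GsonTool": object()}
print(sorted(flat_model(tools)))             # one top-level name per tool
print(sorted(nested_model(tools)["tools"]))  # one top-level name in total
```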
2.4 DataSource Access in FreeMarker Model
-----------------------------------------------------------------------------
As discussed above a `DataSources` instance is exposed - maybe it should
be called `dataSources` since it might be only a dumb list/map?
On 5 Jul 2020, at 18:38, Daniel Dekany wrote:
On Sun, Jul 5, 2020 at 4:13 PM Siegfried Goeschl <
siegfried.goes...@gmail.com> wrote:
Hi Daniel,
No problem with the XDocBook if you feel strongly about it :-) Will the
Maven site also be published somehow?
Regarding releasing "freemarker-generator"
* At the end of the day it is a command line tool people might use
sporadically - having said that the Maven plugin is a slightly different
story
* As far as I'm concerned my use-cases (data-centric processing & cloud
configuration stuff) are reasonably working
* I guess we can spend another 12 months discussing assumptions,
architectural decisions and implementation details (while appreciating
your knowledge and insights)
Let's not do that, but it's not what is happening either, as far as I see.
There were like 3(ish) issues of the more fundamental nature (not details)
that I brought up. Actually, re-raised them now after a few months, because
I see no closure on your side. I mean, you agree but can't invest into it,
or you don't think these serve real demands, or... what is it?
I also said that as far as I'm concerned, we can do a release, if users are
properly informed about what we do not promise.
* As long as there is no release & real users out there those discussions
are mostly an intellectual exercise
In general, yes, but it depends on the concrete cases. For example,
data-file-seeded generation. The Maven plugin donators did that at work.
FMPP users did both directions too. Those are real users. That we don't
address that can have some valid reasons, but not having real/proven value
is not one of them. Or at least I guess you will agree with that.
* There is a danger that we never get a release out as it happened with
the Maven plugin
The Maven plugin was abandoned by the donators right after the donation.
Nobody cared, or had the time they may have wanted to have, so nobody did
anything with it. (That's really the #1 issue with OS development, when
there's no company with paid developers behind it.)
I think the code base went a long way already so we should clearly define
what's ABSOLUTELY missing to get it out the door.
Sure, let's see what others find (I have already told my observations).
Please note that there are a couple of additional tasks actually to create
a user base
* No idea how to get the Apache release procedures working (signing,
staging, etc ..)
It's documented here, but we (or at least I) will help:
https://freemarker.apache.org/committer-howto.html
Whatever you will find missing from there, we should add.
* Need to look into Brew and Linux distributions
And Windows, if you have access... tests fail on that for example. (Some
charset or line ending mess... didn't investigate yet.)
* Getting some public awareness - blogs, presentations, ApacheCon anyone?
Tweet is where we have some people... and of course menu item on
freemarker.apache.org.
Thanks in advance,
Siegfried Goeschl
On 05.07.2020, at 15:18, Daniel Dekany <daniel.dek...@gmail.com>
wrote:
Find answers inline.
On Sun, Jul 5, 2020 at 9:31 AM Siegfried Goeschl <
siegfried.goes...@gmail.com> wrote:
Hi Daniel,
thanks for your feedback ...
# 1. Site Generation
Maven is not so bad - I uploaded the generated site to
http://www.senilesoftwareengineer.org/freemarker-generator/ - I
personally have no problem using the FreeMarker documentation approach but
decided not to follow it up for the time being due to time constraints.
So if I convert the markdown to XDocBook, and set up site generation for
it, and thereafter you will edit XDocBook (I recommend XXE for that), will
that be OK with you?
# 2. Design/Architectural Doubts
## 2.1 Parsed Data
Currently we can load files/URLs as "DataSource" (which need to be
processed in FTL) or we can use "-m / --data-model" to parse and expose
the content directly (see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/data-models.html
).
(Yeah, that's where I said that it looks like a quick ad-hoc solution, and
as such is quite limited.)
Regarding a less programmatic way - Named URIs are used (see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/named-uris.html
)
to pass additional metadata but the implementation requires some re-work
to pass arbitrary metadata in the fragment.
E.g. what should be possible is
```
freemarker-cli -t some.ftl -m user=somepath/user.csv#delimiter=TAB&format=DEFAULT&charset=UTF-16&header=false
${CSVTool.parse(DataSources["user"])}
```
The idea is to be able to pass all metadata being used to parse the data
source - that would be also helpful for defining a SQL data source.
Yes. But I guess we should break out from the preconception that a data
source must be an URL, and also that therefore we have to encode all the
parsing and selection parameters into the anchor as name-value pairs (which
also implies that you might have % escapes). Because we sometimes need List
and Map<String, Object>-like data structures as well, etc. (Also, using the
anchor is kind of an abuse of the URL syntax, and &name=value is query
parameter syntax, not anchor syntax. So this is maybe confusing for users,
like they will keep writing ? instead of #.)
My idea is that a Data Source should be the name of a Data Loader (like,
"csv", "excel", "jdbc", etc.), plus its parameters. To specify that, we
should use FTL method call syntax, because that syntax the users have to
learn anyway, and it's still relatively expressive (and can be re-used in
actual templates - see that later):
-s 'myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})'
If we have this (which has basically the same meaning as it has
currently):
-s myRawTable=foo/bar.csv
that could be just a shorthand for:
-s myRawTable='raw("foo/bar.csv")'
So "raw" data is what you pass to "tools" for parsing in the current
version.
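A rough sketch of how such `-s` values could be interpreted, including
the `raw` shorthand. It uses a JSON parser as a stand-in for a real FTL
expression parser, which only works because the example arguments happen
to be valid JSON; the function and pattern names are illustrative
assumptions:

```python
import json
import re

# Matches 'loader(args...)'; anything else is treated as a bare location.
LOADER_CALL = re.compile(r"^(?P<loader>\w+)\((?P<args>.*)\)$", re.DOTALL)


def parse_data_source(spec):
    """Turn 'name=loader("location", {options})' into its four parts."""
    name, sep, expr = spec.partition("=")
    if not sep:
        raise ValueError(f"expected '<name>=<expression>', got {spec!r}")
    call = LOADER_CALL.match(expr)
    if call is None:
        # Shorthand: 'name=foo/bar.csv' means 'name=raw("foo/bar.csv")'.
        return name, "raw", expr, {}
    # Stand-in for FTL parsing: read the argument list as a JSON array.
    args = json.loads("[" + call["args"] + "]")
    if not args or not isinstance(args[0], str):
        raise ValueError(f"first argument must be a location string: {spec!r}")
    options = args[1] if len(args) > 1 else {}
    return name, call["loader"], args[0], options


print(parse_data_source('myTable=csv("foo/bar.csv", {"format": "EXCEL", "delimiter": ";"})'))
print(parse_data_source("myRawTable=foo/bar.csv"))
```

Unlike name-value pairs in a URL fragment, the options here keep their
structure, so lists and nested maps come through unchanged.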
If you prefer to do parsing in the templates, then you can reuse what you
learned about -s, as it's pretty much the same:
<#assign myTable = DataLoaders.csv(DataSources.myRawTable, { "format": "EXCEL", "delimiter": ";" })>
Also, if you decide to parse data inside the template, doing it will be
very similar to the -s switch.
## Data versus Template Driven Generation
Currently a variety of transformations are supported on the command line
(see
http://www.senilesoftwareengineer.org/freemarker-generator/freemarker-generator-cli/cli/concepts/transformation.html
)
* single template
* multiple templates
* directory of templates
Regarding "template versus data driven" generation - CLI does not
support data-driven generation where one output file is generated per
data source.
Yes. That's exactly what I feel is a major mistake on the long run. At
least we should be able to add it later, but I guess, to ensure that, we
pretty much have to implement it. So what will happen with this, what's
your standpoint?
Furthermore, I don't think that we should have a "mode" for this, which
applies for a whole run. I think I'm repeating myself from some months ago,
but, to recap, there are simply "seeds" that cause generating an output
file. That "seed" can be a template, or a data file, or part of loaded data
(like each row in a table is a "seed"). And I believe it's worth it to
allow mixing these in the same run, because then freemarker-generator will
be a good enough fit for more use-cases. So for example, you have some JSON
files in your project, each will generate a java file (typical source code
generation), but you also have templates for pom.xml, etc., that will
generate their own output.
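The "seed" idea can be sketched as a planning step that turns mixed
seeds into a flat list of render jobs for a single run. All names here
(`Job`, the `*_jobs` helpers, the file names and the output naming
rules) are hypothetical and only illustrate the concept:

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Any


@dataclass
class Job:
    template: str  # template used for rendering
    data: Any      # seed data handed to the template, if any
    output: str    # output file to write


def template_jobs(templates):
    """Template seeds: each template yields one output (drop '.ftl')."""
    for template in templates:
        yield Job(template, None, str(PurePosixPath(template).with_suffix("")))


def data_file_jobs(template, data_files, extension):
    """Data-file seeds: one output per data file, all via a common template."""
    for data_file in data_files:
        yield Job(template, data_file, PurePosixPath(data_file).stem + extension)


def row_jobs(template, rows, key, extension):
    """Row seeds: one output per row of already loaded data."""
    for row in rows:
        yield Job(template, row, f"{row[key]}{extension}")


# All three kinds of seeds mixed in the same run.
jobs = [
    *template_jobs(["pom.xml.ftl"]),
    *data_file_jobs("entity.java.ftl", ["model/User.json", "model/Order.json"], ".java"),
    *row_jobs("page.html.ftl", [{"slug": "about"}], "slug", ".html"),
]
for job in jobs:
    print(job.template, "->", job.output)
```

With this shape there is no run-wide "mode": a run is simply the
concatenation of whatever jobs the seeds produce.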
## 2.2 Convergence of Maven Plugin & CLI
It is on my list but does not have high priority since it should not
affect Maven plugin usage (regarding backward compatibility).
But that would be a completely new plugin, I assume.
# 3. Maturity / Backward Compatibility
The project is not mature - we have no real users and little field
testing - in short I don't want to guarantee backward-compatibility for
now because some of my decisions might turn out to be plain wrong (or
even stupid) :-)
According to https://semver.org we can release 0.x.y and follow
https://semver.org/#spec-item-4 - not so nice but pragmatic.
# 4. General Considerations
There is a danger we never get "freemarker-generator" out of the door
("perfect is the enemy of good") therefore the whole PMC should look
into the questions/answers
* "freemarker-generator" needs community usage to become really useful -
the unreleased stuff does not add any value
* Is it acceptable to release a 0.X.Z where we are not backward-compatible
until 1.0.0?
With the clear warning that we won't maintain this "branch" (a series of
backward compatible releases), etc. we can release. That, however,
discourages usage. And that's really the problem, not that we can't
release, because we can publish stuff in whatever development stage
(assuming users will be well informed). But if we are to hope for any
significant user base, I believe we should be able to promise some
stability, and maintenance of the BC branch. That's why I'm saying that we
should get the core architecture about right, and the project scope. (I'm
not talking about missing features that can be added later, nor about bugs
and other rough edges.)
* Are there any deal breakers which need to be fixed before we start with
a first public release?
As far as I'm concerned (but I'm just one of the PMC members), if you have
strong feelings for doing a release, with the proper warnings, sure, I will
collaborate. The deal breaker is more for the users, who get no good BC
promise or maintenance.
Thanks in advance,
Siegfried Goeschl
On 4 Jul 2020, at 1:10, Daniel Dekany wrote:
It should be in https://freemarker.apache.org/generator/. You will need to
commit/push into https://github.com/apache/freemarker-site/tree/asf-site
for that. (See
https://freemarker.apache.org/committer-howto.html#updating-homepage)
We need to generate the web pages somehow though. I saw you try to do that
with Maven "site", but personally, I can't imagine that it can be tweaked
to generate reasonable output. Not sure how others see this. Maven "site"
is basically the Maven model dumped into HTML pages, and "reports" like
even the log of the Rat run etc. Dozens of menus and sub-pages for details
users mostly don't care about. The actual content users do care about is
stuffed under "About", and from there it's not properly navigable or
searchable, as apparently it's assumed to be a single page. I know back
then you didn't want to use the tool that's used both for the main web site
and for the FreeMarker documentation, but I think that's the most
economical solution currently (and then we also have common styling, and
common place to fix whatever technical issues), so please reconsider that.
(I can help setting that up, of course.)
Now, some more difficult-to-address problematic things. My problem is that
these just weren't concluded. Discussion just died. To be concluded can
even mean that other PMC members say we should step over these, even if I
disagree. But these should be understood, and considered. So, these are:
- Design/architectural doubts that are probably not realistic to address
  much later:
  - Currently, data sources are just URL-s (locations), and templates
    are meant to call tool API-s to parse them. As I said back then, one
    consequence is that then, you can't put parsed data into the data-model
    shared by all templates (i.e., via -m/--data-model). Because, you have
    no convenient/concise way to load the actual (parsed) data, instead you
    have to "program it" in FTL, because the assumption was that you only
    want to do that in templates anyway. Furthermore, as I saw it just now,
    --data-model actually supports some ad-hoc way of parsing the data
    pointed by a data-source-like URL, as JSON, YAML, maybe some more.
    Here's an example of that:
    `-m config=env:///DB_CONFIG#mimetype=application/json`. So there is a
    need actually. But compared to what you can do in templates, it's
    totally different, and of course very restrictive. I was also looking
    for a less "programmatic" way of loading data because even doing it in
    templates is not very friendly as it is. (For ultimate flexibility you
    might need program logic for sure, but certainly not just to grab a
    worksheet from an Excel file.) We also should have a POC for loading
    from a less file centric data source, like from a DB with a SQL query,
    to see if the concept works.
  - Not sure what's with template-driven VS data driven output
    generation. Data driven is when you generate one output file per some
    data unit (like per JSON file, or per DB record), and all is
    transformed by a common template. How would that look currently in
    freemarker-cli?
- The Maven plugin provides very different functionality than the CLI.
  The original concept was that they are just two interfaces to the same
  product. That was also brought up, and I don't think it was addressed
  much, like, why you disagree, if you do, or what's going on. Anyway,
  regarding what we have now. The naming implies that, as both are
  "FreeMarker Generator", just one is "CLI", and the other is "Maven". But
  as far as I see, they are two different "products" really, focusing on
  different use cases. So, if it stays like this, then some of them have to
  be renamed at least. Or I don't know what others think we can do. (Then
  it's still somewhat weird to release them together.)
(I have also run through the CLI documentation, and found a lot of less
fundamental things to address, but I don't want to overload the thread.)
Also, if the project is released mostly as it is now, what will we promise
regarding backward compatibility? What do we communicate about the
maturity of the project?
On Wed, Jul 1, 2020 at 8:29 AM Siegfried Goeschl <
siegfried.goes...@gmail.com> wrote:
Hi folks,
I'm pretty much finished with the stuff I would like to have for the first
public release but there is still a lot of work ahead
* FREEMARKER-148 [freemarker-cli] Make usage of "DataSources" more
"Freemarker" like
* FREEMARKER-150 [freemarker-cli] Setup "freemarker-generator" web site
* Setup the release process (Daniel, I guess I need some help here)
* An iteration of reviews and discussions
Some thoughts along the lines
* I would like to finish FREEMARKER-148 this week
* Daniel, any suggestion where to host the public website?
Thanks in advance,
Siegfried Goeschl
--
Best regards,
Daniel Dekany