Hi Rob,
New code is up on GitHub. This update brings a more Ruby-esque
(idiomatic) syntax to the DSL, as well as better coverage of Text
documents, Presentations, Lists, and Tables. I have also added
documentation to about 80% of the code.
https://github.com/noah/odf-command-line-tools
Remaining todos: 1) work out additional docs and any needed function
stubs 2) cross-platform installer. I think you will agree that the
code is easily extendable as it stands.
Paradoxically, bundling the JRuby and Java jars together has proven to be
the hard part, so I will focus on making it cross-platform in the next
week.
Please take a look and let me know your thoughts.
Thanks!
-Noah
On Tue, Jul 10, 2012 at 8:58 AM, Noah Tilton <[email protected]> wrote:
> Hi Rob,
>
>
> On Fri, Jul 6, 2012 at 8:17 AM, Rob Weir <[email protected]> wrote:
>> On Sun, Jul 1, 2012 at 5:30 PM, Noah Tilton <[email protected]> wrote:
>>> Hi Rob,
>>>
>>> I just pushed to Github, please see below for an explanation of the
>>> changes and some questions regarding next steps.
>>>
>>> https://github.com/noah/odf-command-line-tools
>>>
>>
>> Great. I downloaded and gave it a try. Looks good.
>>
>>> On Sun, Jun 3, 2012 at 12:47 PM, Rob Weir <[email protected]> wrote:
>>>>
>>>> OK. Thanks for trying. Maven is great for managing Java project
>>>> dependencies. Well, at least for pure Java projects targeting
>>>> standard Java outputs like JAR's, WAR's, EAR's, etc. But the
>>>> Java/JRuby combination may be more complicated.
>>>>
>>>> I'm using Ubuntu 11.04 with bash. It isn't liking parts of that
>>>> script. But I was able to modify it as follows and it worked fine:
>>>>
>>>> #!/bin/sh
>>>>
>>>> if [ ! -d ./jars ]; then
>>>>
>>>> echo "Downloading jars"
>>>> mkdir -p ./jars
>>>> cd jars
>>>>
>>>> wget \
>>>> http://mirrors.gigenet.com/apache//xerces/j/binaries/Xerces-J-bin.2.11.0.tar.gz \
>>>> http://apache.osuosl.org/incubator/odftoolkit/binaries/odftoolkit-0.5-incubating-bin.tar.gz
>>>>
>>>> for targz in *.tar.gz; do
>>>> echo "Extracting $targz"
>>>> tar zxf $targz
>>>> done
>>>>
>>>> cd ..
>>>> fi
>>>>
>>>> echo "Running test"
>>>>
>>>> JAVA_HOME="$(dirname $(dirname $(readlink -f $(which java))))"
>>>> echo "set JAVA_HOME to $JAVA_HOME"
>>>> CLASSPATH=./jars/xerces-2_11_0/xercesImpl.jar:./jars/xerces-2_11_0/xml-apis.jar:./jars/odftoolkit-0.5-incubating/simple-odf-0.7-incubating.jar:./jars/odftoolkit-0.5-incubating/odfdom-java-0.8.8-incubating.jar
>>>> export CLASSPATH
>>>>
>>>> jruby lib/oclt.rb
>>>
>>> I incorporated your portability changes to test.sh, and renamed it to
>>> setup.sh, to reflect the fact that it doesn't actually run the code --
>>> it merely downloads jars and sets environment variables. To run the
>>> code from the odf-command-line-tools directory, type:
>>>
>>> source ./setup.sh && jruby main.rb
>>>
>>> "source" is necessary because the script needs to export shell
>>> variables in the parent shell (i.e., not a subshell).
>>>
>>> If you want to use rdoc to build the documentation you may need to run:
>>>
>>> % jruby -S gem install rdoc
>>>
>>> before running
>>>
>>> % rdoc --main *
>>>
>>> from the top level directory (odf-command-line-tools).
>>>
>>>> Two things: build env and runtime env. For runtime env we should be
>>>> cross platform, right? For bulld env, cross platform is ideal, but I
>>>> would not get bogged down on that. Linux is fine.
>>>
>>> Yes, Linux for build only; runtime will be cross-platform. Added to TODO.
>>>
>>>> I'm looking forward to seeing more on the DSL.
>>>
>>> The initial version of the DSL has support for generating new Text
>>> documents; loading and modifying existing Text documents, iterating over
>>> Paragraphs, changing mode and font attributes of new Paragraphs ...
>>> This first version should give you a flavor of what the DSL might
>>> eventually look like, and I hope it is general enough that we can
>>> easily add other document types (e.g., Presentation).
>>>
>>> It's made up of 2 parts: the DSL proper (lib/oclt.rb) and a client
>>> script (main.rb).
>>>
>>> Assuming you're okay with what I've done so far, I'd like the
>>> discussion to focus on how the mapping of DSL methods => SimpleAPI
>>> methods should look. A 1-1 mapping works "out of the box" with the
>>> way the DSL is currently written. I.e., any SimpleAPI TextDocument
>>> method can now be called from the DSL, and will work.
>>>
>>
>> I noticed that. I had not taken a close look at Ruby before, but I'm
>> impressed by how that kind of dispatch can be set up with only a
>> little code.
>
> I like it too!
>
>>> So if we don't want to change the interface/API, all that's left to do
>>> is write documentation and more examples. But that won't provide much
>>> of an improvement over the original Java API, only those benefits
>>> which Ruby provides at the most basic level before any "sugar" is
>>> applied. In other words, while a 1-1 mapping works, it leads to
>>> unnecessary initialization, overall verbosity, and (IMHO) is exactly
>>> what a good DSL should seek to avoid!
>>>
>>
>> Right. So is your goal to make this more "idiomatic" Ruby? Or to
>> create a DSL that is neither Ruby nor Simple API, but something more
>> oriented to text processing?
>
> Yes, idiomatic Ruby is the way to go. I'd go one step further and say
> "idiomatic Ruby DSL-esque". There are many good examples, Rake and
> Buildr come to mind. This still leaves us a lot of freedom to tweak.
>
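To make "idiomatic Ruby DSL-esque" concrete, here is a rough sketch of the block style that Rake and Buildr use. Everything in it is an assumption for illustration: FakeTextDocument is a stand-in that only records calls, and DocumentBuilder and text_document are hypothetical names, not what lib/oclt.rb currently defines.

```ruby
# Hypothetical stand-in for a Simple API TextDocument; it only records calls.
class FakeTextDocument
  attr_reader :log

  def initialize
    @log = []
  end

  def add_paragraph(text)
    @log << [:paragraph, text]
  end

  def add_page_break
    @log << [:page_break]
  end
end

# Hypothetical builder: the block is evaluated with these keywords in scope.
class DocumentBuilder
  def initialize(doc)
    @doc = doc
  end

  def paragraph(text)
    @doc.add_paragraph(text)
  end

  def page_break
    @doc.add_page_break
  end
end

# Entry point: text_document { ... } returns the populated document.
def text_document(&block)
  doc = FakeTextDocument.new
  DocumentBuilder.new(doc).instance_eval(&block)
  doc
end

doc = text_document do
  paragraph "Hello"
  page_break
  paragraph "World"
end
p doc.log
```

The instance_eval call is what lets the block use bare keywords without a receiver; the cost is that self changes inside the block, which can surprise users.
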
>> When we were designing the Simple API we tried to make it as easy and
>> intuitive as possible, within the constraints of Java, and using an
>> imperative programming style.
>>
>> One idea I had last night -- what if we inverted the problem? Instead
>> of a command line script running against the document, what if we put
>> the script *inside* the document? Does that help with any kinds of
>> repetitive tasks, document automation, etc.? Now the mere location of
>> the script doesn't matter. But a side benefit is that a script inside
>> the document, inline in the text of a document, has a context that the
>> user defines naturally via their word processor:
>>
>> "This confirms your order for %widget_count% widget(s)."
>>
>> This is then processed by a command line tool that evaluates
>> widget_count as a Ruby routine. Or, if we want to avoid the user writing
>> code, what if we ultra-simplified it? For example, a document
>> template that had some named styles that expressed both
>> appearance/presentation as well as behaviour. Trivial example: a
>> list style called "Sorted List". No word processor has this, to my
>> knowledge. But we could have that defined as a style. When the user
>> applies it, nothing magic occurs. But later, on the command line, the
>> document is processed by an app that applies the behaviours implied by
>> the styles, and writes out a new document. One could do very simple,
>> single-step operations that way. But is there a way to build up more
>> complicated scenarios like this? In other words, what helps enable
>> the power-user, scripter type to get some benefit of document
>> automation?
>
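A minimal sketch of the "Sorted List" idea, assuming the document has already been reduced to an in-memory list of styled lists. StyledList, BEHAVIOURS and apply_behaviours are hypothetical names; reading the style names out of a real ODF file is not shown.

```ruby
# Hypothetical in-memory model: each list carries the style name the user applied.
StyledList = Struct.new(:style, :items)

# Hypothetical registry mapping a style name to the behaviour it implies.
BEHAVIOURS = {
  "Sorted List" => ->(list) { StyledList.new(list.style, list.items.sort) }
}.freeze

# Return a new array of lists with every known behaviour applied.
# Lists whose style has no registered behaviour pass through unchanged.
def apply_behaviours(lists)
  lists.map do |list|
    behaviour = BEHAVIOURS[list.style]
    behaviour ? behaviour.call(list) : list
  end
end
```

More complicated scenarios could perhaps be built by registering more styles, since each behaviour is just a function from one list to another.
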
> I think this idea could work. And it's consistent with the codebase
> we already have. We could define some tags and a parser that would
> call out to the JRuby routines in the lib/oclt.rb file (that's just a
> sketch, but I think it works). I can see a couple of problems with
> this approach, however. First is tight coupling -- do we really want
> the business logic inline with an XML document describing the
> structure? Second is, how does something like this jibe with the ODF
> standard? It seems like embedding a bunch of tags into the document
> might mess that up.
>
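Here is a rough sketch of the tag-expansion step, working on a plain string rather than a real ODF document. TemplateRoutines, PLACEHOLDER and expand are hypothetical names, and the %name% syntax is taken from the widget example above. Only routines that the module explicitly defines can be called, which limits what a document can trigger.

```ruby
# Hypothetical routines a template may call.
module TemplateRoutines
  def self.widget_count
    3
  end
end

# A placeholder is a word wrapped in percent signs, e.g. %widget_count%.
PLACEHOLDER = /%\w+%/

def expand(text, routines = TemplateRoutines)
  text.gsub(PLACEHOLDER) do |match|
    name = match[1..-2].to_sym
    # Call only routines the module defines itself; leave anything else as-is.
    if routines.singleton_methods.include?(name)
      routines.public_send(name).to_s
    else
      match
    end
  end
end

puts expand("This confirms your order for %widget_count% widget(s).")
# prints "This confirms your order for 3 widget(s)."
```
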
>>> As a next step, I propose creating a more concise set of keywords for
>>> the TextDocument API, and working from there. I'm hoping we can
>>> iterate very fast now that there is a working version. If we can come
>>> up with (or better yet, find) some conventions for how to map a
>>> standard API to a DSL, I can use that and hit the list when I have
>>> specific questions.
>>>
>>> For an example of what I'm talking about, do we really need to say
>>> "set_horizontal_alignment(HorizontalAlignmentType)" or will "center"
>>> suffice? Should that really be done instead as a parameter in
>>> add_paragraph()? (e.g., doc.add_paragraph(:alignment => :center)).
>>> Should we let it work both ways for flexibility?
>>>
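A small sketch of letting alignment work both ways, as an option to add_paragraph and as a chained method. SketchDocument and SketchParagraph are hypothetical stand-ins and do not touch the Simple API; the set of alignment names is also an assumption.

```ruby
# Hypothetical stand-in for a paragraph object.
class SketchParagraph
  ALIGNMENTS = [:left, :center, :right, :justify].freeze

  attr_reader :text, :alignment

  def initialize(text, alignment: :left)
    @text = text
    self.alignment = alignment
  end

  def alignment=(value)
    raise ArgumentError, "unknown alignment: #{value.inspect}" unless ALIGNMENTS.include?(value)
    @alignment = value
  end

  # Sugar: para.center instead of a long setter call. Returns self for chaining.
  ALIGNMENTS.each do |name|
    define_method(name) do
      self.alignment = name
      self
    end
  end
end

# Hypothetical stand-in for the document.
class SketchDocument
  attr_reader :paragraphs

  def initialize
    @paragraphs = []
  end

  # Accepts the alignment as an option and returns the new paragraph.
  def add_paragraph(text, alignment: :left)
    para = SketchParagraph.new(text, alignment: alignment)
    @paragraphs << para
    para
  end
end

doc = SketchDocument.new
doc.add_paragraph("Title", :alignment => :center)  # option style
doc.add_paragraph("Subtitle").center               # chained style
```

Both styles go through the same setter, so validation lives in one place.
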
>>
>> Right.
>>
>>> At the same time, we don't want to take the sugar too far -- the
>>> interface should be logical to end users without being too "cute". A
>>> couple of examples of things I changed/added are:
>>>
>>> page_break instead of add_page_break; paragraphs as a new iterable method.
>>>
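Roughly how those two changes could look, assuming a wrapped document class. SketchTextDocument is a hypothetical stand-in that stores nodes in an array; the real code wraps Simple API objects instead.

```ruby
# Hypothetical stand-in for the wrapped document class.
class SketchTextDocument
  def initialize
    @nodes = []
  end

  def add_paragraph(text)
    @nodes << [:paragraph, text]
    self
  end

  def add_page_break
    @nodes << [:page_break, nil]
    self
  end

  # Shorter DSL name; the long name keeps working.
  alias_method :page_break, :add_page_break

  # With a block, yields each paragraph's text.
  # Without one, returns an Enumerator so map, select, etc. work.
  def paragraphs
    return enum_for(:paragraphs) unless block_given?

    @nodes.each { |kind, text| yield text if kind == :paragraph }
  end
end

doc = SketchTextDocument.new
doc.add_paragraph("one").page_break.add_paragraph("two")
doc.paragraphs { |text| puts text }
```
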
>>
>> Right. All else being equal, it is best to be idiomatic Ruby.
>>
>> Another question: Are there any other Ruby libraries that it would
>> make sense to bring in to the mix? For example, any that are text or
>> language related, or even NLP? Google Translate interface? Stuff
>> like that that would radically extend the capabilities with little
>> incremental coding on your part.
>
> The code is an internal DSL whose host language is JRuby. Therefore,
> any C Ruby code can be used -- just require a Gem, external library,
> or part of the standard library. And because it's JRuby, Java
> libraries and core are fair game, and really anything that can be
> compiled into Java bytecode should work. Very flexible.
>
>>> The heavy lifting in the code is currently being done in the file
>>> lib/oclt.rb, specifically the method_missing call. Inside I have some
>>> logic that tests whether a certain method exists at runtime, and if it
>>> does, delegates appropriately. If the method doesn't exist the
>>> program bombs out, but there's no reason not to make it do something
>>> more intelligent, like search the SimpleAPI for a method which is
>>> similar to or begins with the given method name, and then perhaps give
>>> the user an informative error message.
>>>
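A minimal sketch of that delegation plus the friendlier error, assuming a wrapper object around the target. BackendDocument and DslDocument are hypothetical names, and the "similar method" search here is just a substring match, not what lib/oclt.rb does today.

```ruby
# Hypothetical stand-in for a Simple API object.
class BackendDocument
  def add_paragraph(text)
    "paragraph: #{text}"
  end

  def add_page_break
    "page break"
  end
end

# Hypothetical DSL wrapper that forwards unknown calls to the target.
class DslDocument
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args, &block)
    if @target.respond_to?(name)
      @target.public_send(name, *args, &block)
    else
      # Suggest target methods whose names contain the unknown name.
      candidates = @target.class.public_instance_methods(false)
                          .map(&:to_s)
                          .select { |m| m.include?(name.to_s) }
      hint = candidates.empty? ? "" : " Did you mean: #{candidates.sort.join(', ')}?"
      raise NoMethodError, "undefined DSL method '#{name}'.#{hint}"
    end
  end

  # Keep respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name) || super
  end
end

doc = DslDocument.new(BackendDocument.new)
puts doc.add_paragraph("hi")
```

With this in place, calling doc.page_break raises a NoMethodError whose message suggests add_page_break.
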
>>> More formally, I think the code as written is (loosely) an
>>> implementation of the Adapter pattern
>>> (http://en.wikipedia.org/wiki/Adapter_pattern) but using open classes
>>> rather than an explicit adapter class. In Ruby this can be done a
>>> number of ways; I was going for brevity.
>>>
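For comparison, a short sketch of the two approaches side by side. VendorDocument and DocumentAdapter are hypothetical names; VendorDocument stands in for a class we cannot edit at its source.

```ruby
require "forwardable"

# Hypothetical third-party class with a Java-style method name.
class VendorDocument
  def addPageBreak
    :page_break_added
  end
end

# Option 1: open class. Reopen the class and add the DSL name directly.
class VendorDocument
  def page_break
    addPageBreak
  end
end

# Option 2: explicit adapter. Wrap the object and leave the class untouched.
class DocumentAdapter
  extend Forwardable

  # Expose @doc.addPageBreak under the DSL name page_break.
  def_delegator :@doc, :addPageBreak, :page_break

  def initialize(doc)
    @doc = doc
  end
end

p VendorDocument.new.page_break
p DocumentAdapter.new(VendorDocument.new).page_break
```

The open-class version is shorter, but it changes the class for every caller in the process; the adapter keeps the change local at the cost of a wrapper object.
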
>>> I invite your feedback, and the rest of the community too. Thanks and
>>> enjoy the 4th!
>>>
>>>
>>> --
>>> Noah
>
> I'm going to keep hacking on the DSL in the main branch, and work out
> a more idiomatic syntax and more examples. I'll aim for 100% Simple
> API coverage (in terms of functionality, but with a more idiomatic
> API; the Simple API calls will still be available).
> In addition, I will create a separate branch for your "embedded tags"
> idea and see how it looks. I've got a dead laptop at home, but I'll
> try to get you something by the end of the week.
>
> Cheers,
>
> --
> Noah
--
Noah