Jason,

Don't worry, hindsight is always 20/20.  But, it takes very good
planning and a lot of time to do it right the first time.

It always gets better the more you work it.

James

On 4/11/2011 11:13 PM, Jason Baldridge wrote:
> Thanks everyone for your thoughts. I think the first step is to refactor the
> package sticking with Java and then we'll see about moving to a Scala/Java
> mix after that (but only for the opennlp machine learning package, currently
> opennlp-maxent).
>
> I was actually sort of appalled looking through the code yesterday and
> seeing so many global variables used all over the place, making it hard to
> know exactly what every method had access to. I think this was sort of an
> artifact of how I used Trove functions a loooong time ago to enable quick
> iteration over the data structures (which required some objects to be
> global). That is obviously gone now, but the global variables didn't go
> away... hope I'll find time to improve things over the next 5-6 months.
>
> Jason
>
> On Mon, Apr 11, 2011 at 7:27 AM, Tommaso Teofili
> <[email protected]> wrote:
>
>> Hi Jason,
>> I personally have some Scala experience while working with Clerezza [1]
>> which uses both Java and Scala but what I think is that, while Scala is
>> perfectly ok with existing Java standards and allowing functional/dynamic
>> programming, it raises the barrier for new users/devs a little bit.
>> So I am not so sure that a Scala implementation should totally replace an
>> existing one, maybe a graceful introduction would be more welcome.
>> My 2 cents,
>> Tommaso
>>
>>
>> [1] : http://incubator.apache.org/clerezza
>>
>> 2011/4/10 Jason Baldridge <[email protected]>
>>
>>> It's been a while since I posted this request for input... Does anyone
>>> have any thoughts on it? Is anyone else interested in Scala being part of
>>> OpenNLP?
>>>
>>> Jason
>>>
>>> On Tue, Mar 22, 2011 at 10:16 AM, Jason Baldridge
>>> <[email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Jorn and I have had a little discussion about a topic I brought up with
>>>> him that I'd like to get everyone's thoughts on. I'm including our
>>>> conversation below, but the gist of it is this:
>>>>
>>>>  - I've been switching to development in Scala. At this point, I
>>>> personally see little point in coding in Java given that Scala is
>>>> available (and very very nice) and it plays very well with existing Java
>>>> -- I'm very happy with this for several projects I'm working on,
>>>> including TextGrounder <http://code.google.com/p/textgrounder/> and
>>>> Junto <http://code.google.com/p/junto/>. So, I'd like to see Scala
>>>> making its way into OpenNLP.
>>>>  - We need to reorganize the maxent code into the new package opennlp.ml
>>>>  - I'd like to create the new package, retaining the Java code as is,
>>>> make a first release, and then allow Scala code to mix in with the Java
>>>> from that point on
>>>>  - A number of issues come up with this, including using another build
>>>> tool like SBT instead of Maven and ensuring we are Apache compliant and
>>>> so on.
>>>>
>>>> So, this is really just a feeler to see what you all think and see if
>>>> you have any enthusiasm, reservations or suggestions. Thanks!
>>>>
>>>> Jason
>>>>
>>>>
>>>> Forwarded conversation
>>>> Subject: opennlp.ml + Scala?
>>>> ------------------------
>>>>
>>>> From: *Jason Baldridge* <[email protected]>
>>>> Date: Mon, Mar 21, 2011 at 1:28 PM
>>>> To: Jörn Kottmann <[email protected]>
>>>>
>>>>
>>>> Hi Jorn,
>>>>
>>>> I've changed over to doing nearly all my coding in Scala, generally
>>>> transitioning Java codebases to Scala by writing everything new in Scala
>>>> and using the existing Java classes as they are. I would like to do this
>>>> as part of the new opennlp.ml, as I'm not inclined to write any new Java
>>>> code unless absolutely necessary, and I would very much like to create
>>>> that new and improved package. What do you think of this?
>>>>
>>>> Jason
>>>>
>>>> --
>>>> Jason Baldridge
>>>> Assistant Professor, Department of Linguistics
>>>> The University of Texas at Austin
>>>> http://www.jasonbaldridge.com
>>>>
>>>> ----------
>>>> From: *Jörn Kottmann* <[email protected]>
>>>> Date: Mon, Mar 21, 2011 at 2:24 PM
>>>> To: Jason Baldridge <[email protected]>
>>>>
>>>>
>>>> Hmm, yeah, if we would rewrite it I think it is something we could
>>>> consider, but in our case we just need to do some reshaping of the
>>>> existing code and a little refactoring here and there. That is one
>>>> reason I believe we should be conservative and not use it in this case.
>>>>
>>>> Another issue I see is that it will be a message to the Mahout people
>>>> that we do not want to collaborate, which in fact I believe is something
>>>> we should do to get map reduce training support one day.
>>>> The people in the team might not be familiar with Scala, which could
>>>> further limit the manpower which is available for the refactoring. Just
>>>> my 2 cents.
>>>>
>>>> I believe we should also do the maxent refactoring slowly and first do
>>>> everything inside the current structures, and then when everything is in
>>>> place do the last changes which break backward compatibility.
>>>>
>>>> Anyway we should start a discussion about the future of OpenNLP: which
>>>> features do we want to implement for the next few versions? Which new
>>>> components would be nice to have?
>>>> I believe there are quite some people who are willing to pick up tasks
>>>> but are simply not aware of the possibility.
>>>>
>>>> Jörn
>>>>
>>>> ----------
>>>> From: *Jason Baldridge* <[email protected]>
>>>> Date: Mon, Mar 21, 2011 at 3:29 PM
>>>> To: Jörn Kottmann <[email protected]>
>>>>
>>>>
>>>> Hmm... what if we did the first refactoring into opennlp.ml with pure
>>>> Java but the new package structure, then make a first release and then
>>>> start bringing in Scala?
>>>>
>>>> Good points. However, I'm finding that Scala plays *very* nicely with
>>>> Java (including allowing Java to use Scala classes), so that could be
>>>> mostly transparent to users of the package, maintaining the API pretty
>>>> much as it is. So, I *think* we could continue to play nicely with
>>>> Mahout folks.
>>>>
>>>> Also, after coding for a while in Scala, I can't help but feel that Java
>>>> the language is dead, while the JVM lives gloriously on. :) I think
>>>> there is a lot of momentum to Scala in general, and my feeling is that
>>>> it is very friendly for Java programmers. (Though I had experience in
>>>> functional programming before, so a lot of concepts came easily to me
>>>> that could be more unusual for others.)
>>>>
>>>> What do you mean by "current structures"? Do you mean to keep the
>>>> classes as they are now, but just switch the package organization first?
>>>>
>>>> Yes, perhaps we should do that once the release is all done? (Thanks for
>>>> all your hard work on that, btw!)
>>>>
>>>> Also, perhaps we should bring up the Scala question on the mailing list?
>>>> I wanted to ask you first to see if you had strong objections, but since
>>>> you don't it might be good to sound out the community.
>>>>
>>>> Jason
>>>>
>>>>
>>>> ----------
>>>> From: *Jörn Kottmann* <[email protected]>
>>>> Date: Mon, Mar 21, 2011 at 3:38 PM
>>>> To: Jason Baldridge <[email protected]>
>>>>
>>>>
>>>> I actually think just doing it for maxent/ml doesn't really make sense;
>>>> if we want to switch the programming language, it's for the entire code
>>>> base. Then we are talking about the migration of like 400 classes from
>>>> Java to Scala. Does that really make sense? Just doing a little Scala
>>>> doesn't sound reasonable to me.
>>>>
>>>> Sure move it to the mailing list.
>>>>
>>>> Jörn
>>>>
>>>> ----------
>>>> From: *Jason Baldridge* <[email protected]>
>>>> Date: Mon, Mar 21, 2011 at 5:44 PM
>>>> To: Jörn Kottmann <[email protected]>
>>>>
>>>>
>>>> But, the great thing about Scala is that you can mix Scala and Java and
>>>> not have to do one or the other -- so I don't think we'd need to do a
>>>> full migration.  Anyway, I'll bring it up on the list!
>>>>
>>>> ----------
>>>> From: *Jörn Kottmann* <[email protected]>
>>>> Date: Mon, Mar 21, 2011 at 5:54 PM
>>>> To: Jason Baldridge <[email protected]>
>>>>
>>>>
>>>> Yeah, but then most of the code will still remain pure Java mixed with a
>>>> little Scala, and you have to deal with the extra complexity of having a
>>>> little Scala, e.g. more complex build tooling, extra IDE support, more
>>>> complicated compatibility issues, etc.
>>>>
>>>> Jörn
>>>>
>>>> ----------
>>>> From: *Jason Baldridge* <[email protected]>
>>>> Date: Mon, Mar 21, 2011 at 7:39 PM
>>>> To: Jörn Kottmann <[email protected]>
>>>>
>>>>
>>>> The build is *really* easy with SBT (which can incorporate Maven and Ivy
>>>> dependency declarations). The idea would be to transition to Scala so
>>>> that it would eventually be mostly Scala, if not all Scala. A standard
>>>> jar is still distributed.
>>>>
>>>> ----------
>>>> From: *Jörn Kottmann* <[email protected]>
>>>> Date: Tue, Mar 22, 2011 at 4:33 AM
>>>> To: Jason Baldridge <[email protected]>
>>>>
>>>>
>>>> We are using Maven right now, and it does a lot more than just putting
>>>> together a jar file, e.g.:
>>>> - Making a release, with code signing, tagging in our SCM, producing RAT
>>>> reports, etc.
>>>> - Deploying artifacts to the Apache repository
>>>> - Building our documentation
>>>> - Testing
>>>> - Optionally it can run code quality tools like FindBugs or test
>>>> coverage tools
>>>>
>>>> Jörn
>>>>
>>>> ----------
>>>> From: *Jason Baldridge* <[email protected]>
>>>> Date: Tue, Mar 22, 2011 at 9:11 AM
>>>> To: Jörn Kottmann <[email protected]>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> These might need some looking into, but are probably doable.
>>>>
>>>>
>>>> These are builtin targets for SBT.
>>>>
>>>> -j
>>>>
>>>> ----------
>>>> From: *Jörn Kottmann* <[email protected]>
>>>> Date: Tue, Mar 22, 2011 at 9:20 AM
>>>> To: Jason Baldridge <[email protected]>
>>>>
>>>>
>>>> Our entire build system was just rewritten to meet Apache rules and
>>>> standards; if we do that again now it will set the project back for like
>>>> a month or so.
>>>>
>>>> Jörn
>>>>
>>>> ----------
>>>> From: *Jason Baldridge* <[email protected]>
>>>> Date: Tue, Mar 22, 2011 at 9:33 AM
>>>> To: Jörn Kottmann <[email protected]>
>>>>
>>>>
>>>> Fair enough. I will still bring it up as it now actually pains me to
>>>> code in Java. ;)
>>>>
>>>> Oh, here is how to deploy artifacts:
>>>>
>>>> http://henkelmann.eu/2010/11/14/sbt_hudson_with_test_integration
>>>>
>>>> I think the others would be straightforward. Possibly one of the bigger
>>>> sticking points would be IDE integration -- I use Emacs and it all works
>>>> very well for me, but I don't know how it is for Eclipse and NetBeans
>>>> folks.
>>>>
>>>> ----------
>>>> From: *Jörn Kottmann* <[email protected]>
>>>> Date: Tue, Mar 22, 2011 at 9:40 AM
>>>> To: Jason Baldridge <[email protected]>
>>>>
>>>>
>>>> I didn't say it's not possible to rewrite our build with SBT, but I
>>>> strongly believe that is an effort which will take quite some time, e.g.
>>>> a month just to get a build which is as good as the Maven build we just
>>>> finished.
>>>> Everyone would have to install the Scala plugins into their IDEs to get
>>>> proper support, which is of course also possible.
>>>>
>>>> Yeah bring it up on the mailing list.
>>>>
>>>> Jörn
>>>>
>>>> ----------
>>>> From: *Jason Baldridge* <[email protected]>
>>>> Date: Tue, Mar 22, 2011 at 9:46 AM
>>>> To: Jörn Kottmann <[email protected]>
>>>>
>>>>
>>>> Sounds good. And I find that it is often straightforward to take Maven
>>>> specifications and either use them directly from SBT or translate them
>>>> into the SBT definitions.  Perhaps we could start this with opennlp.ml
>>>> and then see how it goes before doing it in the main OpenNLP code.
>>>>
>>>>
>>>>
>>>> --
>>>> Jason Baldridge
>>>> Assistant Professor, Department of Linguistics
>>>> The University of Texas at Austin
>>>> http://www.jasonbaldridge.com
>>>>
>>>
>>>
>>> --
>>> Jason Baldridge
>>> Assistant Professor, Department of Linguistics
>>> The University of Texas at Austin
>>> http://www.jasonbaldridge.com
>>>
>>
>
