[Joda-interest] Feature request: CharSequence instead of Strings for parsing…

Viktor Hedefalk Thu, 03 Feb 2011 06:21:11 -0800

Hi,

I'm trying to use joda-time from within a Scala parser-combinator. I'm
writing a parser for the output of a command line interface and as
part of this I need to parse some dates. I would really like to use
joda-time for this since I really don't want to reinvent the wheel.
There is however an impedance mismatch since the combinator-parser
library traverses CharSequences and the joda-time api wants Strings as
input. The combinator parsers are stream-based so there is really no
chance they could work on Strings.


Scala's combinator-parsers are a really nifty piece of software since
they allow me to write really concise production-rules for my grammar,
but I couldn't find any established way to parse dates from standard
patterns. The basic building blocks are mostly just regex, but it's
very easy to combine these into more advanced parsers. And it is also
very easy to write your own custom parsers, but they have to work on
CharSequences.

Basically, what I wanted to do was just to wrap the
dateFormat.parseInto()-method in a Scala-parser to be able to use it
as a building-block for other parsers.

When looking at the joda-time source, I could not really find any real
reason why it had to work with Strings. So I made an attempt to change
it to work with CharSequences instead. I cloned the unofficial mirror
over at github and made my changes. The output of this is over here:

https://github.com/hedefalk/joda-time/commit/ef3bdafd89b334fb052ce0dd192613683b3486a4

It was quick and dirty but less than an hour of work. Risking that
this email turns long, it allows me to write a Scala-parser like this:

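import org.joda.time.{DateTime, MutableDateTime}
import org.joda.time.format.DateTimeFormat
import scala.util.parsing.combinator.RegexParsers

trait DateParsers extends RegexParsers {
  def dateTime(pattern: String): Parser[DateTime] = new Parser[DateTime] {
    val dateFormat = DateTimeFormat.forPattern(pattern)

    // Relies on the patched parseInto that accepts a CharSequence.
    // Returns the parsed value and the new position (negative on failure).
    def jodaParse(text: CharSequence, offset: Int) = {
      val mutableDateTime = new MutableDateTime
      val newPos = dateFormat.parseInto(mutableDateTime, text, offset)
      (mutableDateTime.toDateTime, newPos)
    }

    def apply(in: Input): ParseResult[DateTime] = {
      val source = in.source
      val offset = in.offset
      // Skip leading whitespace the same way the built-in parsers do.
      val start = handleWhiteSpace(source, offset)
      val (dateTime, endPos) = jodaParse(source, start)
      if (endPos >= 0)
        // Consume everything up to the position where joda-time stopped.
        Success(dateTime, in.drop(endPos - offset))
      else
        Failure("Failed to parse date", in.drop(start - offset))
    }
  }
}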

which can then be used in production rules like this:


  def changeset: Parser[ChangeSet] =
    changesetIdRow ~ opt(tagRow) ~ userRow ~ dateRow ~ summaryRow ^^ {
      case changesetIdVal ~ tagOption ~ userVal ~ dateVal ~ summaryVal =>
        new ChangeSet(changesetIdVal, tagOption, userVal, dateVal, summaryVal)
    }

  private[this] def changesetIdRow = "changeset:" ~> changesetId
  private[this] def tagRow = "tag:" ~> ".*".r
  private[this] def userRow = "user:" ~> ".*".r
  private[this] def dateRow = "date:" ~> dateTime("EEE MMM d HH:mm:ss yyyy Z")
  private[this] def summaryRow = "summary:" ~> ".*".r
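
A side note on the failure branch of the dateTime parser: joda-time's
parseInto is documented to return the bitwise complement (~) of the
failure position when the result is negative. A minimal sketch of
decoding that value, so a Failure could point at the exact character,
might look like this. The object and method names are made up for
illustration and are not part of joda-time or the patch.

```scala
object ParseIntoResult {
  // Hypothetical helper: interprets the Int returned by parseInto.
  // Right(pos) = parsing succeeded and stopped at pos.
  // Left(pos)  = parsing failed at pos (the raw value was ~pos).
  def decode(result: Int): Either[Int, Int] =
    if (result >= 0) Right(result) else Left(~result)
}
```

In the parser above, the Left case could be used to drop the input up
to the failing character instead of only up to the start of the date.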


Anyway, I just wanted to show that this would be really useful for me.
By simply making joda-time parse CharSequences instead of Strings it
becomes a lot easier to use as part of a stream-based
parsing-framework.

Would it be possible to make a change like this to joda-time? As I
said, the patch linked above is just a quick and dirty fix, but I
wanted to see that it worked. All tests pass and the only thing that
really had to be replaced was the regionMatches()-method on String.
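
To give an idea of how small that replacement is, here is a minimal
sketch of a regionMatches that works on any CharSequence. It is written
from scratch for this mail and is not the code from the patch; the
object name, the method signature and the simplification of always
matching the whole of `other` are assumptions for illustration.

```scala
object CharSequenceRegion {
  // Hypothetical stand-in for String.regionMatches: checks whether
  // `other` occurs in `text` starting at `offset`.
  def regionMatches(text: CharSequence, offset: Int,
                    other: CharSequence, ignoreCase: Boolean): Boolean = {
    val len = other.length
    if (offset < 0 || offset > text.length - len) false
    else (0 until len).forall { i =>
      val a = text.charAt(offset + i)
      val b = other.charAt(i)
      // Compare both upper- and lower-cased forms, as String does.
      a == b || (ignoreCase && (
        Character.toUpperCase(a) == Character.toUpperCase(b) ||
        Character.toLowerCase(a) == Character.toLowerCase(b)))
    }
  }
}
```

Since it only uses length and charAt, the same helper works for a
String, a StringBuilder or the source of a parser Input.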

Cheers,
Viktor
