I am trying to parse JIL (Job Information Language) scripts, which consist
of Name:Value pairs. Perhaps Tika is overkill, but I wanted to use its
parsing ability and SAX event firing to make life easier. Could you please
point me to some examples of a custom ContentHandler, if you know of any?

thanks
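
For reference, here is a minimal sketch of the kind of custom ContentHandler
discussed in this thread. It uses only the JDK's org.xml.sax classes, so it
can be tried without Tika on the classpath. The class name NameValueHandler,
the getPairs() accessor, and the sample input are hypothetical, and the sketch
assumes one "Name : Value" pair per line, which may not hold for every JIL
script.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that collects "Name : Value" lines from SAX character events. */
class NameValueHandler extends DefaultHandler {
    private final StringBuilder pending = new StringBuilder();
    private final Map<String, String> pairs = new LinkedHashMap<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in arbitrary chunks, so buffer it and only
        // process complete lines.
        pending.append(ch, start, length);
        int newline;
        while ((newline = pending.indexOf("\n")) >= 0) {
            handleLine(pending.substring(0, newline));
            pending.delete(0, newline + 1);
        }
    }

    @Override
    public void endDocument() {
        // Flush a final line that has no trailing newline.
        handleLine(pending.toString());
        pending.setLength(0);
    }

    private void handleLine(String line) {
        // Assumption: the first colon separates the name from the value.
        int colon = line.indexOf(':');
        if (colon <= 0) {
            return; // not a Name:Value line
        }
        String name = line.substring(0, colon).trim();
        String value = line.substring(colon + 1).trim();
        if (!name.isEmpty()) {
            pairs.put(name, value);
        }
    }

    public Map<String, String> getPairs() {
        return pairs;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input; the events are fired by hand instead of by a parser.
        char[] text = "insert_job: demo_job\njob_type: c\nowner : someuser".toCharArray();
        NameValueHandler handler = new NameValueHandler();
        handler.startDocument();
        handler.characters(text, 0, text.length);
        handler.endDocument();
        System.out.println(handler.getPairs());
    }
}
```

With Tika, an instance of such a handler could be passed as the ContentHandler
argument of Parser.parse(InputStream, ContentHandler, Metadata, ParseContext)
in place of BodyContentHandler. The pairs are collected in a map here rather
than re-fired as startElement events, which keeps the sketch short.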


On Mon, Feb 10, 2014 at 2:27 PM, Ken Krugler <[email protected]> wrote:

>
> On Feb 10, 2014, at 11:22am, Rupak Khurana <[email protected]>
> wrote:
>
> Hello
>
> I have a plain text file that has several "Name : Value" pairs that I
> want to parse out. Note that this is not an XML or HTML file. I am hoping
> that a startElement SAX event is fired whenever a "Name" element is
> encountered. Is there any ContentHandler that can do this? Currently, with
> BodyContentHandler, I just get <body> All Name:Value pairs </body>. I am
> not sure if ElementMappingContentHandler can do the trick, or how to use
> it. Any pointers, please?
>
>
> If it's just plain text, then why do you want to deal with SAX events? Is
> it that the file is too big?
>
> In any case, I imagine you could get the desired behavior by implementing
> your own ContentHandler.
>
> -- Ken
>
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
