I was aware that XSLT 2.0 might be able to do this, and that it might even be possible now with XSLT extensions. But I felt the use case was sufficient to warrant a dedicated transformer, at least for now.
I would be interested to see what your approach was, though. I'm using the ORO regex package right now, and am considering extracting a base transformer to handle the XPath subset matching. It works the way I want, allowing multiple rules to apply, etc. What remains is formalizing the criteria for rule applicability and configuration, and of course cleanup and documentation.

> ----------
> From: Marc Portier[SMTP:[EMAIL PROTECTED]]
> Reply To: [EMAIL PROTECTED]
> Sent: Friday, May 10, 2002 3:42 AM
> To: [EMAIL PROTECTED]
> Subject: RE: A Transformer in progress....
>
> > > I'm working on a Transformer that processes specifically text nodes
> > > and, using regular expressions, wraps matched portions of a node in
> > > a tag. I'm really just getting started on it - I have the basics
> > > working, but still need to be able to specify the rules in an
> > > external file, etc. It has been an interesting exercise so far, and
> > > the intent is to be able to detect things like dates, currency
> > > amounts, and units of measure in a text node, and mark them for
> > > later processing.
> > >
> > > I am planning on allowing rules to be specified in an external file
> > > identified at component configuration, or directly in the component
> > > configuration. I am also planning on allowing the "replacement" to
> > > be a complete fragment with groups from the matched expression
> > > referenceable (and replaced) in either attribute values or text
> > > nodes. (Currently I merely enclose the match in a tag.)
> > >
> > > Before I move on, has anyone else already done something like this?
> > > Does anyone (other than me) think it would be useful?
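A minimal sketch of the text-node wrapping described in the quoted message above, for illustration only. It assumes `java.util.regex` rather than ORO, and the class name `RegexWrapSketch`, the `render` helper, and the `date` tag in the usage note are hypothetical. It covers only the "enclose the match in a tag" case, not the planned fragment replacement with group references.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class RegexWrapSketch {

    /** Forwards text to out, wrapping every non-empty match of rule in an element named tag. */
    public static void wrap(String text, Pattern rule, String tag, ContentHandler out)
            throws SAXException {
        Matcher m = rule.matcher(text);
        int last = 0; // end of the previous match
        while (m.find()) {
            if (m.start() == m.end()) {
                continue; // ignore empty matches
            }
            emitChars(out, text.substring(last, m.start())); // unmatched text before the match
            out.startElement("", tag, tag, new AttributesImpl());
            emitChars(out, m.group());
            out.endElement("", tag, tag);
            last = m.end();
        }
        emitChars(out, text.substring(last)); // trailing unmatched text
    }

    private static void emitChars(ContentHandler out, String s) throws SAXException {
        if (!s.isEmpty()) {
            out.characters(s.toCharArray(), 0, s.length());
        }
    }

    /** Demo helper: renders the emitted events as a string (no escaping). */
    public static String render(String text, Pattern rule, String tag) {
        final StringBuilder sb = new StringBuilder();
        try {
            wrap(text, rule, tag, new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes a) {
                    sb.append('<').append(qName).append('>');
                }

                @Override
                public void endElement(String uri, String local, String qName) {
                    sb.append("</").append(qName).append('>');
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    sb.append(ch, start, length);
                }
            });
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }
}
```

For example, `RegexWrapSketch.render("Due 2002-05-10 at noon", Pattern.compile("\\d{4}-\\d{2}-\\d{2}"), "date")` wraps the date in a `date` element and passes the rest of the text through unchanged. In a real transformer the `characters` callback may deliver one text node in several chunks, so the text would probably need buffering before the rules are applied.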
>
> did something similar around January; after that the interest/needs
> kind of shifted, so I haven't continued on it since. I still plan on
> taking it up around the summer or so
> (if you want I can make my current stuff available, and join in some
> discussions)
>
> the biggest difference however is that you're assuming the input has
> good but too little markup
> (so you go for a transformer)
>
> while we were scratching the itch of pure text input and/or bad markup
> present
> (like HTML that jTidy can't handle, or even now that there is nekoHTML:
> whenever the regex approach is easier than the XSLT afterwards because
> of the mess in the HTML)
> (so we chose a generator as lifeform)
>
> at the time we started we collided with some joint thinking activity
> about adding regex kind of support inside XSLT 2.0; see for some of
> those discussions:
> http://www.biglist.com/lists/xsl-list/archives/200201/msg00488.html
>
> must say, I haven't followed the further development of XSLT 2.0 since,
> so maybe someone else could comment on any future for this kind of
> stuff inside the spec (and thus impls like Xalan or Saxon)
>
> > > Ok...now to my real question...
> > >
> > > Does anyone know of existing code that I can use to track and
> > > identify if the current point in the SAX stream matches a
> > > simplified XPath expression? I would really like to apply an
> > > expression rule set based on an XPath subset.
>
> Alas, I don't know of such a beast (would be nice though)
>
> So you got me triggered into thinking about this :-)
> Can we define 'simplified' XPath?
> SAX gives you a timely snapshot of the current position in the XML file,
> so you would easily be able to have some kind of match for a simple
> hierarchy of elements (wouldn't even need to be root based I guess;
> even sliding in some attribute tests should be possible, also position
> evaluation as long as it's not =last()...
> )
>
> SAX just doesn't allow you to look into the future, so there is a lot
> of XPath you will not be able to do.
>
> > > Any comments or suggestions?
>
> You'll have to make a trade-off: how far can you cripple XPath and
> still support your needs, while still beating the maybe awkward but not
> totally wrong approach of building a temporary DOM tree internally in
> your Cocoon transformer to allow real XPath on it? (after all,
> something similar happens to some extent inside the XSLT process, eh)
>
> > > Thanks!
>
> please keep us posted on your findings and progress
> -marc=
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, email: [EMAIL PROTECTED]
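A minimal sketch of the kind of matching discussed above: keep a stack of open element names as SAX events arrive, and compare its tail against a slash-separated list of steps. The class name `SimplePathMatcher`, the `*` wildcard, and the path syntax are illustrative assumptions, not an existing Cocoon or XPath API. Attribute and position tests are left out, and nothing here needs to look ahead in the stream.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SimplePathMatcher extends DefaultHandler {
    private final String[] steps;     // element names, "*" matches any name
    private final boolean anchored;   // true if the path starts at the root ("/a/b")
    private final List<String> stack = new ArrayList<>(); // currently open elements
    private final List<String> hits = new ArrayList<>();  // full paths of matched elements

    public SimplePathMatcher(String path) {
        anchored = path.startsWith("/");
        steps = (anchored ? path.substring(1) : path).split("/");
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        stack.add(qName);
        if (matches()) {
            hits.add(String.join("/", stack));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        stack.remove(stack.size() - 1);
    }

    /** True if the innermost open elements line up with the steps. */
    boolean matches() {
        int offset = stack.size() - steps.length;
        if (offset < 0 || (anchored && offset != 0)) {
            return false;
        }
        for (int i = 0; i < steps.length; i++) {
            if (!steps[i].equals("*") && !steps[i].equals(stack.get(offset + i))) {
                return false;
            }
        }
        return true;
    }

    public List<String> hits() {
        return hits;
    }

    /** Demo helper: parses xml and returns the paths of all matching elements. */
    public static List<String> run(String path, String xml) {
        try {
            SimplePathMatcher handler = new SimplePathMatcher(path);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.hits();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, `SimplePathMatcher.run("section/para", xml)` matches only `para` elements whose parent is `section`, wherever that pair occurs, while a leading slash ties the first step to the document root. A transformer could call `matches()` from its own `characters` handling to decide whether a rule set applies to the current text node.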