Re: [Digester] Re: Unknown nodes in digester?
(this is pretty much just thinking out loud but...)

i'm in the process of reviewing (and improving) Kelvin Tan's wildcard tail patch (allowing stuff like a/a/*). this could help with this circumstance - but i'm now a bit struck by the thought that making ExtendedBaseRules any more complex isn't really going to do much other than slow down the matching process.

rather than go on forever expanding EBRs, i'd favour an implementation of STX or XPath. any volunteers?

but this kind of problem could be solved by a mix and match approach. DepthMatchingRules would be easy to write, easy to use and quick. maybe we could write a Rules implementation (SplitRules? FilterRules?) that allowed different Rules implementations to be used to match different subtrees. so, you might use EBRs by default but for matches in the children of a/b/c, DepthMatchingRules or STXRules would be used. so path a/b/d would be matched by one Rules instance but a/b/c/d would be matched by another.

we could make this more powerful by removing the head of the tree, allowing disparate subtrees to be matched by the same Rules implementation (if the user so wished).

comments?

- robert

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
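A minimal sketch of the mix-and-match idea in the message above, written against a hypothetical `PathMatcher` interface rather than Digester's real `Rules` interface. Every name here (`PathMatcher`, `ExactMatcher`, `DepthMatcher`, `SplitMatcher`) is invented for illustration, and rules are represented by plain strings. It only shows the dispatch: paths under a registered head go to one matcher, optionally with the head removed, and everything else goes to the default.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SplitMatcherDemo {
    // Default matcher handles a/b/d; a depth-based matcher handles children of a/b/c.
    static SplitMatcher build() {
        PathMatcher byDefault = new ExactMatcher().add("a/b/d", "exactRule");
        PathMatcher underABC = new DepthMatcher().add(1, "depthRule");
        return new SplitMatcher(byDefault, true).mount("a/b/c", underABC);
    }

    public static void main(String[] args) {
        SplitMatcher m = build();
        System.out.println(m.match("a/b/d"));   // answered by the default matcher
        System.out.println(m.match("a/b/c/d")); // answered by the mounted matcher
    }
}

/** Hypothetical stand-in for a Rules implementation: maps an element path to rule names. */
interface PathMatcher {
    List<String> match(String path);
}

/** Fires rules registered under exactly the given path. */
class ExactMatcher implements PathMatcher {
    private final Map<String, List<String>> rules = new LinkedHashMap<>();

    ExactMatcher add(String path, String ruleName) {
        rules.computeIfAbsent(path, k -> new ArrayList<>()).add(ruleName);
        return this;
    }

    @Override
    public List<String> match(String path) {
        return rules.getOrDefault(path, Collections.emptyList());
    }
}

/** Fires rules by nesting depth only, ignoring element names. */
class DepthMatcher implements PathMatcher {
    private final Map<Integer, List<String>> rules = new LinkedHashMap<>();

    DepthMatcher add(int depth, String ruleName) {
        rules.computeIfAbsent(depth, k -> new ArrayList<>()).add(ruleName);
        return this;
    }

    @Override
    public List<String> match(String path) {
        int depth = path.isEmpty() ? 0 : path.split("/").length;
        return rules.getOrDefault(depth, Collections.emptyList());
    }
}

/** Routes paths below a mounted head to that head's matcher, everything else to the fallback. */
class SplitMatcher implements PathMatcher {
    private final PathMatcher fallback;
    private final boolean stripHead;
    private final Map<String, PathMatcher> subtrees = new LinkedHashMap<>();

    SplitMatcher(PathMatcher fallback, boolean stripHead) {
        this.fallback = fallback;
        this.stripHead = stripHead;
    }

    SplitMatcher mount(String head, PathMatcher matcher) {
        subtrees.put(head, matcher);
        return this;
    }

    @Override
    public List<String> match(String path) {
        for (Map.Entry<String, PathMatcher> e : subtrees.entrySet()) {
            String head = e.getKey() + "/";
            if (path.startsWith(head)) {
                // With stripHead, the mounted matcher sees the path relative to its subtree,
                // so the same matcher can be mounted under several unrelated heads.
                String tail = stripHead ? path.substring(head.length()) : path;
                return e.getValue().match(tail);
            }
        }
        return fallback.match(path);
    }
}
```

Removing the head before delegating is what would let one matcher instance serve disparate subtrees; with `stripHead` set to `false` the mounted matcher sees the full path instead.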
[Digester] Re: Unknown nodes in digester?
Hello,

Thanks for all the work creating Digester, it works great! However, I have a similar problem to an earlier poster: I need to preserve the data. So, to continue with a previous example I found in the archive:

<ROOT>
  <USERDEFINED>
    <UNKNOWN1></UNKNOWN1>
  </USERDEFINED>
</ROOT>

I would like a way to store the <unknown1></unknown1> in a string inside of my class. So if I did

someClass.getUserDefined()

it would return the string "<unknown1></unknown1>".

So I would like the rule:

<call-method-rule pattern="USERDEFINED" methodname="setUserDefined" paramcount="0"/>

to match all the way to the end tag of USERDEFINED, with all the data inside of it, if there are no matching child patterns to intercept it. Is there a way to accomplish this?

Thanks,
Jason

-----Original Message-----
From: Craig R. McClanahan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 22, 2003 1:34 AM
To: Jakarta Commons Users List
Subject: Re: Unknown nodes in digester?

On Wed, 22 Jan 2003, Bill Chmura wrote:

> Date: Wed, 22 Jan 2003 00:43:20 -0500
> From: Bill Chmura [EMAIL PROTECTED]
> Reply-To: Jakarta Commons Users List [EMAIL PROTECTED]
> To: 'jakarta Commons Users List' [EMAIL PROTECTED]
> Subject: Unknown nodes in digester?
>
> Hello,
>
> I am not sure what tool to use for what I need... I've used the DOM
> before, JAXB but I hear digester is pretty good. I've done some
> reading, but was hoping someone could give me some advice.
>
> I need to read a number of small XML documents. The kicker is that
> internally I will know ahead of time what 70% of the tags are, but
> there is the possibility for unknown tags to be within a known tag.
>
> Can digester be configured to handle this?

In general, Digester works on a matching principle -- it assumes you know the element nesting pattern you are looking for. So, whether it's useful to you or not for your task depends on how far ahead of time you know what the element names will be -- if you have some sort of information that says an UNKNOWN1 will be nested inside a USERDEFINED inside a ROOT, then you can dynamically construct the matching patterns for your processing rules.

It's really impossible, though, to give you much more help without understanding what you actually want to *do* with the data that is parsed. For example, if you want random access to the nodes, you probably want to use some sort of DOM-based solution -- anything that is SAX based (including Digester) is not going to be very helpful.

> Something like:
>
> <ROOT>
>   <USERDEFINED>
>     <UNKNOWN1></UNKNOWN1> - ?
>     <UNKNOWN2></UNKNOWN2> - ?
>   </USERDEFINED>
> </ROOT>
>
> Thanks!
> Bill

Craig
Re: [Digester] Re: Unknown nodes in digester?
On Tue, 11 Mar 2003, Schnitzer, Jason D (US SSA) wrote:

> Date: Tue, 11 Mar 2003 15:18:35 -0800
> From: Schnitzer, Jason D (US SSA) [EMAIL PROTECTED]
> Reply-To: Jakarta Commons Users List [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: [Digester] Re: Unknown nodes in digester?
>
> Hello,
>
> Thanks for all the work creating Digester it works great!

Glad to hear it!

> However, I have a similar problem to an earlier poster. I need to
> preserve the data. So to continue on with a previous example I found
> in the archive.
>
> <ROOT>
>   <USERDEFINED>
>     <UNKNOWN1></UNKNOWN1>
>   </USERDEFINED>
> </ROOT>
>
> I would like a way to store the <unknown1></unknown1> in a string
> inside of my class... So if I did
>
> someClass.getUserDefined()
>
> It would return a string <unknown1></unknown1>
>
> So I would like the rule:
>
> <call-method-rule pattern="USERDEFINED" methodname="setUserDefined" paramcount="0"/>
>
> To match all the way to the end tag of USERDEFINED with all the data
> inside of it. If there are no matching child patterns to intercept it.
>
> Is there a way to accomplish this?

Digester's standard pattern matching is really oriented towards pulling out what you *do* know, rather than what you don't :-). With the standard rules, you could absorb non-element body content inside your user-defined element, but not nested XML elements.

You might try playing with the ExtendedBaseRules class (configure it on your Digester by calling setRules()) and playing with the * and !* matching patterns to see if you can get what you want.

> Thanks,
> Jason

Craig
Re: [Digester] Re: Unknown nodes in digester?
On Wed, 2003-03-12 at 12:18, Schnitzer, Jason D (US SSA) wrote:

> <ROOT>
>   <USERDEFINED>
>     <UNKNOWN1></UNKNOWN1>
>   </USERDEFINED>
> </ROOT>
>
> I would like a way to store the <unknown1></unknown1> in a string
> inside of my class... So if I did

How about using NodeCreateRule, which deals with DOM nodes?
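Once the unknown content is available as a DOM node, turning it back into a string is a separate step. The sketch below shows only that step, using the JDK's own DOM parser and identity transformer; it does not use Digester or NodeCreateRule, and the class and method names (`DomChildrenToString`, `innerXml`, `userDefinedContent`) are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomChildrenToString {

    /** Serializes every child of the given node, in order, without an XML declaration. */
    static String innerXml(Node parent) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringBuilder out = new StringBuilder();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            StringWriter buffer = new StringWriter();
            identity.transform(new DOMSource(children.item(i)), new StreamResult(buffer));
            out.append(buffer);
        }
        return out.toString();
    }

    /** Parses a document and returns the markup nested inside its first USERDEFINED element. */
    static String userDefinedContent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node userDefined = doc.getElementsByTagName("USERDEFINED").item(0);
        return innerXml(userDefined);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ROOT><USERDEFINED><UNKNOWN1>x</UNKNOWN1></USERDEFINED></ROOT>";
        System.out.println(userDefinedContent(xml));
    }
}
```

The serializer decides details such as whether an empty element is written as a start/end pair or as a self-closing tag, so the returned string is equivalent XML rather than a byte-for-byte copy of the input.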
RE: [Digester] Re: Unknown nodes in digester?
OK, thank you so much for all the suggestions. Here is what I did that works for how I need it. Note this is just pseudo code; I am sure I could do it better. I just wanted to prove it could be done. If anyone has detailed questions, I can copy in more of the code.

    // Driver setup
    // other rules ...
    ValueRule valueRule = new ValueRule();
    digester.addRule("/SomePath/To/What/I/Am/InterestedIn/value", valueRule);
    // ...

The supporting classes:

    import org.apache.commons.digester.Rule;
    import org.xml.sax.Attributes;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class ValueRule extends Rule {
        // Handler that collects the nested XML while this rule is active
        private SaxValueParser saxValueParser;
        // ...

        public void begin(String namespace, String name, Attributes attributes)
                throws java.lang.Exception {
            // Set up a new content handler to work in these situations.
            // The content handler will be responsible for setting back the
            // current content handler so we can resume the normally
            // scheduled program.
            ContentHandler oldHandler = getDigester().getXMLReader().getContentHandler();
            saxValueParser = new SaxValueParser(oldHandler, getDigester().getXMLReader());
            this.getDigester().getXMLReader().setContentHandler(saxValueParser);
            super.begin(namespace, name, attributes);
        }

        public void end(String namespace, String name) throws java.lang.Exception {
            // Hand the collected XML string to the object on top of the stack
            Object peekObj = digester.peek();
            ((CastAway) peekObj).setValue(saxValueParser.getValue());
        }
        // ...
    }

    public class SaxValueParser extends DefaultHandler {
        // ...
        // Builds up a string that represents all the parts of the XML contained.

        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            // don't forget to set the old content handler back
        }
        // ...
    }

Good Luck,
Jason
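The handler-swapping approach described above can be tried without Digester. The following is a self-contained sketch using only the JDK's SAX classes; it is not the poster's actual code. The names `SaxCaptureDemo`, `OuterHandler` and `SubtreeCapture` are invented, the trigger element is hard-coded to USERDEFINED, and the escaping is deliberately minimal. It relies on the SAX contract that a content handler registered in the middle of a parse takes effect immediately.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SaxCaptureDemo {
    /** Parses the document and returns the markup found inside USERDEFINED. */
    static String capture(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        OuterHandler outer = new OuterHandler(reader);
        reader.setContentHandler(outer);
        reader.parse(new InputSource(new StringReader(xml)));
        return outer.captured;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(capture(
            "<ROOT><USERDEFINED><UNKNOWN1>a</UNKNOWN1></USERDEFINED></ROOT>"));
    }
}

/** Plays the role of the normal handler; hands control to SubtreeCapture when needed. */
class OuterHandler extends DefaultHandler {
    private final XMLReader reader;
    private SubtreeCapture capture;
    String captured;

    OuterHandler(XMLReader reader) { this.reader = reader; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("USERDEFINED".equals(qName)) {
            capture = new SubtreeCapture(reader, this);
            reader.setContentHandler(capture); // takes effect for the next SAX event
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("USERDEFINED".equals(qName) && capture != null) {
            captured = capture.getValue();
            capture = null;
        }
    }
}

/** Rebuilds nested markup as a string, then restores the previous handler. */
class SubtreeCapture extends DefaultHandler {
    private final XMLReader reader;
    private final ContentHandler previous;
    private final StringBuilder xml = new StringBuilder();
    private int depth = 0;

    SubtreeCapture(XMLReader reader, ContentHandler previous) {
        this.reader = reader;
        this.previous = previous;
    }

    String getValue() { return xml.toString(); }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        xml.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            xml.append(' ').append(atts.getQName(i))
               .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        xml.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        xml.append(escape(new String(ch, start, length)));
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (depth == 0) {
            // End tag of the element that triggered the capture:
            // set the old content handler back and let it see this event.
            reader.setContentHandler(previous);
            previous.endElement(uri, localName, qName);
            return;
        }
        depth--;
        xml.append("</").append(qName).append('>');
    }
}
```

Tracking depth is what lets the capturing handler tell the end tag of a nested element apart from the end tag of the element that started the capture, even if a nested element happens to have the same name.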