IMHO it should be merged into the xpath component by adding a parameter. However, the other solution you mention is interesting too: creating a dedicated component around StAX, which parses XML in a rather different way from DOM or SAX.
- Romain

2011/5/23 Xavier Coulon <[email protected]>

> Hello,
>
> I haven't posted anything about the component below on JIRA yet, as I was
> thinking a bit more about it this weekend...
> Should it be a separate component as shown below (named "stax" because
> of the underlying technology it uses), or should it be merged with the
> current "xpath" component? The former solution may seem a bit confusing to
> the API users; the latter would require more work but would be cleaner.
>
> What do you think about it?
> Regards,
> Xavier
>
> On Wed, May 18, 2011 at 5:11 PM, Xavier Coulon <[email protected]> wrote:
>
> > Hello,
> >
> > As a complement to the contribution of Romain (who is a colleague of
> > mine, but in a different team), I would like to submit another component
> > to the Camel project. This component splits XML inputs with streaming,
> > which, according to the documentation, is not possible yet. The rule for
> > splitting is an XPath expression, and the input source can be a
> > GenericFile or an inputstream.
> >
> > The code is based on 3 classes, so I put it directly in this message (I
> > just excluded the JUnit tests here):
> >
> > public class StaxExpressionBuilder implements Expression {
> >
> >     private static final Logger LOGGER = LoggerFactory
> >             .getLogger(StaxExpressionBuilder.class);
> >
> >     /** The XPath value that inputstream elements must match to be split. */
> >     private final String path;
> >
> >     public StaxExpressionBuilder(String path) {
> >         this.path = path;
> >     }
> >
> >     @SuppressWarnings("unchecked")
> >     @Override
> >     public <T> T evaluate(Exchange exchange, Class<T> type) {
> >         try {
> >             Endpoint fromEndpoint = exchange.getFromEndpoint();
> >             fromEndpoint.getEndpointKey();
> >             Object body = exchange.getIn().getBody();
> >             InputStream inputStream = null;
> >             if (body instanceof GenericFile) {
> >                 GenericFile<File> file = (GenericFile<File>) body;
> >                 inputStream = new FileInputStream(file.getFile());
> >             }
> >             if (inputStream != null) {
> >                 return (T) new StaxIterator(inputStream, path);
> >             }
> >             LOGGER.error("No inputstream for message body of type "
> >                     + body.getClass().getCanonicalName());
> >         } catch (FileNotFoundException e) {
> >             LOGGER.error("Failed to read incoming file", e);
> >         } catch (XMLStreamException e) {
> >             LOGGER.error(
> >                     "Failed to create STaX iterator on incoming file message",
> >                     e);
> >         }
> >         return null;
> >     }
> > }
> >
> > --------------------------------
> > public class StaxIterator implements Iterator<String> {
> >
> >     private final AtomicInteger counter = new AtomicInteger(0);
> >     private static final Logger LOGGER = LoggerFactory
> >             .getLogger(StaxIterator.class);
> >
> >     private final XMLEventReader eventReader;
> >     private final XPathLocation currentLocation = new XPathLocation();
> >     private final List<String> matchPathes;
> >     private final XMLInputFactory inputFactory = XMLInputFactory.newInstance();
> >
> >     private String nextItem = null;
> >
> >     public StaxIterator(InputStream inputStream, String pathes)
> >             throws XMLStreamException {
> >         this.matchPathes = new ArrayList<String>();
> >         for (String path : pathes.split("\\|")) {
> >             this.matchPathes.add(path.trim());
> >         }
> >         this.eventReader = inputFactory.createXMLEventReader(inputStream);
> >         this.nextItem = readNextItem();
> >     }
> >
> >     @Override
> >     public boolean hasNext() {
> >         return (nextItem != null);
> >     }
> >
> >     @Override
> >     public String next() {
> >         String currentItem = this.nextItem;
> >         this.nextItem = readNextItem();
> >         return currentItem;
> >     }
> >
> >     private String readNextItem() {
> >         try {
> >             StringBuilder itemBuilder = null;
> >             boolean found = false;
> >             String item = null;
> >             while (eventReader.hasNext() && !found) {
> >                 XMLEvent event = eventReader.nextEvent();
> >                 if (event.isStartElement()) {
> >                     StartElement element = event.asStartElement();
> >                     String localName = element.getName().getLocalPart();
> >                     currentLocation.appendSegment(localName);
> >                     if (currentLocation.matches(matchPathes)) {
> >                         itemBuilder = new StringBuilder();
> >                     }
> >                     startRecording(itemBuilder, element);
> >                 } else if (event.isCharacters()) {
> >                     record(itemBuilder, event.asCharacters());
> >                 } else if (event.isEndElement()) {
> >                     // If we reach the end of an item element we stop recording.
> >                     endRecordingElement(itemBuilder, event.asEndElement());
> >                     if (currentLocation.matches(matchPathes)) {
> >                         found = true;
> >                         item = itemBuilder.toString();
> >                         counter.incrementAndGet();
> >                     }
> >                     currentLocation.removeLastSegment();
> >                 }
> >             }
> >             return item;
> >         } catch (XMLStreamException e) {
> >             LOGGER.error("Failed to read item #" + counter.get()
> >                     + " from inputstream", e);
> >             return null;
> >         }
> >     }
> >
> >     private void endRecordingElement(StringBuilder itemBuilder,
> >             EndElement endElement) {
> >         if (itemBuilder == null) {
> >             return;
> >         }
> >         itemBuilder.append("</").append(endElement.getName().getLocalPart())
> >                 .append(">");
> >     }
> >
> >     private void record(StringBuilder itemBuilder, Characters characters) {
> >         if (itemBuilder == null) {
> >             return;
> >         }
> >         itemBuilder.append(characters.getData());
> >     }
> >
> >     private void startRecording(StringBuilder itemBuilder,
> >             StartElement element) {
> >         if (itemBuilder == null) {
> >             return;
> >         }
> >         itemBuilder.append("<").append(element.getName().getLocalPart());
> >         @SuppressWarnings("unchecked")
> >         Iterator<Attribute> attributes = element.getAttributes();
> >         while (attributes.hasNext()) {
> >             Attribute attr = attributes.next();
> >             itemBuilder.append(" ").append(attr.getName()).append("=\"")
> >                     .append(attr.getValue()).append("\"");
> >         }
> >         itemBuilder.append(">");
> >     }
> >
> >     @Override
> >     public void remove() {
> >         throw new UnsupportedOperationException(
> >                 "remove() method is not supported by this Iterator, "
> >                 + "in the context of StAX input reading only.");
> >     }
> > }
> >
> > --------------------------------
> > public class XPathLocation {
> >
> >     private static final String NODE_SEPARATOR = "/";
> >
> >     private static final String DOUBLE_NODE_SEPARATOR = "//";
> >
> >     /** location with initial value. */
> >     private String location = NODE_SEPARATOR;
> >
> >     /**
> >      * Constructor
> >      */
> >     public XPathLocation() {
> >         super();
> >     }
> >
> >     /**
> >      * Full Constructor.
> >      *
> >      * @param value
> >      *            initial value
> >      */
> >     public XPathLocation(String value) {
> >         super();
> >         this.location = value;
> >     }
> >
> >     public String getLocation() {
> >         return location;
> >     }
> >
> >     public String appendSegment(String segment) {
> >         location = new StringBuilder(location).append(NODE_SEPARATOR)
> >                 .append(segment).toString();
> >         location = location.replaceAll("//", "/");
> >         return location;
> >     }
> >
> >     public String removeLastSegment() {
> >         location = StringUtils.substringBeforeLast(location, NODE_SEPARATOR);
> >         if (location.isEmpty()) {
> >             location = NODE_SEPARATOR;
> >         }
> >         return location;
> >     }
> >
> >     /**
> >      * Returns true if one of the given patterns matches the current location,
> >      * false otherwise
> >      *
> >      * @param orPatterns
> >      *            the given patterns
> >      * @return true or false
> >      */
> >     public boolean matches(final List<String> orPatterns) {
> >         for (String pattern : orPatterns) {
> >             if (matches(pattern)) {
> >                 return true;
> >             }
> >         }
> >         return false;
> >     }
> >
> >     /**
> >      * Returns true if the given pattern matches the current location, false
> >      * otherwise
> >      *
> >      * @param pattern
> >      *            the given pattern
> >      * @return true or false
> >      */
> >     public boolean matches(final String pattern) {
> >         if (pattern == null || pattern.isEmpty()) {
> >             return false;
> >         } else if (pattern.startsWith(NODE_SEPARATOR)) {
> >             return matchStartWith(pattern);
> >         } else if (pattern.contains(DOUBLE_NODE_SEPARATOR)) {
> >             return matchContains(pattern);
> >         } else {
> >             String lastSegments = StringUtils.substringAfterLast(location,
> >                     pattern + NODE_SEPARATOR);
> >             return (!lastSegments.isEmpty()) && location.endsWith(lastSegments)
> >                     && !lastSegments.contains(NODE_SEPARATOR);
> >         }
> >     }
> >
> >     private boolean matchContains(String pattern) {
> >         String firstSegments = StringUtils.substringBefore(pattern,
> >                 DOUBLE_NODE_SEPARATOR) + NODE_SEPARATOR;
> >         String lastSegments = NODE_SEPARATOR
> >                 + StringUtils.substringAfter(pattern, DOUBLE_NODE_SEPARATOR);
> >
> >         return location.contains(firstSegments)
> >                 && location.endsWith(lastSegments)
> >                 && location.indexOf(lastSegments, firstSegments.length()) >= (location
> >                         .indexOf(firstSegments) + firstSegments.length() - NODE_SEPARATOR
> >                         .length());
> >     }
> >
> >     private boolean matchStartWith(String pattern) {
> >         if (pattern.startsWith(DOUBLE_NODE_SEPARATOR)) {
> >             return location.endsWith(StringUtils.substringAfter(pattern,
> >                     DOUBLE_NODE_SEPARATOR));
> >         } else {
> >             return pattern.equals(location);
> >         }
> >     }
> > }
> > --------------------------------
> >
> > In the code, here is how it is used:
> >
> > public class MyRouteBuilder
> >         extends RouteBuilder {
> >
> >     @Override
> >     public void configure() {
> >         from(file:..).*split(stax("//foo/bar")).streaming()*.to(...);
> >     }
> >
> >     private Expression stax(String path) {
> >         return new StaxExpressionBuilder(path);
> >     }
> > }
> >
> > Here's how it works:
> > - when splitting the incoming message body, the stax() method returns a
> >   new type of Iterator.
> > - when streaming, the iterator's next() method is called. Using StAX
> >   inside, it moves through the inputstream and keeps track of the
> >   locations of the elements it traverses.
> > - when an element's location matches the given XPathLocation, the
> >   iterator 'records' the inputstream content and returns it at the end of
> >   the element.
> >
> > Note that the stax() method is part of my RouteBuilder, but it could be
> > moved to the RouteBuilder super class for generic usage.
> >
> >
> > What do you think about it?
> > Is this something you're interested in?
> >
> > Best regards,
> > Xavier
> >
> > On Fri, May 13, 2011 at 8:21 AM, Romain Manni-Bucau <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> thank you Richard and Claus for your feedback.
> >>
> >> I modified the classloading stuff, the NPE catch and added the XMLUtil
> >> class to get the tag name.
> >>
> >> I added support for input stream as input (adding some converters) but
> >> the problem is that Camel already has a lot of converters and you can
> >> load back the whole file very fast if you don't take care.
> >>
> >> - Romain
> >>
> >> 2011/5/13 Claus Ibsen <[email protected]>
> >>
> >> > Hi
> >> >
> >> > Yeah it does look very cool. Good work.
> >> >
> >> > Would be great if the StaxComponent could also cater for non-file-based
> >> > inputs. You may have the message body as a Source already. But
> >> > that can always be improved.
> >> >
> >> > And yes, as Richard mentions, the class loading should use the
> >> > ClassResolver. You can get it from the CamelContext. exchange -> camel
> >> > context -> class resolver.
> >> >
> >> > And the stuff that finds the annotations. We may have some common code
> >> > for that. Or later refactor that into a util class.
> >> >
> >> > Anyway keep it up.
> >> >
> >> >
> >> > On Fri, May 13, 2011 at 1:29 AM, Richard Kettelerij
> >> > <[email protected]> wrote:
> >> > > Hi Romain,
> >> > >
> >> > > Nice work. I've taken a look at your component. A few minor
> >> > > suggestions for improvement, in case you want to contribute it to
> >> > > Apache:
> >> > >
> >> > > - The component currently uses getContextClassLoader().loadClass() for
> >> > >   classloading. Camel actually has an abstraction to make this portable
> >> > >   across various runtime environments. You can just replace it with
> >> > >   org.apache.camel.spi.ClassResolver().resolveClass().
> >> > >
> >> > > - Avoid catching the NullPointerException in the
> >> > >   StAXJAXBIteratorExpression.
> >> > >
> >> > > - Do you plan to add a DSL method for the StAXJAXBIteratorExpression
> >> > >   (requires patching camel-core)? So you can write for example
> >> > >   "split(stax(Record.class))" in your route.
> >> > >
> >> > > Regards,
> >> > > Richard
> >> > >
> >> > > On Thu, May 12, 2011 at 5:55 PM, Romain Manni-Bucau
> >> > > <[email protected]> wrote:
> >> > >
> >> > >> Hi all,
> >> > >>
> >> > >> I worked a bit around StAX (thanks to Claus for his advice).
> >> > >>
> >> > >> You can find what I've done here:
> >> > >> http://code.google.com/p/rmannibucau/source/browse/camel/camel-stax/
> >> > >>
> >> > >> The test shows what can be done with it:
> >> > >>
> >> > >> http://code.google.com/p/rmannibucau/source/browse/camel/camel-stax/src/test/java/org/apache/camel/stax/test/StAXRouteTest.java
> >> > >>
> >> > >> - validation using sax (just need a converter)
> >> > >> - parsing using a sax contenthandler and a stax stream reader (a
> >> > >>   simple component)
> >> > >> - parsing of sub tree to get jaxb objects using a stax event reader
> >> > >>   for the whole tree and jaxb for the sub objects
> >> > >>
> >> > >>
> >> > >> - Romain
> >> > >>
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Claus Ibsen
> >> > -----------------
> >> > FuseSource
> >> > Email: [email protected]
> >> > Web: http://fusesource.com
> >> > CamelOne 2011: http://fusesource.com/camelone2011/
> >> > Twitter: davsclaus
> >> > Blog: http://davsclaus.blogspot.com/
> >> > Author of Camel in Action: http://www.manning.com/ibsen/
> >> >
> >>
> >
> >
> >
> > --
> > Xavier
> >
>
>
> --
> Xavier
>
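
For anyone who wants to try the idea without Camel on the classpath, the three "how it works" steps in Xavier's message can be sketched with the JDK's StAX API alone. This is only an illustration under simplifying assumptions, not the component proposed above: the class name StaxSplitSketch is made up, the input is a String rather than a GenericFile or inputstream, attributes and character escaping are ignored, and the only supported rule is a plain path suffix such as "/foo/bar" (roughly what "//foo/bar" means in XPathLocation).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

/** Hypothetical JDK-only sketch of a streaming XML splitter (not the Camel component). */
class StaxSplitSketch implements Iterator<String> {

    private final XMLEventReader reader;
    private final String pathSuffix;                        // assumed form: "/foo/bar"
    private final Deque<String> path = new ArrayDeque<>();  // names of the open elements
    private String nextItem;

    StaxSplitSketch(String xml, String pathSuffix) {
        try {
            this.reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Cannot open XML input", e);
        }
        this.pathSuffix = pathSuffix;
        this.nextItem = readNextItem();   // look ahead one item, as StaxIterator does
    }

    @Override
    public boolean hasNext() {
        return nextItem != null;
    }

    @Override
    public String next() {
        if (nextItem == null) {
            throw new NoSuchElementException();
        }
        String current = nextItem;
        nextItem = readNextItem();
        return current;
    }

    /** Reads only as many events as needed to complete the next matching element. */
    private String readNextItem() {
        StringBuilder item = null;   // non-null while an element is being recorded
        int startDepth = -1;         // depth at which recording started
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    String name = event.asStartElement().getName().getLocalPart();
                    path.addLast(name);
                    String location = "/" + String.join("/", path);
                    if (item == null && location.endsWith(pathSuffix)) {
                        item = new StringBuilder();
                        startDepth = path.size();
                    }
                    if (item != null) {
                        item.append('<').append(name).append('>');
                    }
                } else if (event.isCharacters() && item != null) {
                    item.append(event.asCharacters().getData());
                } else if (event.isEndElement()) {
                    boolean closesItem = item != null && path.size() == startDepth;
                    if (item != null) {
                        item.append("</").append(path.peekLast()).append('>');
                    }
                    path.removeLast();
                    if (closesItem) {
                        return item.toString();
                    }
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Failed to read XML input", e);
        }
        return null;   // end of document: no more items
    }

    public static void main(String[] args) {
        String xml = "<root><foo><bar>a</bar><bar>b</bar></foo>"
                + "<baz><bar>c</bar></baz></root>";
        Iterator<String> items = new StaxSplitSketch(xml, "/foo/bar");
        while (items.hasNext()) {
            System.out.println(items.next());
        }
    }
}
```

With the sample document in main(), the two bar elements under foo are returned one at a time and the one under baz is skipped. Remembering the depth at which recording started is a small change from the code in the thread: it keeps a nested element that also matches the rule from ending the item early.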
