I don't know the details of your zip-archive use case, but it looks like a
typical data integration scenario.
Perhaps you should take a look at Apache Camel:
http://camel.apache.org/zip-dataformat.html
Greetings,
-Greg
2012/12/7 Robby Pelssers robby.pelss...@nxp.com
Hi guys,
Not sure if we already have a zip-archive generator (the serializer is
documented at
http://cocoon.apache.org/2.1/userdocs/ziparchive-serializer.html),
but it would be very cool to have one. Let me explain the use case:
<!--
  {1}: a URI pointing to a zip containing XML documents
-->
<map:match pattern="processzip/**">
  <map:generate src="{1}" type="zip"/>
  <map:transform src="processfiles.xslt"/>
  Now a lot of options
  - write results to disk
  - just serialize result
  - zip transformed files again
  ...
</map:match>
So what should this ziparchive generator do? It should let us peek into
the zip archive and return URIs for all entries:
<zip:archive xmlns:zip="http://apache.org/cocoon/zip-archive/1.0">
  <zip:entry name="jar:file:/C:/data/productinformation.zip!/products/PH3330L.xml"/>
  ...
  <zip:entry name="jar:file:/C:/data/productinformation.zip!/packages/SOT669.xml"/>
  ...
</zip:archive>
Or
<zip:archive xmlns:zip="http://apache.org/cocoon/zip-archive/1.0">
  <zip:entry name="jar:http://www.mydomain.com/data/productinformation.zip!/products/PH3330L.xml"/>
  ...
  <zip:entry name="jar:http://www.mydomain.com/data/productinformation.zip!/packages/SOT669.xml"/>
  ...
</zip:archive>
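To make the idea concrete, here is a rough Java sketch of the entry-listing part only, assuming a zip file on the local disk. The class name ZipEntryLister and the method listEntryUris are hypothetical names picked for illustration; this is not existing Cocoon code, and a real generator would emit the zip:entry elements as SAX events instead of returning a list. Entry names are appended as-is, so names containing spaces or other special characters would still need escaping.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Cocoon: lists the entries of a local zip
// file as jar: URIs of the form jar:file:/path/to/archive.zip!/entry/name.
public class ZipEntryLister {

    public static List<String> listEntryUris(File zip) throws IOException {
        List<String> uris = new ArrayList<>();
        // File.toURI() yields e.g. file:/C:/data/productinformation.zip
        String base = "jar:" + zip.toURI() + "!/";
        try (ZipFile zipFile = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // Skip directory entries; only documents are of interest.
                if (!entry.isDirectory()) {
                    uris.add(base + entry.getName());
                }
            }
        }
        return uris;
    }
}
```

Each returned string would become the name attribute of one zip:entry element in the output above.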
So if you add a transformer in that pipeline you can use the XSLT document()
function to fetch the documents and process them individually.
The only thing I'm not sure about is how to implement this efficiently. In
the case of an HTTP URI, I don't want to make multiple requests for the same
archive:
- 1 request by the ziparchive-generator to produce the XML above
- 1 request per invocation of the document() function
So maybe caching can resolve this, or are there better options?
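One way the caching idea could look, as a minimal sketch rather than a worked-out design: download the remote archive once to a temporary file and hand out jar:file: URIs that point at the local copy, so later document() calls never touch the network. The class ZipDownloadCache and its methods are hypothetical names for illustration, and the sketch has no expiry or invalidation, which a real implementation would need.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Cocoon: keeps one local copy per remote
// archive URL so the archive is fetched only once.
public class ZipDownloadCache {

    private final Map<String, Path> localCopies = new HashMap<>();

    // Returns the local copy of the archive, downloading it on first use only.
    public synchronized Path localCopy(String archiveUrl) throws IOException {
        Path cached = localCopies.get(archiveUrl);
        if (cached != null) {
            return cached;
        }
        Path target = Files.createTempFile("ziparchive-", ".zip");
        try (InputStream in = URI.create(archiveUrl).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        localCopies.put(archiveUrl, target);
        return target;
    }

    // Builds a jar: URI for one entry that resolves against the local copy.
    public String entryUri(String archiveUrl, String entryName) throws IOException {
        return "jar:" + localCopy(archiveUrl).toUri() + "!/" + entryName;
    }
}
```

With something like this, the generator would make the single HTTP request, and the entry URIs it emits would already be local.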
Robby