zip-archive generator (reverse of zip-archive serializer)

2012-12-07 Thread Robby Pelssers
Hi guys,

I'm not sure whether we already have a zip-archive generator, i.e. the reverse of the zip-archive serializer:
http://cocoon.apache.org/2.1/userdocs/ziparchive-serializer.html

If not, it would be very cool to have one.  Let me explain the use case:

<!--
   {1}: a URI pointing to a zip containing XML documents
-->
<map:match pattern="processzip/**">
  <map:generate src="{1}" type="zip"/>
  <map:transform src="processfiles.xslt"/>
  <!-- Now a lot of options:
       - write results to disk
       - just serialize result
       - zip transformed files again
       ...
  -->
</map:match>

So what should this zip-archive generator do?  It should let us peek into the
zip archive and return URIs for all entries:

<zip:archive xmlns:zip="http://apache.org/cocoon/zip-archive/1.0">
  <zip:entry name="jar:file:/C:/data/productinformation.zip!/products/PH3330L.xml"/>
  ...
  <zip:entry name="jar:file:/C:/data/productinformation.zip!/packages/SOT669.xml"/>
  ...
</zip:archive>

Or

<zip:archive xmlns:zip="http://apache.org/cocoon/zip-archive/1.0">
  <zip:entry name="jar:http://www.mydomain.com/data/productinformation.zip!/products/PH3330L.xml"/>
  ...
  <zip:entry name="jar:http://www.mydomain.com/data/productinformation.zip!/packages/SOT669.xml"/>
  ...
</zip:archive>
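
To make the intended output concrete, here is a rough sketch of the listing step. It is written in Python only for brevity; a real Cocoon generator would be a Java component emitting SAX events. The function name list_zip_entries and its parameters are hypothetical, and the element names simply follow the example above.

```python
import zipfile
from xml.sax.saxutils import quoteattr

# Namespace taken from the example listing above.
ZIP_NS = "http://apache.org/cocoon/zip-archive/1.0"


def list_zip_entries(zip_path, base_uri):
    """Return a zip:archive document with one zip:entry per file in the archive.

    zip_path is where the archive can be read locally; base_uri is the URI
    that should appear inside the generated jar: URIs.
    """
    lines = [f'<zip:archive xmlns:zip="{ZIP_NS}">']
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # only files are interesting for document()
            uri = f"jar:{base_uri}!/{info.filename}"
            # quoteattr escapes the value and wraps it in quotes
            lines.append(f"  <zip:entry name={quoteattr(uri)}/>")
    lines.append("</zip:archive>")
    return "\n".join(lines)
```

Keeping the local path separate from the base URI is just one possible design: it lets the archive be read from wherever it is stored while the entries still advertise the original location.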


So if you add a transformer in that pipeline you can use the XSLT document 
function to fetch the documents and process them individually.

The only thing I'm not sure about is how to implement this efficiently. In the
case of an HTTP URI, I don't want to make this many requests:
- 1 request by the ziparchive-generator to produce the XML above
- 1 request per invocation of the document function

So maybe caching can resolve this, or are there better options?
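
One possible way to avoid the repeated requests, sketched here in Python rather than as Cocoon code: download the archive once to a temporary file and serve every entry from that local copy. The function name fetch_once_and_read is hypothetical, and the sketch assumes the whole archive fits in temporary storage and that returning all entries in memory is acceptable.

```python
import shutil
import tempfile
import urllib.request
import zipfile


def fetch_once_and_read(zip_url):
    """Download the archive once, then read every entry from the local copy.

    Returns a dict mapping entry names to their raw bytes.
    """
    with tempfile.TemporaryFile() as local_copy:
        # The only request made against zip_url.
        with urllib.request.urlopen(zip_url) as response:
            shutil.copyfileobj(response, local_copy)
        local_copy.seek(0)
        # All further reads hit the temporary file, not the remote server.
        with zipfile.ZipFile(local_copy) as archive:
            return {
                info.filename: archive.read(info)
                for info in archive.infolist()
                if not info.is_dir()
            }
```

In a pipeline, the generator would then hand out URIs that resolve against the local copy, so each document function call becomes a local read instead of a new HTTP request.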

Robby


Re: zip-archive generator (reverse of zip-archive serializer)

2012-12-07 Thread gelo1234
I don't know the details of your zip-archive use case, but it looks like a
typical data integration scenario.
Perhaps you should take a look at Apache Camel:
http://camel.apache.org/zip-dataformat.html

Greetings,
-Greg
